
Remove Rows from a Data Frame That Exist in Another Data Frame in R
To remove rows from a data frame that exist in another data frame, we can use subsetting with single square brackets together with a negated %in% test. The result keeps only the rows whose value in the chosen column does not appear in the corresponding column of the other data frame.
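As a quick illustration of the idea, here is a minimal sketch with small fixed values. The data frames orders and shipped, and their columns, are hypothetical and are not taken from the examples in this article.

```r
# Hypothetical data frames with fixed values, for illustration only
orders  <- data.frame(id = c(1, 2, 3, 4, 5), qty = c(10, 20, 30, 40, 50))
shipped <- data.frame(id = c(2, 4))

# Keep only the rows of orders whose id does NOT appear in shipped$id
pending <- orders[!orders$id %in% shipped$id, ]
print(pending)
```

Here the rows with id 2 and 4 are dropped because those values occur in shipped$id. The trailing comma inside the brackets selects rows and keeps all columns.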
Check out the examples below to understand how it can be done.
Example 1
The following snippet creates a sample data frame −
    x <- rpois(20, 2)
    y <- rpois(20, 2)
    df1 <- data.frame(x, y)
    df1
The following data frame is created −
       x y
    1  2 0
    2  1 2
    3  1 3
    4  2 1
    5  0 0
    6  3 3
    7  1 3
    8  0 2
    9  3 2
    10 2 0
    11 2 1
    12 1 6
    13 1 2
    14 2 1
    15 4 5
    16 2 2
    17 1 4
    18 0 1
    19 0 1
    20 2 2
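Because rpois() draws random numbers, rerunning the snippet will generally give values different from those printed here. One way to make a run repeatable is to fix the random seed first. The sketch below assumes an arbitrary seed of 123 and a hypothetical helper make_df1; neither appears in the original snippet, and the seed will not reproduce the exact table shown above.

```r
# Hypothetical helper: build the sample data frame after fixing the seed
make_df1 <- function(seed) {
  set.seed(seed)          # arbitrary seed, an assumption for this sketch
  x <- rpois(20, 2)
  y <- rpois(20, 2)
  data.frame(x, y)
}

first_run  <- make_df1(123)
second_run <- make_df1(123)

# The same seed yields the same data frame on every call
print(identical(first_run, second_run))
```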
Add the following code to the above snippet −
    a <- rpois(20, 5)
    b <- rpois(20, 5)
    df2 <- data.frame(a, b)
    df2
The following data frame is created −
       a b
    1  4 0
    2  3 6
    3  4 6
    4  1 3
    5  5 3
    6  5 7
    7  5 2
    8  4 6
    9  4 6
    10 4 3
    11 3 6
    12 4 4
    13 4 2
    14 5 2
    15 4 3
    16 3 7
    17 4 6
    18 5 3
    19 3 3
    20 9 3
To remove the rows of df1 whose value in column x also appears in column a of df2, add the following code to the above snippet −
    df1[!df1$x %in% df2$a, ]
Output
If you execute all the above snippets as a single program, it generates the following output. Only the rows with x equal to 0 or 2 remain, because those values never occur in column a of df2 −
       x y
    1  2 0
    4  2 1
    5  0 0
    8  0 2
    10 2 0
    11 2 1
    14 2 1
    16 2 2
    18 0 1
    19 0 1
    20 2 2
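To see why these particular rows are kept, the subsetting expression can be broken into steps. In R, %in% binds more tightly than !, so !df1$x %in% df2$a is read as !(df1$x %in% df2$a). The sketch below hard-codes the values from the df1 printout and column a of the df2 printout in Example 1, so it does not depend on random draws. The intermediate names in_a and keep are introduced here for illustration.

```r
# Values copied from the df1 printout and column a of the df2 printout
x <- c(2, 1, 1, 2, 0, 3, 1, 0, 3, 2, 2, 1, 1, 2, 4, 2, 1, 0, 0, 2)
y <- c(0, 2, 3, 1, 0, 3, 3, 2, 2, 0, 1, 6, 2, 1, 5, 2, 4, 1, 1, 2)
a <- c(4, 3, 4, 1, 5, 5, 5, 4, 4, 4, 3, 4, 4, 5, 4, 3, 4, 5, 3, 9)
df1 <- data.frame(x, y)

in_a <- df1$x %in% a   # TRUE where the x value occurs somewhere in a
keep <- !in_a          # TRUE for the rows to retain

print(which(keep))     # row numbers that survive the filter
print(df1[keep, ])     # same rows as df1[!df1$x %in% a, ]
```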
Example 2
The following snippet creates a sample data frame −
    Grp <- sample(LETTERS[1:5], 20, replace = TRUE)
    Rate <- rpois(20, 5)
    df_grp <- data.frame(Grp, Rate)
    df_grp
The following data frame is created −
       Grp Rate
    1    D    6
    2    D    3
    3    E    7
    4    D    6
    5    B    6
    6    D    3
    7    D    3
    8    A    3
    9    C    2
    10   A    4
    11   A    7
    12   C    7
    13   C    5
    14   E    7
    15   B    7
    16   C    6
    17   B    6
    18   A    4
    19   C    6
    20   B    1
Add the following code to the above snippet −
    Category <- sample(LETTERS[3:7], 20, replace = TRUE)
    Sales <- rpois(20, 10)
    df_Sales <- data.frame(Category, Sales)
    df_Sales
The following data frame is created −
       Category Sales
    1         E    12
    2         C    11
    3         D     9
    4         E    13
    5         G     5
    6         C     9
    7         D    14
    8         D    11
    9         D     8
    10        F    11
    11        F    17
    12        G    15
    13        F    12
    14        D     9
    15        G    13
    16        C     9
    17        C    12
    18        F     7
    19        E     7
    20        C     8
To remove the rows of df_grp whose value in column Grp also appears in column Category of df_Sales, add the following code to the above snippet −
    df_grp[!df_grp$Grp %in% df_Sales$Category, ]
Output
If you execute all the above snippets as a single program, it generates the following output. Only groups A and B remain, because Category contains only the letters C to G −
       Grp Rate
    5    B    6
    8    A    3
    10   A    4
    11   A    7
    15   B    7
    17   B    6
    18   A    4
    20   B    1
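Since both examples follow the same pattern, it can be wrapped in a small function. The following is a minimal sketch: the helper drop_matching_rows is hypothetical (it is not part of base R), and the two short data frames use fixed values chosen for illustration rather than the random ones above.

```r
# Hypothetical helper: drop the rows of df whose value in column col
# appears anywhere in the vector values
drop_matching_rows <- function(df, col, values) {
  df[!df[[col]] %in% values, , drop = FALSE]
}

# Small fixed data frames, for illustration only
groups <- data.frame(Grp = c("D", "B", "A", "C", "E"),
                     Rate = c(6, 6, 3, 2, 7))
sales  <- data.frame(Category = c("E", "C", "D", "G", "F"),
                     Sales = c(12, 11, 9, 5, 11))

kept <- drop_matching_rows(groups, "Grp", sales$Category)
print(kept)
```

Passing drop = FALSE keeps the result a data frame even when df has only one column. Without it, single-bracket subsetting would return a plain vector in that case.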