Remove Duplicate Rows in R Data Frame Based on Two Columns


A row is a duplicate with respect to two columns when the same pair of values in those columns has already appeared in an earlier row. Repeated values in only one of the columns do not make a row a duplicate; both values must match. To remove such rows from an R data frame, we can pass the two columns to the duplicated function and keep only the rows it does not flag, as shown in the examples below.

Consider the data frame below:

Example

# Two columns of 20 letters each, sampled at random from A to D
x1 <- sample(LETTERS[1:4], 20, replace = TRUE)
x2 <- sample(LETTERS[1:4], 20, replace = TRUE)
df1 <- data.frame(x1, x2)
df1

Output

   x1  x2
1  B   B
2  C   D
3  A   A
4  C   D
5  B   C
6  D   D
7  D   A
8  A   B
9  B   A
10 D   B
11 A   B
12 B   B
13 D   A
14 A   C
15 C   A
16 A   B
17 A   B
18 A   C
19 D   A
20 B   B

Removing the rows of df1 that repeat an earlier combination of x1 and x2:

Example

# duplicated() flags rows whose (x1, x2) pair was seen before; ! keeps the rest
df1[!duplicated(df1[c("x1", "x2")]), ]

Output

   x1 x2
1  B  B
2  C  D
3  A  A
5  B  C
6  D  D
7  D  A
8  A  B
9  B  A
10 D  B
14 A  C
15 C  A
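
To make the mechanics easier to follow, here is a minimal sketch on a small hand-written data frame. The names toy, dup_flags, keep_first and keep_last are hypothetical and used only for illustration. It prints the logical vector that duplicated returns, applies the same filter as above, and then uses the fromLast argument of duplicated to keep the last occurrence of each pair instead of the first.

```r
# Hypothetical toy data, chosen so the duplicates are easy to spot by eye
toy <- data.frame(
  x1 = c("A", "B", "A", "C", "B"),
  x2 = c("X", "Y", "X", "Z", "Y")
)

# TRUE marks a row whose (x1, x2) pair already appeared in an earlier row
dup_flags <- duplicated(toy[c("x1", "x2")])
print(dup_flags)

# Keep the first occurrence of each pair (same pattern as the example above)
keep_first <- toy[!dup_flags, ]
print(keep_first)

# Keep the last occurrence of each pair instead
keep_last <- toy[!duplicated(toy[c("x1", "x2")], fromLast = TRUE), ]
print(keep_last)
```

The original row names are preserved in both results, which is why the row numbers in the output above have gaps.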

Example

# Three columns of 20 Poisson-distributed counts each (mean 1)
y1 <- rpois(20, 1)
y2 <- rpois(20, 1)
y3 <- rpois(20, 1)
df2 <- data.frame(y1, y2, y3)
df2

Output

Removing the rows of df2 that repeat an earlier combination of y1 and y2 (y3 is ignored when comparing rows, but it is kept in the result):

Example

df2[!duplicated(df2[c("y1", "y2")]), ]

Output
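
The result of this call changes from run to run, because rpois draws random values. As a reproducible illustration, the following is a minimal sketch that uses a fixed data frame in place of df2. The names df3, result, n_removed and pairs_only are hypothetical. It shows that rows are compared on y1 and y2 only, while y3 is carried along unchanged, and it contrasts this with unique applied to the two-column subset, which returns only those two columns.

```r
# Hypothetical fixed data standing in for the randomly generated df2
df3 <- data.frame(
  y1 = c(0, 1, 0, 2, 1),
  y2 = c(1, 1, 1, 0, 1),
  y3 = c(5, 6, 7, 8, 9)
)

# Rows are compared on y1 and y2 only; y3 plays no part in the comparison
result <- df3[!duplicated(df3[c("y1", "y2")]), ]
print(result)

# Number of rows dropped as duplicates
n_removed <- nrow(df3) - nrow(result)
print(n_removed)

# unique() on the subset also removes repeated pairs, but y3 is not returned
pairs_only <- unique(df3[c("y1", "y2")])
print(pairs_only)
```

Indexing the full data frame with !duplicated(...) is the better fit when the remaining columns are still needed; unique on the subset is enough when only the distinct pairs matter.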

Nizamuddin Siddiqui

Updated on: 2021-02-08T06:21:56+05:30
