
Find Rows in R Data Frame Without Missing Values
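Dealing with missing values is one of the most critical tasks in data analysis. If we have a large amount of data, it is often simplest to drop the rows that contain missing values and keep only the complete ones. The complete.cases function does this: it returns a logical vector with TRUE for every row that has no missing values, and that vector can be used to subset the data frame.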
For example, if we have a data frame called df that contains some missing values, we can keep only the rows without missing values using the following command:
df[complete.cases(df),]
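As a minimal sketch of what complete.cases returns, the hand-built data frame below shows the logical vector and the subset it selects. The names df_demo, a, b, keep and df_complete are hypothetical and chosen only for illustration.

```r
# Hypothetical data frame, built by hand so the result is predictable
df_demo <- data.frame(
  a = c(1, NA, 3, 4),
  b = c("p", "q", NA, "s"),
  stringsAsFactors = FALSE
)

# TRUE for each row that has no NA in any column
keep <- complete.cases(df_demo)
keep
# [1]  TRUE FALSE FALSE  TRUE

# Subset to the complete rows; the original row names (1 and 4) are kept
df_complete <- df_demo[keep, ]
df_complete
```

Because the original row names are preserved, the row labels in the result show which rows of the input survived.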
Example 1
The following snippet creates a sample data frame:
# Each column is 20 draws from NA and two Poisson(5) values
x1 <- sample(c(NA, rpois(2, 5)), 20, replace = TRUE)
x2 <- sample(c(NA, rpois(2, 5)), 20, replace = TRUE)
x3 <- sample(c(NA, rpois(2, 5)), 20, replace = TRUE)
df1 <- data.frame(x1, x2, x3)
df1
The following data frame is created:
   x1 x2 x3
1  NA  7  3
2   4 NA  3
3   4  7 NA
4   2  4 NA
5   2 NA  4
6   2  7 NA
7  NA  4  4
8  NA NA  4
9   2 NA NA
10 NA NA  4
11  4  7  3
12  4 NA  4
13 NA  7  3
14 NA  7  4
15 NA  7 NA
16  2 NA  4
17  2  4  3
18  4  7  3
19  2 NA  3
20  4  4 NA
To keep only the rows of df1 without missing values, replace the last line of the snippet above with the complete.cases subsetting step:
x1 <- sample(c(NA, rpois(2, 5)), 20, replace = TRUE)
x2 <- sample(c(NA, rpois(2, 5)), 20, replace = TRUE)
x3 <- sample(c(NA, rpois(2, 5)), 20, replace = TRUE)
df1 <- data.frame(x1, x2, x3)
# Keep only the rows with no NA in any column
df1[complete.cases(df1), ]
Output
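Running the snippet above as a single program produces output like the following. No seed is set, so the sampled values differ from run to run; the output here corresponds to the data frame shown above:
   x1 x2 x3
11  4  7  3
17  2  4  3
18  4  7  3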
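Since the example draws random values, the rows that survive change on every run. One way to make it repeatable is to fix the random seed first. The sketch below assumes a hypothetical helper make_df1 and an arbitrary seed value of 123; neither comes from the original example.

```r
# Hypothetical helper: builds the Example 1 data frame from a fixed seed
make_df1 <- function(seed) {
  set.seed(seed)  # fixing the seed makes sample() and rpois() repeatable
  x1 <- sample(c(NA, rpois(2, 5)), 20, replace = TRUE)
  x2 <- sample(c(NA, rpois(2, 5)), 20, replace = TRUE)
  x3 <- sample(c(NA, rpois(2, 5)), 20, replace = TRUE)
  data.frame(x1, x2, x3)
}

df1_a <- make_df1(123)  # 123 is an arbitrary choice
df1_b <- make_df1(123)

# The same seed gives the same data frame, NA positions included
identical(df1_a, df1_b)

# Rows without missing values, and how many of them there are
complete_rows <- df1_a[complete.cases(df1_a), ]
n_complete <- sum(complete.cases(df1_a))
complete_rows
n_complete
```

Summing the logical vector counts the TRUE entries, which gives the number of complete rows without building the subset.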
Example 2
The following snippet creates a sample data frame:
# Each column is 20 draws from NA and two standard normal values
y1 <- sample(c(NA, rnorm(2)), 20, replace = TRUE)
y2 <- sample(c(NA, rnorm(2)), 20, replace = TRUE)
y3 <- sample(c(NA, rnorm(2)), 20, replace = TRUE)
df2 <- data.frame(y1, y2, y3)
df2
The following data frame is created:
           y1          y2          y3
1  -0.2619255 -0.80309246 -0.76031065
2  -0.2619255 -0.04079919 -0.76031065
3   1.7217166          NA -0.76031065
4  -0.2619255          NA          NA
5          NA -0.04079919 -0.76031065
6   1.7217166          NA  0.01337776
7          NA -0.80309246          NA
8          NA          NA -0.76031065
9   1.7217166 -0.04079919          NA
10         NA -0.04079919  0.01337776
11  1.7217166 -0.80309246  0.01337776
12 -0.2619255          NA -0.76031065
13         NA -0.04079919  0.01337776
14 -0.2619255          NA  0.01337776
15 -0.2619255 -0.04079919          NA
16         NA -0.04079919          NA
17 -0.2619255          NA -0.76031065
18  1.7217166 -0.80309246  0.01337776
19         NA -0.80309246 -0.76031065
20         NA -0.04079919          NA
To keep only the rows of df2 without missing values, replace the last line of the snippet above with the complete.cases subsetting step:
y1 <- sample(c(NA, rnorm(2)), 20, replace = TRUE)
y2 <- sample(c(NA, rnorm(2)), 20, replace = TRUE)
y3 <- sample(c(NA, rnorm(2)), 20, replace = TRUE)
df2 <- data.frame(y1, y2, y3)
# Keep only the rows with no NA in any column
df2[complete.cases(df2), ]
Output
Running the snippet above as a single program produces output like the following. No seed is set, so the sampled values differ from run to run; the output here corresponds to the data frame shown above:
           y1          y2          y3
1  -0.2619255 -0.80309246 -0.76031065
2  -0.2619255 -0.04079919 -0.76031065
11  1.7217166 -0.80309246  0.01337776
18  1.7217166 -0.80309246  0.01337776
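Two related variations may be useful in practice. The sketch below is not part of the original examples. It uses a hypothetical hand-built data frame df_demo2 to compare complete.cases with base R's na.omit, and to check for missing values in only some of the columns.

```r
# Hypothetical data frame with one NA in each of the first three rows
df_demo2 <- data.frame(
  y1 = c(0.5, NA, 1.2, -0.3),
  y2 = c(NA, 2.0, 0.7, 1.1),
  y3 = c(1.0, 1.5, NA, 0.2)
)

# Rows complete across all columns: only row 4 qualifies
all_cols <- df_demo2[complete.cases(df_demo2), ]

# na.omit() keeps the same rows and records the dropped ones
# in an "na.action" attribute on the result
via_na_omit <- na.omit(df_demo2)

# Check only y1 and y2: rows 3 and 4 qualify, even though row 3 has NA in y3
two_cols <- df_demo2[complete.cases(df_demo2[, c("y1", "y2")]), ]

rownames(all_cols)
rownames(via_na_omit)
rownames(two_cols)
```

Restricting the check to the columns an analysis actually uses can retain more rows than requiring every column to be complete.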