
Data Structure
Networking
RDBMS
Operating System
Java
MS Excel
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
Find Mean of Multiple Columns Based on Grouping Columns in R Data Frame
To find the mean of multiple columns based on multiple grouping columns in R data frame, we can use summarise_at function with mean function.
For Example, if we have a data frame called df that contains two grouping columns say G1 and G2 and two numerical columns say Num1 and Num2 then we can find the mean of Num1 and Num2 based on G1 and G2 by using the below mentioned command −
df%%group_by(G1,G2)%%summarise_at(vars("Num1","Num2"),mean)
Example 1
Following snippet creates a sample data frame −
Gender<-sample(c("Male","Female"),20,replace=TRUE) Class<-sample(c("First","Second","Third"),20,replace=TRUE) Score1<-sample(1:10,20,replace=TRUE) Score2<-sample(1:10,20,replace=TRUE) Score3<-sample(1:10,20,replace=TRUE) df1<-data.frame(Gender,Class,Score1,Score2,Score3) df1
The following dataframe is created
Gender Class Score1 Score2 Score3 1 Female Second 10 9 10 2 Male First 4 8 3 3 Male First 10 6 10 4 Male First 3 6 3 5 Male Second 1 2 8 6 Female Second 9 7 7 7 Female First 5 3 3 8 Male Third 4 4 5 9 Female Third 8 10 2 10 Male First 3 4 10 11 Female Third 9 5 10 12 Male Second 1 8 4 13 Female First 5 3 1 14 Male Second 2 9 10 15 Female Third 8 8 10 16 Female Second 10 1 3 17 Female Second 8 5 4 18 Female First 2 1 2 19 Male Third 3 1 8 20 Female Second 6 5 7
To load dplyr package and find the mean of Score columns based on Gender and Class on the above created data frame, add the following code to the above snippet −
Gender<-sample(c("Male","Female"),20,replace=TRUE) Class<-sample(c("First","Second","Third"),20,replace=TRUE) Score1<-sample(1:10,20,replace=TRUE) Score2<-sample(1:10,20,replace=TRUE) Score3<-sample(1:10,20,replace=TRUE) df1<-data.frame(Gender,Class,Score1,Score2,Score3) library(dplyr) df1%%group_by(Gender,Class)%%summarise_at(vars("Score1","Score2","Score3"),mean) # A tibble: 6 x 5 # Groups: Gender [2]
Output
If you execute all the above given snippets as a single program, it generates the following Output −
Gender Class Score1 Score2 Score3 <chr <chr <dbl <dbl <dbl 1 Female First 4 2.33 2 2 Female Second 8.6 5.4 6.2 3 Female Third 8.33 7.67 7.33 4 Male First 5 6 6.5 5 Male Second 1.33 6.33 7.33 6 Male Third 3.5 2.5 6.5
Example 2
Following snippet creates a sample data frame −
Group1<-sample(LETTERS[1:4],20,replace=TRUE) Group2<-sample(letters[1:4],20,replace=TRUE) x1<-sample(1:100,20) x2<-sample(1:100,20) x3<-sample(1:100,20) df2<-data.frame(Group1,Group2,x1,x2,x3) df2
The following dataframe is created
Group1 Group2 x1 x2 x3 1 B c 90 19 95 2 D b 98 90 9 3 D b 14 67 96 4 B d 91 52 98 5 A b 27 83 30 6 A a 29 95 27 7 D d 28 69 80 8 C b 58 72 42 9 B c 41 99 1 10 A a 62 20 49 11 B c 47 87 67 12 C c 71 58 43 13 A d 23 6 89 14 B a 39 13 15 15 D c 22 7 23 16 D c 72 1 61 17 D c 21 55 6 18 B d 48 63 41 19 B a 69 12 18 20 A b 88 86 20
To find the mean of x columns based on Group1 and Group2 on the above created data frame, add the following code to the above snippet −
Group1<-sample(LETTERS[1:4],20,replace=TRUE) Group2<-sample(letters[1:4],20,replace=TRUE) x1<-sample(1:100,20) x2<-sample(1:100,20) x3<-sample(1:100,20) df2<-data.frame(Group1,Group2,x1,x2,x3) df2%%group_by(Group1,Group2)%%summarise_at(vars("x1","x2","x3"),mean) # A tibble: 11 x 5 # Groups: Group1 [4]
Output
If you execute all the above given snippets as a single program, it generates the following Output −
Group1 Group2 x1 x2 x3 <chr <chr <dbl <dbl <dbl 1 A a 45.5 57.5 38 2 A b 57.5 84.5 25 3 A d 23 6 89 4 B a 54 12.5 16.5 5 B c 59.3 68.3 54.3 6 B d 69.5 57.5 69.5 7 C b 58 72 42 8 C c 71 58 43 9 D b 56 78.5 52.5 10 D c 38.3 21 30 11 D d 28 69 80