Find the mean of multiple columns based on multiple grouping columns in R data frame.

R Programming Server Side Programming Programming

To find the mean of multiple columns based on multiple grouping columns in R data frame, we can use summarise_at function with mean function.

For Example, if we have a data frame called df that contains two grouping columns say G1 and G2 and two numerical columns say Num1 and Num2 then we can find the mean of Num1 and Num2 based on G1 and G2 by using the below mentioned command −

df%%group_by(G1,G2)%%summarise_at(vars("Num1","Num2"),mean)

Example 1

Following snippet creates a sample data frame −

Gender<-sample(c("Male","Female"),20,replace=TRUE)
Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Score1<-sample(1:10,20,replace=TRUE)
Score2<-sample(1:10,20,replace=TRUE)
Score3<-sample(1:10,20,replace=TRUE)
df1<-data.frame(Gender,Class,Score1,Score2,Score3)
df1

The following dataframe is created

   Gender Class  Score1 Score2 Score3
 1 Female Second    10     9    10
 2 Male   First      4     8     3
 3 Male   First     10     6    10
 4 Male   First      3     6     3
 5 Male   Second     1     2     8
 6 Female Second     9     7     7
 7 Female First      5     3     3
 8 Male   Third      4     4     5
 9 Female Third      8    10     2
10 Male   First      3     4    10
11 Female Third      9     5    10
12 Male   Second     1     8     4
13 Female First      5     3     1
14 Male   Second     2     9    10
15 Female Third      8     8    10
16 Female Second    10     1     3
17 Female Second     8     5     4
18 Female First      2     1     2
19 Male   Third      3     1     8
20 Female Second     6     5     7

To load dplyr package and find the mean of Score columns based on Gender and Class on the above created data frame, add the following code to the above snippet −

Gender<-sample(c("Male","Female"),20,replace=TRUE)
Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Score1<-sample(1:10,20,replace=TRUE)
Score2<-sample(1:10,20,replace=TRUE)
Score3<-sample(1:10,20,replace=TRUE)
df1<-data.frame(Gender,Class,Score1,Score2,Score3)
library(dplyr)
df1%%group_by(Gender,Class)%%summarise_at(vars("Score1","Score2","Score3"),mean)
# A tibble: 6 x 5
# Groups: Gender [2]

Output

If you execute all the above given snippets as a single program, it generates the following Output −

  Gender Class Score1 Score2 Score3
   <chr  <chr  <dbl   <dbl   <dbl
1 Female First  4     2.33    2
2 Female Second 8.6   5.4     6.2
3 Female Third  8.33  7.67    7.33
4 Male   First  5 6   6.5
5 Male   Second 1.33  6.33    7.33
6 Male   Third  3.5   2.5     6.5

Example 2

Following snippet creates a sample data frame −

Group1<-sample(LETTERS[1:4],20,replace=TRUE)
Group2<-sample(letters[1:4],20,replace=TRUE)
x1<-sample(1:100,20)
x2<-sample(1:100,20)
x3<-sample(1:100,20)
df2<-data.frame(Group1,Group2,x1,x2,x3)
df2

The following dataframe is created

  Group1 Group2 x1 x2 x3
 1 B     c      90 19 95
 2 D     b      98 90 9
 3 D     b      14 67 96
 4 B     d      91 52 98
 5 A     b      27 83 30
 6 A     a      29 95 27
 7 D     d      28 69 80
 8 C     b      58 72 42
 9 B     c      41 99 1
10 A     a      62 20 49
11 B     c      47 87 67
12 C     c      71 58 43
13 A     d      23 6 89
14 B     a      39 13 15
15 D     c      22 7 23
16 D     c      72 1 61
17 D     c      21 55 6
18 B     d      48 63 41
19 B     a      69 12 18
20 A     b      88 86 20

To find the mean of x columns based on Group1 and Group2 on the above created data frame, add the following code to the above snippet −

Group1<-sample(LETTERS[1:4],20,replace=TRUE)
Group2<-sample(letters[1:4],20,replace=TRUE)
x1<-sample(1:100,20)
x2<-sample(1:100,20)
x3<-sample(1:100,20)
df2<-data.frame(Group1,Group2,x1,x2,x3)
df2%%group_by(Group1,Group2)%%summarise_at(vars("x1","x2","x3"),mean)
# A tibble: 11 x 5
# Groups: Group1 [4]

Output

If you execute all the above given snippets as a single program, it generates the following Output −

 Group1 Group2 x1  x2    x3
 <chr  <chr  <dbl <dbl  <dbl
 1 A    a     45.5 57.5  38
 2 A    b     57.5 84.5  25
 3 A    d     23   6     89
 4 B    a     54   12.5  16.5
 5 B    c     59.3 68.3  54.3
 6 B    d     69.5 57.5  69.5
 7 C    b     58   72    42
 8 C    c     71   58    43
 9 D    b     56   78.5  52.5
10 D    c     38.3 21    30
11 D    d     28   69    80

Nizamuddin Siddiqui

Updated on: 2021-11-09T07:38:44+05:30

5K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started