
Standardize Data Table Object Column by Group in R
To standardize a column of a data.table object by group, we can apply the scale function to the column and pass the grouping column through the by argument. Since scale returns a one-column matrix, the result is wrapped in as.vector so that a plain vector is assigned back to the column.
For example, if we have a data.table object called DT that contains two columns, G and Num, where G is a grouping column and Num is a numerical column, then we can standardize Num within each level of G by using the command given below −
DT[,"Num":=as.vector(scale(Num)),by=G]
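As a rough illustration of what this command relies on, the sketch below uses a small made-up vector x (not taken from the examples) to show that scale returns a one-column matrix, which is why as.vector is applied, and that the values match subtracting the mean and dividing by the sample standard deviation.

```r
# Hypothetical values, chosen only for illustration
x <- c(2, 4, 6, 8)

# scale() returns a one-column matrix carrying centering and scaling attributes
z <- scale(x)
is.matrix(z)

# as.vector() drops the matrix structure and leaves a plain numeric vector
as.vector(z)

# The same z-scores computed by hand: (value - mean) / sample standard deviation
manual <- (x - mean(x)) / sd(x)
all.equal(as.vector(z), manual)
```

With by=G, data.table runs this same calculation separately on the rows of each group.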
Example 1
Consider the data.table object below −
library(data.table)
Grp <- sample(c("Male", "Female"), 20, replace = TRUE)
Response <- round(rnorm(20, 5, 1.25), 2)
DT1 <- data.table(Grp, Response)
DT1
The following data.table is created
       Grp Response
 1: Female     5.31
 2:   Male     5.20
 3: Female     6.38
 4:   Male     4.53
 5: Female     4.90
 6: Female     4.78
 7:   Male     3.73
 8: Female     6.19
 9:   Male     4.33
10:   Male     7.84
11:   Male     6.70
12: Female     5.11
13:   Male     6.80
14:   Male     3.76
15:   Male     3.56
16:   Male     5.51
17: Female     6.58
18: Female     7.59
19:   Male     4.62
20: Female     6.75
To standardize the Response column by the Grp column in DT1, add the following code to the above snippet −
library(data.table)
Grp <- sample(c("Male", "Female"), 20, replace = TRUE)
Response <- round(rnorm(20, 5, 1.25), 2)
DT1 <- data.table(Grp, Response)
# Replace Response with its z-score computed within each Grp level
DT1[, "Response" := as.vector(scale(Response)), by = Grp]
DT1
Output
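If you execute all the above given snippets as a single program, it generates output like the following. Because sample and rnorm draw random values and no seed is set, the exact numbers differ from run to run −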
       Grp    Response
 1: Female -0.66313371
 2:   Male  0.03955265
 3: Female  0.43789692
 4:   Male -0.43061348
 5: Female -1.08502396
 6: Female -1.20850403
 7:   Male -0.99200587
 8: Female  0.24238681
 9:   Male -0.57096158
10:   Male  1.89214752
11:   Male  1.09216337
12: Female -0.86893383
13:   Male  1.16233742
14:   Male -0.97095365
15:   Male -1.11130175
16:   Male  0.25709220
17: Female  0.64369704
18: Female  1.68298763
19:   Male -0.36745684
20: Female  0.81862714
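One way to confirm that a by-group standardization worked is to check that every group ends up with a mean of about 0 and a standard deviation of 1. The sketch below is only an illustration: the seed, the fixed group layout built with rep, and the object names DT and check are assumptions made so the snippet is repeatable, and they are not part of the example above.

```r
library(data.table)

set.seed(1)  # assumed seed, used only to make this sketch repeatable

# Illustrative data: 10 rows per group so every group has a defined standard deviation
DT <- data.table(
  Grp = rep(c("Male", "Female"), each = 10),
  Response = round(rnorm(20, 5, 1.25), 2)
)

# Standardize Response within each Grp level
DT[, Response := as.vector(scale(Response)), by = Grp]

# Per-group mean and standard deviation of the standardized column
check <- DT[, .(Mean = mean(Response), SD = sd(Response)), by = Grp]
check
```

The group means may print as very small numbers rather than exactly 0 because of floating-point rounding.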
Example 2
The following snippet creates a sample data.table object −
library(data.table)
Class <- sample(c("I", "II", "III"), 20, replace = TRUE)
Rate <- round(rnorm(20, 10, 1.02), 0)
DT2 <- data.table(Class, Rate)
DT2
The following data.table is created
    Class Rate
 1:    II   10
 2:   III    9
 3:    II   10
 4:    II   10
 5:   III   10
 6:   III    9
 7:   III    8
 8:    II   10
 9:    II   11
10:   III    9
11:     I    9
12:    II   11
13:   III   13
14:    II   10
15:   III   12
16:     I    8
17:    II    9
18:     I   10
19:   III    9
20:    II   10
To standardize the Rate column by the Class column in DT2, add the following code to the above snippet −
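library(data.table)
Class <- sample(c("I", "II", "III"), 20, replace = TRUE)
Rate <- round(rnorm(20, 10, 1.02), 0)
DT2 <- data.table(Class, Rate)
# Replace Rate with its z-score computed within each Class level
DT2[, "Rate" := as.vector(scale(Rate)), by = Class]
DT2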
Output
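If you execute all the above given snippets as a single program, it generates output like the following. Because sample and rnorm draw random values and no seed is set, the exact numbers differ from run to run −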
    Class        Rate
 1:    II -0.18490007
 2:   III -0.50669175
 3:    II -0.18490007
 4:    II -0.18490007
 5:   III  0.07238454
 6:   III -0.50669175
 7:   III -1.08576803
 8:    II -0.18490007
 9:    II  1.47920052
10:   III -0.50669175
11:     I  0.00000000
12:    II  1.47920052
13:   III  1.80961338
14:    II -0.18490007
15:   III  1.23053710
16:     I -1.00000000
17:    II -1.84900065
18:     I  1.00000000
19:   III -0.50669175
20:    II -0.18490007
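Both examples overwrite the original column with its standardized values. If the raw values are still needed, the same call can assign to a new column instead. The sketch below assumes a small hand-written table in the same shape as DT2; the object name DT and the column name Rate_z are illustrative and do not come from the examples.

```r
library(data.table)

# Hypothetical data with the same columns as DT2
DT <- data.table(
  Class = c("I", "I", "I", "II", "II", "II"),
  Rate  = c(9, 8, 10, 10, 11, 9)
)

# Store the by-group z-scores in a new column and leave Rate untouched
DT[, Rate_z := as.vector(scale(Rate)), by = Class]
DT
```

Note that a group containing a single row has no spread to scale by, so its standardized value would not be a usable number.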