How to select row with maximum value in each group in R Language?
Last Updated: 01 Apr, 2021
In the R programming language, there are several ways to select the row with the maximum value in each group of a data frame. Two of them are discussed below.
Consider the following dataset, which has multiple observations for each value of the sub column. The dataset contains three columns: roll, sub, and marks.
Creating Dataset:
The data frame used for demonstration (built in Step 1 of Method 1 below) looks like this:
   roll sub marks
1     1   A     2
2     2   A     3
3     3   B     5
4     4   B     2
5     5   B     5
6     6   C     8
7     7   C    17
8     8   A     3
9     9   C     5
10   10   C     5
Here, roll holds integer roll numbers, marks is numeric, and sub is a categorical column (character in R 4.0 and later, a factor by default in earlier versions) with categories A, B, and C. The categories represent different subjects, and marks holds the marks obtained in the corresponding subject.
Subjects A, B, and C have maximum marks of 3, 5, and 17 respectively. We can select the row with the maximum value in each group using the following two approaches.
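As a quick check of those per-group maxima, base R's tapply() can compute the largest mark for each subject. This is a minimal sketch that rebuilds the same data frame; the name max_marks is only illustrative, and the result holds the maximum values, not the full rows.

```r
# Rebuild the demonstration data frame
group <- data.frame(roll  = 1:10,
                    sub   = c('A', 'A', 'B', 'B', 'B',
                              'C', 'C', 'A', 'C', 'C'),
                    marks = c(2, 3, 5, 2, 5, 8, 17, 3, 5, 5))

# Largest mark within each subject (a named result: A, B, C)
max_marks <- tapply(group$marks, group$sub, max)
max_marks
```

Note that subjects A and B each have two rows sharing the top mark, so the methods for selecting rows can differ in which tied row they return.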
Method 1: Using base R
Step 1: Load the dataset into a variable (group).
R
# Build the demonstration data frame
no <- 1:10
subject <- c('A', 'A', 'B', 'B', 'B',
             'C', 'C', 'A', 'C', 'C')
mark <- c(2, 3, 5, 2, 5, 8, 17, 3, 5, 5)
group <- data.frame(roll = no, sub = subject,
                    marks = mark)
group
Output:
   roll sub marks
1     1   A     2
2     2   A     3
3     3   B     5
4     4   B     2
5     5   B     5
6     6   C     8
7     7   C    17
8     8   A     3
9     9   C     5
10   10   C     5
Step 2: Sort the rows by subject and, within each subject (A, B, C), by marks in descending order.
R
# Build the demonstration data frame
no <- 1:10
subject <- c('A', 'A', 'B', 'B', 'B',
             'C', 'C', 'A', 'C', 'C')
mark <- c(2, 3, 5, 2, 5, 8, 17, 3, 5, 5)
group <- data.frame(roll = no, sub = subject,
                    marks = mark)
# Order by sub ascending, then by marks descending (negated)
sorted_group <- group[order(group$sub, -group$marks), ]
sorted_group
Output:
   roll sub marks
2     2   A     3
8     8   A     3
1     1   A     2
3     3   B     5
5     5   B     5
4     4   B     2
7     7   C    17
6     6   C     8
9     9   C     5
10   10   C     5
The rows are now ordered by sub, with the highest marks first within each subject, so the first row of each group (A, B, C) is the row with the maximum value.
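Negating a column inside order() only works for numeric data. As a hedged alternative, order() also accepts one decreasing flag per sort key when method = "radix" is used, which produces the same sort by subject and descending marks here. The sketch rebuilds the data frame, and the name idx is only illustrative.

```r
# Rebuild the demonstration data frame
group <- data.frame(roll  = 1:10,
                    sub   = c('A', 'A', 'B', 'B', 'B',
                              'C', 'C', 'A', 'C', 'C'),
                    marks = c(2, 3, 5, 2, 5, 8, 17, 3, 5, 5))

# Sort by sub ascending and marks descending without negating marks;
# a per-key decreasing vector requires method = "radix"
idx <- order(group$sub, group$marks,
             decreasing = c(FALSE, TRUE), method = "radix")
sorted_group <- group[idx, ]
sorted_group
```

This form is useful when the column to sort in descending order is not numeric, for example a character or date column.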
Step 3: Keep only the first row for each subject by removing rows whose sub value duplicates that of an earlier row.
R
# Build the demonstration data frame
no <- 1:10
subject <- c('A', 'A', 'B', 'B', 'B',
             'C', 'C', 'A', 'C', 'C')
mark <- c(2, 3, 5, 2, 5, 8, 17, 3, 5, 5)
group <- data.frame(roll = no, sub = subject,
                    marks = mark)
# Order by sub ascending, then by marks descending (negated)
sorted_group <- group[order(group$sub, -group$marks), ]
# Keep the first row seen for each sub, which is the group maximum
ans <- sorted_group[!duplicated(sorted_group$sub), ]
ans
Output:
  roll sub marks
2    2   A     3
3    3   B     5
7    7   C    17
These are the rows with the maximum value in each group.
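The duplicated() approach keeps exactly one row per group, so when several rows share the top mark (as in subjects A and B here), only the first is returned. If all tied rows are wanted, one possible base R sketch uses ave() to compare each mark with its group maximum. The names group_max and top_rows are only illustrative.

```r
# Rebuild the demonstration data frame
group <- data.frame(roll  = 1:10,
                    sub   = c('A', 'A', 'B', 'B', 'B',
                              'C', 'C', 'A', 'C', 'C'),
                    marks = c(2, 3, 5, 2, 5, 8, 17, 3, 5, 5))

# ave() returns, for every row, the maximum mark of that row's subject
group_max <- ave(group$marks, group$sub, FUN = max)

# Keep every row whose mark equals its group maximum (ties included)
top_rows <- group[group$marks == group_max, ]
top_rows
```

With this data the result has five rows (rolls 2, 3, 5, 7, and 8), because A and B each contain two rows with the maximum mark.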
Method 2: Using the dplyr package
dplyr is a widely used R package for manipulating data frames. It provides verbs (functions) for data manipulation such as filter, arrange, select, rename, and mutate.
To install the dplyr package, run the following command in the R console:
install.packages("dplyr")
Step 1: Load the dataset and the dplyr library.
R
# Build the demonstration data frame
no <- 1:10
subject <- c('A', 'A', 'B', 'B', 'B',
             'C', 'C', 'A', 'C', 'C')
mark <- c(2, 3, 5, 2, 5, 8, 17, 3, 5, 5)
group <- data.frame(roll = no, sub = subject,
                    marks = mark)
# Attach dplyr so that %>%, group_by() and slice() are available
library("dplyr")
Step 2: Group the data frame by sub using the group_by() verb (function), then select the row with the maximum marks in each group using slice() with which.max().
R
# Build the demonstration data frame
no <- 1:10
subject <- c('A', 'A', 'B', 'B', 'B',
             'C', 'C', 'A', 'C', 'C')
mark <- c(2, 3, 5, 2, 5,
          8, 17, 3, 5, 5)
group <- data.frame(roll = no, sub = subject,
                    marks = mark)
library("dplyr")
# Within each sub, keep the row at the position of the first maximum mark
group %>% group_by(sub) %>% slice(which.max(marks))
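Output:
The result is a grouped tibble with one row per subject: roll 2 (A, 3 marks), roll 3 (B, 5 marks), and roll 7 (C, 17 marks).
These are the rows with the maximum value in each group. Where marks are tied, which.max() returns the position of the first maximum, so only the first tied row is kept.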
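dplyr 1.0.0 and later also provide slice_max(), which can express the same selection and makes tie handling explicit. The sketch below assumes such a version is installed; one_per_group and all_ties are illustrative names, not part of the original example.

```r
library(dplyr)

# Rebuild the demonstration data frame
group <- data.frame(roll  = 1:10,
                    sub   = c('A', 'A', 'B', 'B', 'B',
                              'C', 'C', 'A', 'C', 'C'),
                    marks = c(2, 3, 5, 2, 5, 8, 17, 3, 5, 5))

# One row per subject: with_ties = FALSE keeps a single row among ties
one_per_group <- group %>%
  group_by(sub) %>%
  slice_max(marks, n = 1, with_ties = FALSE) %>%
  ungroup()

# Every row that shares the maximum mark in its subject
all_ties <- group %>%
  group_by(sub) %>%
  filter(marks == max(marks)) %>%
  ungroup()

one_per_group
all_ties
```

The first result has three rows, one per subject, while the second keeps both tied rows for A and for B, giving five rows in total.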