Correlation Matrix in R Programming

Last Updated : 30 Apr, 2025

Correlation measures the relationship between two variables, specifically the degree of linear association between them. In R, a correlation matrix stores the correlation coefficient for every pair of variables in a dataset; each coefficient lies between -1 and 1.

  • A value of -1 indicates a perfect negative linear relationship.
  • A value of 1 indicates a perfect positive linear relationship.
  • A value of 0 indicates no linear relationship. It does not by itself imply that the two variables are independent, since a nonlinear relationship can still exist.
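As a minimal sketch of these three cases, the toy vectors below (made up for illustration, not taken from any dataset used in this article) produce correlations of 1, -1, and 0. The last case also shows why a correlation of 0 only rules out a linear relationship: y_zero is fully determined by x.

```r
# Toy data, invented for illustration
x <- c(-2, -1, 0, 1, 2)

y_pos  <- 2 * x + 1   # increases linearly with x
y_neg  <- -3 * x      # decreases linearly with x
y_zero <- x^2         # depends on x, but not linearly

cor(x, y_pos)    # perfect positive linear relationship
cor(x, y_neg)    # perfect negative linear relationship
cor(x, y_zero)   # no linear relationship, although y_zero is a function of x
```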

Properties of Correlation Matrix

  1. All diagonal elements of a correlation matrix must be 1, because the correlation of a variable with itself is always perfect: [Tex]c_{ii} = 1[/Tex].
  2. It must be symmetric: [Tex]c_{ij} = c_{ji}[/Tex].
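Both properties can be checked numerically. The sketch below uses the built-in mtcars dataset as an assumed example; both checks should return TRUE for any correlation matrix produced by cor() on complete data.

```r
# Correlation matrix of a built-in dataset (all columns are numeric)
M <- cor(mtcars)

# Property 1: every diagonal element equals 1 (compared with a numeric tolerance)
all.equal(unname(diag(M)), rep(1, ncol(M)))

# Property 2: the matrix equals its own transpose
isSymmetric(M)
```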

Computing Correlation Between Variables in R

In the R programming language, a correlation matrix can be computed using the cor() function, which has the following syntax:

Syntax: cor(x, use = , method = )

Parameters:

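x: A numeric matrix or a data frame whose columns are all numeric.
use: Specifies how missing values are handled. The default is everything, which propagates missing values: a result is NA whenever one of its contributing observations is NA.

  • all.obs: assumes that the data has no missing values and throws an error if it does.
  • complete.obs: listwise deletion; any row containing a missing value is dropped before the correlations are computed.
  • pairwise.complete.obs: pairwise deletion; each pair of columns uses every row in which both values are present.

method: Specifies which correlation coefficient to compute: "pearson" (the default), "kendall", or "spearman".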
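To make the use and method arguments concrete, here is a small sketch on a made-up data frame (df and its values are hypothetical) that contains missing values. It compares the default behavior with listwise and pairwise deletion, then switches to Spearman's rank correlation.

```r
# Hypothetical data with missing values in columns a and c
df <- data.frame(
  a = c(1, 2, 3, 4, NA),
  b = c(2, 4, 5, 4, 5),
  c = c(5, NA, 3, 2, 1)
)

# Default (use = "everything"): pairs that involve missing values come out as NA
cor(df)

# Listwise deletion: only rows 1, 3 and 4 are complete, so only they are used
cor_listwise <- cor(df, use = "complete.obs")
cor_listwise

# Pairwise deletion: the a-b entry uses rows 1 to 4, where both a and b are present
cor_pairwise <- cor(df, use = "pairwise.complete.obs")
cor_pairwise

# Rank-based coefficient instead of Pearson
cor(df, use = "complete.obs", method = "spearman")
```

Pairwise deletion keeps more data per entry, but each entry may then be based on a different set of rows, which is worth keeping in mind when comparing coefficients.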

Example: Calculating and Displaying the Correlation Matrix of a Dataset

We are loading a dataset from a CSV file using read.csv() and storing it in the data variable. The head() function displays the first few rows of the dataset. Then, we calculate the correlation matrix of the dataset using the cor() function and store it in cor_data.

R
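data <- read.csv("https://people.sc.fsu.edu/~jburkardt/data/csv/ford_escort.csv",
                 header = TRUE, fileEncoding = "latin1")

print("Original Data")
head(data)

# cor() requires every column to be numeric
cor_data <- cor(data)

print("Correlation matrix")
# return() is only valid inside a function, so print the result instead
print(cor_data)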

Output:

Figure: Correlation matrix
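cor() stops with an error if any column is non-numeric. As a hedged sketch of one way to handle that, the example below uses the built-in iris dataset (an assumption; it is not the dataset from the example above), keeps only its numeric columns, and rounds the result for readability.

```r
data(iris)

# Keep only numeric columns; Species is a factor and would make cor() fail
numeric_cols <- iris[sapply(iris, is.numeric)]

cor_iris <- cor(numeric_cols)

# Round to two decimals for a more readable table
round(cor_iris, 2)
```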

Computing Correlation Coefficients and P-values in R

The Hmisc package provides the rcorr() function, which calculates correlation coefficients and generates a table of p-values for all possible pairs of columns in a matrix. It supports both Pearson and Spearman correlations, which lets us assess the strength of each relationship together with its statistical significance.

A p-value is the probability of observing a correlation at least as strong as the one in the sample if the true correlation were zero. A low p-value (≤ 0.05 is a common threshold) suggests that the observed correlation is unlikely to be due to chance alone, while a high p-value (> 0.05) means the data does not provide enough evidence of a relationship. Checking p-values helps to avoid drawing conclusions from correlations that may be noise.
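For a single pair of variables, base R's cor.test() reports the coefficient together with its p-value. The sketch below applies it to two columns of the built-in mtcars dataset, chosen here purely as an illustration.

```r
data(mtcars)

# Test whether the correlation between fuel efficiency and weight differs from zero
test <- cor.test(mtcars$mpg, mtcars$wt, method = "pearson")

test$estimate   # the correlation coefficient
test$p.value    # p-value for the null hypothesis of zero correlation
test$conf.int   # 95% confidence interval for the coefficient
```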

Syntax:

rcorr(x, type = c("pearson", "spearman"))

To use this function in R, we need to install the Hmisc package and load it into the environment.

R
install.packages("Hmisc")
library("Hmisc")

Example: Calculating Correlation Coefficients and P-Values Using rcorr()

We install the Hmisc package and load it with library(). Then, we use the rcorr() function to calculate the correlation coefficients and p-values for the data dataset, after converting it to a matrix using as.matrix(). Despite its name, the p_values variable holds the full result: the correlation coefficients, the number of observations used for each pair, and the p-values. Finally, we print it.

R
install.packages("Hmisc")
library("Hmisc")

p_values <- rcorr(as.matrix(data))

print(p_values)

Output:

Figure: P-values table
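The object returned by rcorr() is a list, so its parts can be read individually. The following sketch assumes Hmisc is already installed and uses three mtcars columns as stand-in data; the component names r, n and P are those documented for rcorr().

```r
# Runs only if Hmisc is available; install it first with install.packages("Hmisc")
if (requireNamespace("Hmisc", quietly = TRUE)) {
  vars <- as.matrix(mtcars[, c("mpg", "wt", "hp")])

  # Spearman correlations instead of the default Pearson
  res <- Hmisc::rcorr(vars, type = "spearman")

  print(round(res$r, 2))   # correlation coefficients
  print(res$n)             # number of observations used for each pair
  print(res$P)             # p-values
}
```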

Visualizing a Correlation Matrix in R

To visualize a correlation matrix in R, we use the corrplot package to create a correlogram. We install the package with the install.packages() function and load it into the R script with the library() function.

R
install.packages("corrplot")
library("corrplot")

A correlogram is a visual representation of a correlation matrix, showing the strength and direction of relationships between variables. Different types of correlogram can be plotted using the corrplot() function.

1. Visualize Correlogram as a circle chart

We install the corrplot package, calculate the correlation matrix for the mtcars dataset, and visualize it as a correlogram using circles to show the strength and direction of the correlations.

R
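install.packages("corrplot")
library(corrplot)
data(mtcars)

# Correlation matrix of all mtcars columns (all are numeric)
M <- cor(mtcars)

# Draw each coefficient as a circle
corrplot(M, method = "circle")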

Output:

Figure: Circle plot
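Because a correlation matrix is symmetric, half of the plot is redundant. As a sketch of one possible refinement (not part of the original example), the call below assumes corrplot is installed and uses its type, order and tl.col arguments to show only the upper triangle, group similar variables together, and draw the labels in black.

```r
# Runs only if corrplot is available; install it first with install.packages("corrplot")
if (requireNamespace("corrplot", quietly = TRUE)) {
  M <- cor(mtcars)

  corrplot::corrplot(
    M,
    method = "circle",
    type   = "upper",    # show only the upper triangle
    order  = "hclust",   # reorder variables by hierarchical clustering
    tl.col = "black"     # color of the variable labels
  )
}
```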

2. Visualize Correlogram as a pie chart

We are visualizing the correlation matrix M as a correlogram using pie charts to represent the strength and direction of correlations between variables.

R
corrplot(M, method="pie")


3. Visualize Correlogram as a color chart

We are visualizing the correlation matrix M as a correlogram using colors to represent the strength and direction of the correlations between variables.

R
corrplot(M, method="color")


4. Visualize Correlogram as numbers

We are visualizing the correlation matrix M as a correlogram by displaying the correlation coefficients as numbers in each cell.

R
corrplot(M, method="number")


5. Visualize Correlogram as ellipses

We are visualizing the correlation matrix M as a correlogram using ellipses to represent the strength and direction of correlations between variables.

R
corrplot(M, method="ellipse")


6. Visualize Correlogram as a shaded chart

We are visualizing the correlation matrix M using shaded colors, where the shading intensity represents the strength and direction of the correlations between variables.

R
corrplot(M, method="shade")


Choosing the Right Visualization Type

We choose the visualization method that best suits our needs or preferences.

| Method | Description | What it Highlights |
|---|---|---|
| Circle | Displays correlations as circles. | Highlights the strength (size) and direction (color) of the correlation. A good option for a simple, intuitive overview. |
| Pie | Displays correlations as pie charts. | Highlights the proportions of correlations, with slice size representing the strength. Best for visualizing the relative size of correlations. |
| Color | Displays correlations using a color gradient. | Highlights the strength and direction of correlations using a color scale. Great for clearly distinguishing positive and negative correlations. |
| Number | Displays correlation coefficients as numbers inside each cell. | Highlights the exact numerical values of correlations. Useful for precise analysis where you need the exact strength of relationships. |
| Ellipse | Displays correlations as ellipses. | Highlights the linear relationship strength and direction with the shape and orientation of ellipses. Best for identifying patterns visually. |
| Shade | Displays correlations with shaded areas. | Highlights the strength of correlations with varying intensity. Useful when you want to emphasize the magnitude of correlations visually. |
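When no single method is enough, two can be combined in one plot. The sketch below assumes corrplot is installed and uses its corrplot.mixed() function to print exact coefficients in the lower triangle and circles in the upper triangle; treat it as one possible layout rather than a recommendation.

```r
# Runs only if corrplot is available; install it first with install.packages("corrplot")
if (requireNamespace("corrplot", quietly = TRUE)) {
  M <- cor(mtcars)

  # Numbers below the diagonal for precision, circles above it for a quick overview
  corrplot::corrplot.mixed(M, lower = "number", upper = "circle")
}
```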

In this article, we explored how to compute and visualize correlation matrices in R, using the cor() function and the corrplot package to assess relationships between variables.


