Matplotlib - Scatter Plot



A scatter plot is a type of graph where individual data points are plotted on a two-dimensional plane. Each point represents the values of two variables, with one variable on the x-axis and the other on the y-axis. Scatter plots are useful for visualizing the relationship, such as a correlation, between two continuous variables.

[Figure: Scatter Plot]

Scatter Plot in Matplotlib

We can create a scatter plot in Matplotlib using the scatter() function. This function allows us to customize the appearance of the scatter plot, including markers, colors, and sizes of the points.

The scatter() Function

The scatter() function in Matplotlib takes two arrays or lists as input, where each array corresponds to the values of a different variable. The points are then plotted on a set of axes, with the position of each point determined by the values of the two variables.

The syntax of the scatter() function in Matplotlib is as follows −

plt.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None, vmin=None, vmax=None, ...)

Where,

  • x and y are arrays or lists holding the x-coordinate and y-coordinate values, respectively, of the data points.
  • s (optional) is the size of the markers, given in points squared. It can be a single number or one value per point.
  • c (optional) is the color of the markers. It can be a single color, a sequence of colors, or a sequence of numbers that are mapped to colors using cmap and norm.
  • marker (optional) is the marker style.
  • cmap (optional) is the colormap used to map numeric c values to colors.
  • norm (optional) is the Normalize instance that scales the data values to the 0 to 1 range before they are mapped to the colormap.
  • vmin, vmax (optional) are the minimum and maximum data values covered by the colormap when norm is not given.

These are just a few of the parameters; more optional parameters are available for customization.
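
The example below is a minimal sketch of how c, cmap, vmin, and vmax can work together. The data, the "values" list, and the choice of the 'viridis' colormap are assumptions made purely for illustration −

```python
import matplotlib.pyplot as plt

# Illustrative data: the "values" list is hypothetical and drives the colors
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 5, 3]
values = [0, 25, 50, 75, 100]

# Numeric c values are mapped to colors through cmap;
# vmin and vmax fix the data range that the colormap covers
points = plt.scatter(x, y, s=80, c=values, cmap='viridis', vmin=0, vmax=100)
plt.colorbar(points, label='Value')
plt.title('Scatter Plot with a Colormap')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()
```

Passing the object returned by scatter() to plt.colorbar() adds a color scale, so the numeric value behind each color can be read from the plot.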

Colored and Sized Scatter Plot

We can create a colored and sized scatter plot to represent each data point not only by its position on the plot but also by its color and size, providing additional information about the characteristics of each point.

Example

In the following example, we are creating a scatter plot to represent data points with x and y coordinates. We are using the "sizes" list to determine the size of each point, and the "colors" list to specify the color of each point −

import matplotlib.pyplot as plt

# Data
x = [1, 3, 5, 7, 9]
y = [10, 5, 15, 8, 12]
sizes = [30, 80, 120, 40, 60]
colors = ['red', 'green', 'blue', 'yellow', 'purple']

# Creating a scatter plot with varied sizes and colors
plt.scatter(x, y, s=sizes, c=colors)
plt.title('Colored and Sized Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()

Output

After executing the above code, we get the following output −

[Figure: Colored and Sized Scatter Plot]

Custom Marker Scatter Plot

A custom marker scatter plot is a scatter plot in which the data points are drawn with a user-chosen marker shape instead of the default circle. A distinctive shape makes the points easier to see and, when different shapes are used for different groups of data, helps tell those groups apart.

Example

Here, we are using red square markers (marker='s') with black edges in the scatter plot −

import matplotlib.pyplot as plt
x = [2, 4, 6, 8, 10]
y = [5, 8, 12, 6, 9]

# Creating a scatter plot with custom markers
plt.scatter(x, y, marker='s', color='red', edgecolors='black')
plt.title('Custom Marker Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()

Output

Following is the output of the above code −

[Figure: Custom Marker Scatter Plot]
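
A single call to scatter() uses one marker style for all of its points. To give different groups different shapes, one possible approach is to call scatter() once per group. The sketch below assumes two hypothetical groups, "Group A" and "Group B", with made-up coordinates −

```python
import matplotlib.pyplot as plt

# Hypothetical groups, each with its own coordinates and marker style
groups = {
    'Group A': {'x': [1, 2, 3], 'y': [4, 6, 5], 'marker': 's'},
    'Group B': {'x': [2, 3, 4], 'y': [9, 8, 10], 'marker': '^'},
}

# One scatter() call per group, because the marker applies to the whole call
collections = []
for label, data in groups.items():
    collection = plt.scatter(data['x'], data['y'], marker=data['marker'], label=label)
    collections.append(collection)

plt.legend()
plt.title('Scatter Plot with One Marker per Group')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()
```

The label argument gives each group an entry in the legend, so the marker shapes can be matched to their groups.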

Transparent Scatter Plot

In a transparent scatter plot, we plot data points with a certain level of transparency, allowing overlapping points to be partially visible. This transparency can help reveal patterns in densely populated areas of the plot.

Example

Now, we are creating a scatter plot with the level of transparency (i.e. alpha) set to 0.2 −

import matplotlib.pyplot as plt
x = [3, 6, 9, 12, 15]
y = [8, 12, 6, 10, 14]

# Creating a scatter plot with transparency
plt.scatter(x, y, alpha=0.2, edgecolors='black')
plt.title('Transparent Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()

Output

Output of the above code is as follows −

[Figure: Transparent Scatter Plot]
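
The benefit of transparency is easier to see when many points overlap. The following sketch assumes 1000 randomly generated points (NumPy is used only to create this illustrative data) and draws them with a low alpha value −

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative data: 1000 points drawn from a normal distribution
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = rng.normal(size=1000)

# With a low alpha, regions where many points overlap appear darker
dense = plt.scatter(x, y, alpha=0.2)
plt.title('Transparent Scatter Plot with Dense Data')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()
```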

Connected Scatter Plot

A connected scatter plot is a variation where data points are drawn individually, and lines are also drawn between consecutive points. This helps to show the order in which the points occur, for example when the data forms a sequence.

Example

In the example below, we are creating a scatter plot of individual data points with scatter() and connecting them with dashed lines (linestyle='--') drawn by plot() −

import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [10, 15, 5, 12, 8]

# Create a connected scatter plot: scatter() draws the markers,
# and plot() draws the dashed lines between consecutive points
plt.scatter(x, y, color='orange', marker='o')
plt.plot(x, y, linestyle='--', color='orange')
plt.title('Connected Scatter Plot with Dashed Lines')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()

Output

The output obtained is as shown below −

[Figure: Connected Scatter Plot]
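
As an alternative sketch, a similar figure can be produced with a single plot() call, since plot() accepts both a line style and a marker. This is a swapped-in technique rather than the approach used in the example above, and it gives less per-point control than scatter() −

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 15, 5, 12, 8]

# plot() draws the dashed line and a circle marker at each data point
lines = plt.plot(x, y, linestyle='--', marker='o', color='orange')
plt.title('Connected Scatter Plot with plot()')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()
```

Keeping scatter() for the points is still the better fit when each point needs its own size or color.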