Create a Scatter Diagram in Python using Matplotlib

To create a scatter diagram in Python using Matplotlib, you can use the following template, replacing the placeholder values and names with your own:

import matplotlib.pyplot as plt

x_axis = ["value_1", "value_2", "value_3", ...]
y_axis = ["value_1", "value_2", "value_3", ...]

plt.scatter(x_axis, y_axis)
plt.title("title name")
plt.xlabel("x_axis name")
plt.ylabel("y_axis name")
plt.show()

Steps to Create a Scatter Diagram in Python using Matplotlib

Step 1: Install the Matplotlib package

If you haven’t already done so, install Matplotlib using the following command:

pip install matplotlib
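
To confirm that the installation worked, one option is to check whether Python can locate the package. The sketch below uses only the standard library; the helper name is_installed is an illustrative assumption and not part of Matplotlib.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the named top-level package."""
    # find_spec returns None when the package cannot be found
    return importlib.util.find_spec(package_name) is not None


print("matplotlib installed:", is_installed("matplotlib"))
```

If this prints False, re-run the pip command in the same environment that you use to run your scripts.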

Step 2: Gather the data for the scatter diagram

Next, gather the data to be used for the scatter diagram.

For example, let’s say that you have the following dataset:

unemployment_rate    index_price
6.1                  1500
5.8                  1520
5.7                  1525
5.7                  1523
5.8                  1515
5.6                  1540
5.5                  1545
5.3                  1560
5.2                  1555
5.2                  1565
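
If your data is stored in a CSV file, one way to gather it is with Python's built-in csv module. The following is a minimal sketch that assumes the column names used in this example; the CSV text is embedded inline and shortened to the first three rows purely for illustration, and the file name mentioned in the comment is hypothetical.

```python
import csv
import io

# Illustrative CSV content (first three rows of the dataset).
# In practice you might read from a file instead, e.g. open("your_file.csv").
csv_text = """unemployment_rate,index_price
6.1,1500
5.8,1520
5.7,1525
"""

unemployment_rate = []
index_price = []

# DictReader maps each row to a dict keyed by the header names
for row in csv.DictReader(io.StringIO(csv_text)):
    unemployment_rate.append(float(row["unemployment_rate"]))
    index_price.append(int(row["index_price"]))

print(unemployment_rate)
print(index_price)
```

The two resulting lists have the same shape as the lists passed to plt.scatter in this guide.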

The ultimate goal is to depict the relationship between the unemployment_rate and the index_price.

You can accomplish this goal using a scatter diagram.
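
As a numerical complement to the diagram, the direction of the relationship can also be summarized with a Pearson correlation coefficient. This is a rough sketch using only the standard library; the function name pearson_r is an assumption made for illustration, and the calculation is an optional extra rather than a required step.

```python
import math

unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]


def pearson_r(xs, ys):
    """Compute the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(unemployment_rate, index_price)
print(f"Pearson r = {r:.3f}")
```

A negative value indicates that the index price tends to fall as the unemployment rate rises, which is the pattern the scatter diagram makes visible.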

Step 3: Create the scatter diagram in Python using Matplotlib

For this final step, use the template below to create a scatter diagram in Python:

import matplotlib.pyplot as plt

x_axis = ["value_1", "value_2", "value_3", ...]
y_axis = ["value_1", "value_2", "value_3", ...]

plt.scatter(x_axis, y_axis)
plt.title("title name")
plt.xlabel("x_axis name")
plt.ylabel("y_axis name")
plt.show()

For our example:

import matplotlib.pyplot as plt

unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]

plt.scatter(unemployment_rate, index_price, color="green")
plt.title("Unemployment Rate Vs Index Price", fontsize=14)
plt.xlabel("Unemployment Rate", fontsize=14)
plt.ylabel("Index Price", fontsize=14)
plt.grid(True)
plt.show()

Run the code in Python, and you’ll get the scatter diagram.
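
If you would rather save the diagram to a file than display it in a window, the same chart can be written to disk with savefig. The sketch below assumes a non-interactive environment (hence the Agg backend) and uses Matplotlib's Figure and Axes objects instead of the plt functions; the output file name is a hypothetical example.

```python
import matplotlib

# Assumption: no display is available, so select the non-interactive Agg backend
matplotlib.use("Agg")

import matplotlib.pyplot as plt

unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]

# Create a figure with a single set of axes and draw the same chart on it
fig, ax = plt.subplots()
ax.scatter(unemployment_rate, index_price, color="green")
ax.set_title("Unemployment Rate Vs Index Price", fontsize=14)
ax.set_xlabel("Unemployment Rate", fontsize=14)
ax.set_ylabel("Index Price", fontsize=14)
ax.grid(True)

# Write the figure to a PNG file (hypothetical file name), then release it
output_path = "scatter_diagram.png"
fig.savefig(output_path, dpi=150)
plt.close(fig)
```

The file is written to the current working directory unless output_path includes a folder.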

Optional: Create the Scatter Diagram using a Pandas DataFrame

So far, you have seen how to capture the dataset in Python using lists.

Optionally, you may capture the data in a Pandas DataFrame. The resulting diagram is the same in both cases.

Here is the Python code using a Pandas DataFrame:

import pandas as pd
import matplotlib.pyplot as plt

data = {
    "unemployment_rate": [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
    "index_price": [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
}

df = pd.DataFrame(data)

plt.scatter(df["unemployment_rate"], df["index_price"], color="green")
plt.title("Unemployment Rate Vs Index Price", fontsize=14)
plt.xlabel("Unemployment Rate", fontsize=14)
plt.ylabel("Index Price", fontsize=14)
plt.grid(True)
plt.show()

As before, you’ll get the same scatter diagram.
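
Pandas also offers its own plotting shortcut, DataFrame.plot.scatter, which draws with Matplotlib behind the scenes. The following is a minimal sketch of that alternative, assuming both pandas and Matplotlib are installed; it is a variation on the approach above, not a replacement for it.

```python
import pandas as pd

data = {
    "unemployment_rate": [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
    "index_price": [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
}

df = pd.DataFrame(data)

# plot.scatter takes the column names for each axis and returns a Matplotlib Axes
ax = df.plot.scatter(x="unemployment_rate", y="index_price", color="green")
ax.set_title("Unemployment Rate Vs Index Price", fontsize=14)
ax.set_xlabel("Unemployment Rate", fontsize=14)
ax.set_ylabel("Index Price", fontsize=14)
ax.grid(True)
```

To display the chart from a script, import matplotlib.pyplot as plt and call plt.show() after these lines.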