Create a Scatter Diagram in Python using Matplotlib

The following template can be used to create a scatter diagram in Python using Matplotlib. The quoted values are placeholders to be replaced with your own numeric data and labels:

import matplotlib.pyplot as plt

x_axis = ['value_1', 'value_2', 'value_3', ...]
y_axis = ['value_1', 'value_2', 'value_3', ...]

plt.scatter(x_axis, y_axis)
plt.title('title name')
plt.xlabel('x_axis name')
plt.ylabel('y_axis name')
plt.show()

In the next section, you’ll see the steps to create a scatter diagram using a practical example.

Steps to Create a Scatter Diagram in Python using Matplotlib

Step 1: Install the Matplotlib module

If you haven’t already done so, install the matplotlib package using the following command (run it in the Command Prompt under Windows, or in a terminal on other systems):

pip install matplotlib

You may check this guide for the steps to install a module in Python using pip.
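To confirm that the installation worked, a quick check along the following lines can be run. It is only a sketch: it imports the package and prints the version that Python finds, and the variable name installed_version is chosen here for illustration.

```python
import matplotlib

# If this import succeeds, the package is installed and visible to Python
installed_version = matplotlib.__version__
print(installed_version)
```

If the import fails with a ModuleNotFoundError, the package was most likely installed for a different Python interpreter than the one running the script.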

Step 2: Gather the data for the scatter diagram

Next, gather the data to be used for the scatter diagram.

For example, let’s say that you have the following dataset:

unemployment_rate index_price
6.1 1500
5.8 1520
5.7 1525
5.7 1523
5.8 1515
5.6 1540
5.5 1545
5.3 1560
5.2 1555
5.2 1565

The ultimate goal is to depict the relationship between the unemployment_rate and the index_price.

You can accomplish this goal using a scatter diagram.
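As a numeric companion to the diagram, the strength of the linear relationship can also be estimated with the Pearson correlation coefficient. The sketch below uses only the standard library and the values from the table above. The helper name pearson_r is hypothetical and is not part of the original steps.

```python
from math import sqrt

# Values copied from the dataset above
unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of deviations from the means
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


r = pearson_r(unemployment_rate, index_price)
print(round(r, 3))
```

For this dataset the coefficient comes out close to -1, which suggests that the index price tends to fall as the unemployment rate rises. The scatter diagram shows the same pattern visually.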

Step 3: Capture the data in Python

You can capture the above data in Python using lists:

unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]

print(unemployment_rate)
print(index_price)

If you run the above code in Python, you’ll get the following lists with the required information:

[6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
[1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
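Because plt.scatter pairs the values by position, both lists need the same number of entries. A small sanity check, sketched below under that assumption, can catch a missing value before plotting. The variable name points is illustrative.

```python
unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]

# A scatter diagram needs one y value for every x value
if len(unemployment_rate) != len(index_price):
    raise ValueError("x and y must contain the same number of values")

# Pair the values by position to preview the points that will be plotted
points = list(zip(unemployment_rate, index_price))
print(points[:3])  # [(6.1, 1500), (5.8, 1520), (5.7, 1525)]
```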

Step 4: Create the scatter diagram in Python using Matplotlib

For this final step, you can reuse the template from the start of this guide:

import matplotlib.pyplot as plt

x_axis = ['value_1', 'value_2', 'value_3', ...]
y_axis = ['value_1', 'value_2', 'value_3', ...]

plt.scatter(x_axis, y_axis)
plt.title('title name')
plt.xlabel('x_axis name')
plt.ylabel('y_axis name')
plt.show()

For our example:

import matplotlib.pyplot as plt

# Data captured in Step 3
unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]

# Plot the unemployment rate (x) against the index price (y) as green points
plt.scatter(unemployment_rate, index_price, color='green')
plt.title('Unemployment Rate Vs Index Price', fontsize=14)
plt.xlabel('Unemployment Rate', fontsize=14)
plt.ylabel('Index Price', fontsize=14)
plt.grid(True)  # add grid lines so values are easier to read
plt.show()

Run the code in Python, and you’ll get a scatter diagram of green points in which the index price tends to be higher when the unemployment rate is lower.
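If the diagram needs to be kept as an image file rather than only shown in a window, one possible variation is sketched below. It uses Matplotlib's object-oriented interface (plt.subplots with Axes methods) and fig.savefig. The file name scatter_diagram.png and the use of the temporary directory are assumptions made for illustration, as is the non-interactive Agg backend.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is required
import matplotlib.pyplot as plt

unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]

# Create a figure and a single set of axes, then draw on the axes
fig, ax = plt.subplots()
ax.scatter(unemployment_rate, index_price, color="green")
ax.set_title("Unemployment Rate Vs Index Price", fontsize=14)
ax.set_xlabel("Unemployment Rate", fontsize=14)
ax.set_ylabel("Index Price", fontsize=14)
ax.grid(True)

# Hypothetical output location; any writable path would work
output_path = os.path.join(tempfile.gettempdir(), "scatter_diagram.png")
fig.savefig(output_path, dpi=150)
plt.close(fig)  # release the figure once it has been written
print(output_path)
```

The Axes methods set_title, set_xlabel, and set_ylabel correspond to plt.title, plt.xlabel, and plt.ylabel in the example above, so the saved image should match the diagram shown on screen.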

Optional: Create the Scatter Diagram using a Pandas DataFrame

So far, you have seen how to capture the dataset in Python using lists (step 3 above).

Alternatively, you can capture the data in a Pandas DataFrame. The result is the same in both cases.

Here is the Python code using Pandas DataFrame:

import pandas as pd
import matplotlib.pyplot as plt

data = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
        'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
        }

df = pd.DataFrame(data)

plt.scatter(df['unemployment_rate'], df['index_price'], color='green')
plt.title('Unemployment Rate Vs Index Price', fontsize=14)
plt.xlabel('Unemployment Rate', fontsize=14)
plt.ylabel('Index Price', fontsize=14)
plt.grid(True)
plt.show()

As before, you’ll get the same scatter diagram.
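Once the data is in a DataFrame, another option is to let pandas create the diagram through its own plotting interface, DataFrame.plot.scatter, which uses Matplotlib underneath. The following is a minimal sketch of that variation rather than part of the original steps. It returns a Matplotlib Axes object, stored here in the illustrative variable ax.

```python
import pandas as pd

data = {
    "unemployment_rate": [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
    "index_price": [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
}
df = pd.DataFrame(data)

# Columns are referenced by name; pandas draws the points with Matplotlib
ax = df.plot.scatter(
    x="unemployment_rate",
    y="index_price",
    color="green",
    grid=True,
    title="Unemployment Rate Vs Index Price",
)

# Replace the default column-name labels with the labels used earlier
ax.set_xlabel("Unemployment Rate", fontsize=14)
ax.set_ylabel("Index Price", fontsize=14)
```

Importing matplotlib.pyplot as plt and calling plt.show() afterwards displays the figure in the same way as in the earlier examples.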