The following syntax can be used to create a scatter diagram in Python using Matplotlib:
import matplotlib.pyplot as plt

x_axis = ['value_1', 'value_2', 'value_3', ...]
y_axis = ['value_1', 'value_2', 'value_3', ...]

plt.scatter(x_axis, y_axis)
plt.title('title name')
plt.xlabel('x_axis name')
plt.ylabel('y_axis name')
plt.show()
In the next section, you’ll see the steps to create a scatter diagram using a practical example.
Steps to Create a Scatter Diagram in Python using Matplotlib
Step 1: Install the Matplotlib module
If you haven’t already done so, install the matplotlib module using the following command (the same command works on Windows, macOS, and Linux):
pip install matplotlib
You may check this guide for the steps to install a module in Python using pip.
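To confirm that the installation worked before moving on, you can check from Python whether the module can be found. This is a minimal sketch using only the standard library; the helper name is_installed is a hypothetical name chosen for illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Hypothetical helper applied to the module installed in this step.
matplotlib_available = is_installed("matplotlib")
print("matplotlib installed:", matplotlib_available)
```

If this prints False, the pip command most likely ran against a different Python installation than the one you are using to run your scripts.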
Step 2: Gather the data for the scatter diagram
Next, gather the data to be used for the scatter diagram.
For example, let’s say that you have the following dataset:
unemployment_rate | index_price |
6.1 | 1500 |
5.8 | 1520 |
5.7 | 1525 |
5.7 | 1523 |
5.8 | 1515 |
5.6 | 1540 |
5.5 | 1545 |
5.3 | 1560 |
5.2 | 1555 |
5.2 | 1565 |
The ultimate goal is to depict the relationship between the unemployment_rate and the index_price.
You can accomplish this goal using a scatter diagram.
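Before plotting, you may also want a single number that summarizes the relationship. The sketch below computes the Pearson correlation coefficient by hand with the standard library, assuming the table values have been typed into two lists. The function name pearson_r is a hypothetical name used for illustration, and this calculation is an optional extra, not part of the scatter diagram itself.

```python
from math import sqrt

# Values copied from the table above.
unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(unemployment_rate, index_price)
print(round(r, 2))
```

A value close to -1 indicates a strong negative linear relationship, which is the pattern you should expect to see as a downward-sloping cloud of points in the scatter diagram.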
Step 3: Capture the data in Python
You can capture the above data in Python using lists:
unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]

print(unemployment_rate)
print(index_price)
If you run the above code in Python, you’ll get the following lists with the required information:
[6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
[1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
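Since plt.scatter expects the x and y values to have the same number of entries, it can be worth checking the two lists before plotting. Here is a minimal sketch of such a check; the helper name check_paired is a hypothetical name used for illustration.

```python
unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]


def check_paired(x_values, y_values):
    """Return the (x, y) pairs, or raise ValueError if the lengths differ."""
    if len(x_values) != len(y_values):
        raise ValueError(
            f"length mismatch: {len(x_values)} x values vs {len(y_values)} y values"
        )
    return list(zip(x_values, y_values))


pairs = check_paired(unemployment_rate, index_price)
print(len(pairs), "points; first:", pairs[0])
# prints: 10 points; first: (6.1, 1500)
```

Each pair corresponds to one row of the dataset, and to one point in the scatter diagram.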
Step 4: Create the scatter diagram in Python using Matplotlib
For this final step, you may use the template below to create a scatter diagram in Python:
import matplotlib.pyplot as plt

x_axis = ['value_1', 'value_2', 'value_3', ...]
y_axis = ['value_1', 'value_2', 'value_3', ...]

plt.scatter(x_axis, y_axis)
plt.title('title name')
plt.xlabel('x_axis name')
plt.ylabel('y_axis name')
plt.show()
For our example:
import matplotlib.pyplot as plt

unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]

plt.scatter(unemployment_rate, index_price, color='green')
plt.title('Unemployment Rate Vs Index Price', fontsize=14)
plt.xlabel('Unemployment Rate', fontsize=14)
plt.ylabel('Index Price', fontsize=14)
plt.grid(True)
plt.show()
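Run the code in Python, and you’ll get the scatter diagram. The points trend downward from left to right: in this dataset, a lower unemployment rate tends to go with a higher index price.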
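If you would rather save the diagram to an image file than open a window, one option is Matplotlib’s figure and axes objects together with savefig. The sketch below is one way to do it, not the only one. It assumes a non-interactive backend (Agg), and the file name scatter_diagram.png and the temporary-directory location are hypothetical choices made for illustration.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: renders to files, no window needed
import matplotlib.pyplot as plt

unemployment_rate = [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2]
index_price = [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]

# Create the figure and a single set of axes, then draw on the axes.
fig, ax = plt.subplots()
ax.scatter(unemployment_rate, index_price, color="green")
ax.set_title("Unemployment Rate Vs Index Price", fontsize=14)
ax.set_xlabel("Unemployment Rate", fontsize=14)
ax.set_ylabel("Index Price", fontsize=14)
ax.grid(True)

# Hypothetical output location; change it to wherever you want the image.
output_path = os.path.join(tempfile.gettempdir(), "scatter_diagram.png")
fig.savefig(output_path, dpi=150)
plt.close(fig)  # release the figure once it has been written
print("Saved to", output_path)
```

The calls on ax mirror the plt calls used above (plt.title becomes ax.set_title, and so on), so the resulting image should match the diagram shown by plt.show().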
Optional: Create the Scatter Diagram using a Pandas DataFrame
So far, you have seen how to capture the dataset in Python using lists (step 3 above).
Optionally, you may capture the data using a Pandas DataFrame. The result is the same in both cases.
Here is the Python code using a Pandas DataFrame:
import pandas as pd
import matplotlib.pyplot as plt

data = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
        'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
        }

df = pd.DataFrame(data)

plt.scatter(df['unemployment_rate'], df['index_price'], color='green')
plt.title('Unemployment Rate Vs Index Price', fontsize=14)
plt.xlabel('Unemployment Rate', fontsize=14)
plt.ylabel('Index Price', fontsize=14)
plt.grid(True)
plt.show()
You’ll get the same scatter diagram as before.
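Once the data is in a DataFrame, another option is to let Pandas draw the chart through its own plotting interface, which uses Matplotlib underneath. The following is a minimal sketch of that alternative using DataFrame.plot.scatter. It sets the Agg backend so that it runs without a display, which is an assumption made for this example.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: no display; remove this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

data = {
    "unemployment_rate": [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
    "index_price": [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
}
df = pd.DataFrame(data)

# Pandas selects the columns by name and returns the Matplotlib Axes it drew on.
ax = df.plot.scatter(x="unemployment_rate", y="index_price", c="green", grid=True)

# The returned Axes can be customized like any other Matplotlib Axes.
ax.set_title("Unemployment Rate Vs Index Price", fontsize=14)
ax.set_xlabel("Unemployment Rate", fontsize=14)
ax.set_ylabel("Index Price", fontsize=14)
```

In an interactive session, you would drop the matplotlib.use("Agg") line and call plt.show() at the end to display the chart.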