How to Create Scatter, Line, and Bar Charts using Matplotlib

Matplotlib is a popular Python module that can be used to create charts. In this guide, I’ll show you how to create Scatter, Line and Bar charts using matplotlib.

But before we begin, here is the general syntax that you may use to create your charts using matplotlib:

Scatter plot

import matplotlib.pyplot as plt

plt.scatter(xAxis,yAxis)
plt.title('title name')
plt.xlabel('xAxis name')
plt.ylabel('yAxis name')
plt.show()

Line chart

import matplotlib.pyplot as plt

plt.plot(xAxis,yAxis)
plt.title('title name')
plt.xlabel('xAxis name')
plt.ylabel('yAxis name')
plt.show()

Bar chart

import matplotlib.pyplot as plt

xPositions = [i + 0.5 for i, _ in enumerate(xAxis)]
plt.bar(xPositions,yAxis)
plt.title('title name')
plt.xlabel('xAxis name')
plt.ylabel('yAxis name')
plt.xticks(xPositions, xAxis)
plt.show()

Let’s now review the steps to create a Scatter plot.

How to Create Scatter Plots using Matplotlib

Scatter plots are used to depict a relationship between two variables.

For example, let’s say that you want to depict the relationship between:

  • The Unemployment Rate; and
  • The Stock Index Price

Here is the dataset associated with those two variables:

Unemployment_Rate    Stock_Index_Price
6.1                  1500
5.8                  1520
5.7                  1525
5.7                  1523
5.8                  1515
5.6                  1540
5.5                  1545
5.3                  1560
5.2                  1555
5.2                  1565

Before you plot that data, you’ll need to capture it in Python. I’ll use two different approaches to capture the data:

  • Lists
  • Pandas DataFrame

Create Scatter Plot using Lists

You can create simple lists, which will contain the values for the Unemployment Rate and the Stock Index Price:

Unemployment_Rate = [6.1,5.8,5.7,5.7,5.8,5.6,5.5,5.3,5.2,5.2]
Stock_Index_Price = [1500,1520,1525,1523,1515,1540,1545,1560,1555,1565]

To create the scatter plot based on the above data, you can apply the generic syntax that was introduced at the beginning of this guide. Your full Python code would look like this:

import matplotlib.pyplot as plt
   
Unemployment_Rate = [6.1,5.8,5.7,5.7,5.8,5.6,5.5,5.3,5.2,5.2]
Stock_Index_Price = [1500,1520,1525,1523,1515,1540,1545,1560,1555,1565]
  
plt.scatter(Unemployment_Rate, Stock_Index_Price, color='green')
plt.title('Unemployment Rate Vs Stock Index Price', fontsize=14)
plt.xlabel('Unemployment Rate', fontsize=14)
plt.ylabel('Stock Index Price', fontsize=14)
plt.grid(True)
plt.show()

Once you run the Python code, you’ll get the following Scatter plot:

[Figure: scatter plot of Unemployment Rate vs Stock Index Price]

As indicated earlier, this plot depicts the relationship between the Unemployment Rate and the Stock Index Price.

You may notice that a negative relationship exists between those two variables: when the Unemployment Rate increases, the Stock Index Price tends to fall.

Scatter diagrams are especially useful when applying linear regression. They can help you judge whether the relationship between the variables is roughly linear, which is one of the assumptions behind a linear regression model.
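The strength of that negative relationship can also be checked numerically. The snippet below is a minimal sketch that computes the Pearson correlation coefficient for the two lists using only the standard library. The helper name pearson_correlation is made up for this illustration and is not part of matplotlib.

```python
from math import sqrt

Unemployment_Rate = [6.1,5.8,5.7,5.7,5.8,5.6,5.5,5.3,5.2,5.2]
Stock_Index_Price = [1500,1520,1525,1523,1515,1540,1545,1560,1555,1565]

def pearson_correlation(x, y):
    # Hypothetical helper: Pearson's r computed from deviations about the means
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross_products / (spread_x * spread_y)

r = pearson_correlation(Unemployment_Rate, Stock_Index_Price)
print(round(r, 2))
```

A value close to -1 indicates a strong negative linear relationship, which is consistent with what the scatter plot shows.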

Let’s now create the exact same scatter plot, this time using a pandas DataFrame.

Create Scatter Plot using Pandas DataFrame

Another way to capture the data in Python is to use a pandas DataFrame.

You’ll need to install and then import the pandas module, in addition to the matplotlib module.

Using our example, you can then create the pandas DataFrame as follows:

from pandas import DataFrame

Data = {'Unemployment_Rate': [6.1,5.8,5.7,5.7,5.8,5.6,5.5,5.3,5.2,5.2],
        'Stock_Index_Price': [1500,1520,1525,1523,1515,1540,1545,1560,1555,1565]
       }
  
df = DataFrame(Data,columns=['Unemployment_Rate','Stock_Index_Price'])
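As a quick check that the DataFrame holds the same values as the lists, you can print it and convert a column back to a list. This is a small sketch that assumes pandas is installed; shape and tolist() are standard pandas features, while the variable name rates_as_list is only used here for illustration.

```python
from pandas import DataFrame

Data = {'Unemployment_Rate': [6.1,5.8,5.7,5.7,5.8,5.6,5.5,5.3,5.2,5.2],
        'Stock_Index_Price': [1500,1520,1525,1523,1515,1540,1545,1560,1555,1565]
       }

df = DataFrame(Data,columns=['Unemployment_Rate','Stock_Index_Price'])

print(df)        # one row per observation, one column per variable
print(df.shape)  # (number of rows, number of columns)

# Convert a column back to a plain list and compare it with the original values
rates_as_list = df['Unemployment_Rate'].tolist()
print(rates_as_list == Data['Unemployment_Rate'])
```

Because the columns contain exactly the values from the lists, plotting df['Unemployment_Rate'] against df['Stock_Index_Price'] should give the same chart as plotting the lists.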

And here is the full Python code to display the Scatter plot using the DataFrame:

from pandas import DataFrame
import matplotlib.pyplot as plt
   
Data = {'Unemployment_Rate': [6.1,5.8,5.7,5.7,5.8,5.6,5.5,5.3,5.2,5.2],
        'Stock_Index_Price': [1500,1520,1525,1523,1515,1540,1545,1560,1555,1565]
       }
  
df = DataFrame(Data,columns=['Unemployment_Rate','Stock_Index_Price'])
  
plt.scatter(df['Unemployment_Rate'], df['Stock_Index_Price'], color='green')
plt.title('Unemployment Rate Vs Stock Index Price', fontsize=14)
plt.xlabel('Unemployment Rate', fontsize=14)
plt.ylabel('Stock Index Price', fontsize=14)
plt.grid(True)
plt.show()

Once you run the above code, you’ll get the exact same Scatter plot as in the case of using lists:

[Figure: scatter plot of Unemployment Rate vs Stock Index Price]

Next, we’ll see how to create Line charts.

How to Create Line Charts using Matplotlib

Line charts are often used to display trends over time.

For example, imagine that you want to present the Unemployment Rate across time using the dataset below:

Year    Unemployment_Rate
1920    9.8
1930    12
1940    8
1950    7.2
1960    6.9
1970    7
1980    6.5
1990    6.2
2000    5.5
2010    6.3

As before, we’ll see how to create the Line chart using lists, and then via the DataFrame.

Create Line Chart using Lists

You may store the Years and the associated Unemployment Rates as lists:

Year = [1920,1930,1940,1950,1960,1970,1980,1990,2000,2010]
Unemployment_Rate = [9.8,12,8,7.2,6.9,7,6.5,6.2,5.5,6.3]

Using the Line chart syntax from the beginning of this guide, your full Python code would be:

import matplotlib.pyplot as plt
   
Year = [1920,1930,1940,1950,1960,1970,1980,1990,2000,2010]
Unemployment_Rate = [9.8,12,8,7.2,6.9,7,6.5,6.2,5.5,6.3]
  
plt.plot(Year, Unemployment_Rate, color='red', marker='o')
plt.title('Unemployment Rate Vs Year', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.ylabel('Unemployment Rate', fontsize=14)
plt.grid(True)
plt.show()

And once you run the Python code, you’ll see the trend of the Unemployment Rate across the years:

[Figure: line chart of Unemployment Rate vs Year]

You’ll notice that based on the data captured, the unemployment rate generally falls over time.
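To back up that reading of the chart, the decade-to-decade changes can be computed directly from the two lists. This is a minimal sketch using only built-in Python; the names changes and declines are introduced here for illustration.

```python
Year = [1920,1930,1940,1950,1960,1970,1980,1990,2000,2010]
Unemployment_Rate = [9.8,12,8,7.2,6.9,7,6.5,6.2,5.5,6.3]

# Change in the rate from each decade to the next
changes = [later - earlier
           for earlier, later in zip(Unemployment_Rate, Unemployment_Rate[1:])]

# Print each change next to the year in which that decade ends
for year, change in zip(Year[1:], changes):
    print(year, round(change, 1))

# Count how many of the changes are decreases
declines = sum(1 for change in changes if change < 0)
print(declines, 'of', len(changes), 'changes are declines')
```

Most of the changes are negative; the rate rises only in 1930, 1970 and 2010.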

Create Line Chart using pandas DataFrame

The DataFrame, for our example, should look like this:

from pandas import DataFrame

Data = {'Year': [1920,1930,1940,1950,1960,1970,1980,1990,2000,2010],
        'Unemployment_Rate': [9.8,12,8,7.2,6.9,7,6.5,6.2,5.5,6.3]
       }
  
df = DataFrame(Data,columns=['Year','Unemployment_Rate'])

Putting everything together:

from pandas import DataFrame
import matplotlib.pyplot as plt
   
Data = {'Year': [1920,1930,1940,1950,1960,1970,1980,1990,2000,2010],
        'Unemployment_Rate': [9.8,12,8,7.2,6.9,7,6.5,6.2,5.5,6.3]
       }
  
df = DataFrame(Data,columns=['Year','Unemployment_Rate'])
  
plt.plot(df['Year'], df['Unemployment_Rate'], color='red', marker='o')
plt.title('Unemployment Rate Vs Year', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.ylabel('Unemployment Rate', fontsize=14)
plt.grid(True)
plt.show()

You should get the same Line chart when running the code in Python:

[Figure: line chart of Unemployment Rate vs Year]

In the final section of this guide, you’ll see how to create a Bar chart.

How to Create Bar Charts using Matplotlib

Bar charts are used to display categorical data.

Let’s say that you want to use a Bar chart to display the GDP Per Capita for a sample of 5 countries:

Country    GDP_Per_Capita
USA        45000
Canada     42000
Germany    52000
UK         49000
France     47000

Unlike the previous examples, which included only numerical data, the dataset that will be used contains both text and numerical data.

Create a Bar chart using Lists

First, create the lists as follows:

Country = ['USA','Canada','Germany','UK','France']
GDP_Per_Capita = [45000,42000,52000,49000,47000]

Notice that the Country list contains text/strings (each value is wrapped in quotation marks), while the GDP_Per_Capita list contains numerical values without quotation marks.

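Recent versions of matplotlib accept text labels directly in plt.bar, but older versions did not. So that the code also works on those older versions, this guide first converts the labels to numeric positions: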

xAxis = [i + 0.5 for i, _ in enumerate(Country)]

Without the above portion, older versions of matplotlib raise the following error:

unsupported operand type(s) for -: 'str' and 'float'

You’ll also need the following line, which labels each numeric position with its country name:

plt.xticks([i + 0.5 for i, _ in enumerate(Country)], Country)
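To see what those two lines work with, you can print the positions on their own. This is a minimal sketch; the name positions is just for illustration.

```python
Country = ['USA','Canada','Germany','UK','France']

# enumerate() yields (index, label) pairs; only the index is used,
# shifted by 0.5 to give each bar a numeric x position
positions = [i + 0.5 for i, _ in enumerate(Country)]
print(positions)  # [0.5, 1.5, 2.5, 3.5, 4.5]

# Pair each position with the label that plt.xticks draws there
for position, label in zip(positions, Country):
    print(position, label)
```

The same list of positions is passed to both plt.bar and plt.xticks, so each label lines up with its bar.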

When you put all the components together, your full code to create a Bar chart would look like this:

import matplotlib.pyplot as plt
   
Country = ['USA','Canada','Germany','UK','France']
GDP_Per_Capita = [45000,42000,52000,49000,47000]

xAxis = [i + 0.5 for i, _ in enumerate(Country)]
  
plt.bar(xAxis, GDP_Per_Capita, color='teal')
plt.title('Country Vs GDP Per Capita', fontsize=14)
plt.xlabel('Country', fontsize=14)
plt.ylabel('GDP Per Capita', fontsize=14)
plt.xticks([i + 0.5 for i, _ in enumerate(Country)], Country)
plt.show()

Here is the result that you’ll get:

[Figure: bar chart of Country vs GDP Per Capita]
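If you are on a recent matplotlib release, plt.bar should also accept the text labels directly as the x values, in which case the numeric positions and the plt.xticks call are not needed. The following is a sketch under that assumption rather than a replacement for the code above; the variable name bars is only for illustration.

```python
import matplotlib.pyplot as plt

Country = ['USA','Canada','Germany','UK','France']
GDP_Per_Capita = [45000,42000,52000,49000,47000]

# Pass the country names straight to plt.bar (assumes a matplotlib
# version that supports text labels as x values)
bars = plt.bar(Country, GDP_Per_Capita, color='teal')
plt.title('Country Vs GDP Per Capita', fontsize=14)
plt.xlabel('Country', fontsize=14)
plt.ylabel('GDP Per Capita', fontsize=14)
```

Call plt.show() afterwards to display the chart, as in the earlier examples.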

Create a Bar chart using pandas DataFrame

You can capture the same data in a pandas DataFrame:

from pandas import DataFrame

Data = {'Country': ['USA','Canada','Germany','UK','France'],
        'GDP_Per_Capita': [45000,42000,52000,49000,47000]
       }
  
df = DataFrame(Data,columns=['Country','GDP_Per_Capita'])

And here is the full Python code to create the Bar Chart using the DataFrame:

from pandas import DataFrame
import matplotlib.pyplot as plt
   
Data = {'Country': ['USA','Canada','Germany','UK','France'],
        'GDP_Per_Capita': [45000,42000,52000,49000,47000]
       }
  
df = DataFrame(Data,columns=['Country','GDP_Per_Capita'])

xAxis = [i + 0.5 for i, _ in enumerate(df['Country'])]
  
plt.bar(xAxis, df['GDP_Per_Capita'].astype(float), color='teal')
plt.title('Country Vs GDP Per Capita', fontsize=14)
plt.xlabel('Country', fontsize=14)
plt.ylabel('GDP Per Capita', fontsize=14)
plt.xticks([i + 0.5 for i, _ in enumerate(df['Country'])], df['Country'])
plt.show()

You’ll get the exact same results:

[Figure: bar chart of Country vs GDP Per Capita]

You may want to check the following tutorial that explains how to place your matplotlib charts on a tkinter GUI.

Finally, you can find additional information about the matplotlib module by reviewing the matplotlib documentation.