How to Export Matplotlib Charts to a PDF

In this short guide, you’ll see how to export matplotlib charts to a PDF file.

More specifically, you’ll export a scatter chart and a line chart to a single PDF file, one chart per page.

The Dataset Used for the Scatter Chart

Let’s say that you want to display the relationship between two variables:

  • unemployment_rate
  • index_price

You can use a scatter chart to depict such a relationship.

Here is an example of a dataset that you may use:

unemployment_rate  index_price
6.1                1500
5.8                1520
5.7                1525
5.7                1523
5.8                1515
5.6                1540
5.5                1545
5.3                1560
5.2                1555
5.2                1565

You can then capture the above data in Python using a pandas DataFrame:

import pandas as pd

data_1 = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
          'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
          }

df_1 = pd.DataFrame(data_1)
print(df_1)
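
Before plotting, it can help to confirm that the DataFrame was built as expected and to get a rough sense of the relationship between the two columns. The following is a minimal sketch, not part of the original workflow: it prints the shape of the DataFrame and the Pearson correlation between the columns. The variable name correlation is chosen here only for illustration.

```python
import pandas as pd

data_1 = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
          'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
          }

df_1 = pd.DataFrame(data_1)

# Number of rows and columns in the DataFrame
print(df_1.shape)

# Pearson correlation between the two columns (the default method of Series.corr)
correlation = df_1['unemployment_rate'].corr(df_1['index_price'])
print(correlation)
```

A negative value here means the index price tends to rise as the unemployment rate falls, which is the pattern the scatter chart should show.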

Let’s now review the second dataset.

The Dataset Used for the Line Chart

Similarly, you may want to capture the trend over time for the following variables:

  • year
  • unemployment_rate

Line charts are useful for such situations. Here is the data that you may use:

year  unemployment_rate
1920  9.8
1930  12
1940  8
1950  7.2
1960  6.9
1970  7
1980  6.5
1990  6.2
2000  5.5
2010  6.3

You can then apply this code to create the second DataFrame:

import pandas as pd

data_2 = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
          'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
          }

df_2 = pd.DataFrame(data_2)
print(df_2)

Create the Scatter Chart

To create the scatter chart, you’ll need to use the matplotlib package in Python.

The first DataFrame will be used to create the scatter chart:

import pandas as pd
import matplotlib.pyplot as plt

data_1 = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
          'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
          }

df_1 = pd.DataFrame(data_1)

plt.scatter(df_1['unemployment_rate'], df_1['index_price'], color='green')
plt.title('Unemployment Rate Vs Index Price', fontsize=14)
plt.xlabel('Unemployment Rate', fontsize=14)
plt.ylabel('Index Price', fontsize=14)
plt.grid(True)
plt.show()

Create the Line Chart

Next, you’ll need to use the second DataFrame to create the line chart:

import pandas as pd
import matplotlib.pyplot as plt

data_2 = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
          'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
          }

df_2 = pd.DataFrame(data_2)

plt.plot(df_2['year'], df_2['unemployment_rate'], color='red', marker='o')
plt.title('Unemployment Rate Vs Year', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.ylabel('Unemployment Rate', fontsize=14)
plt.grid(True)
plt.show()
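
If only one chart needs to be exported, a multi-page PDF is not strictly required. The following is a minimal sketch that saves the line chart on its own with plt.savefig, which infers the PDF format from the .pdf file extension. The file name line_chart.pdf, the temporary output folder, and the non-interactive Agg backend are assumptions made for illustration only.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, so no window is opened
import matplotlib.pyplot as plt
import pandas as pd

data_2 = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
          'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
          }

df_2 = pd.DataFrame(data_2)

# Hypothetical output location: a temporary folder, used only for this example
output_path = os.path.join(tempfile.mkdtemp(), 'line_chart.pdf')

plt.plot(df_2['year'], df_2['unemployment_rate'], color='red', marker='o')
plt.title('Unemployment Rate Vs Year', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.ylabel('Unemployment Rate', fontsize=14)
plt.grid(True)

# The output format is inferred from the .pdf extension
plt.savefig(output_path)
plt.close()

print(output_path)
```

When a script both saves and displays a chart, it is usually safer to call plt.savefig before plt.show, since the figure may no longer be available once the window is closed.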

In the final section of this guide, you’ll see how to combine these pieces to export both matplotlib charts to a PDF.

Export the Matplotlib Charts to a PDF

To export the matplotlib charts to a PDF, you’ll need to import the PdfPages class from matplotlib’s PDF backend as follows:

from matplotlib.backends.backend_pdf import PdfPages

You’ll also need to specify the path where you’d like to export the PDF file. Here is an example of a path (where the file name is “charts” and the file extension is .pdf):

C:\Users\Ron\Desktop\charts.pdf

You’ll need to modify that path to the location where you’d like to store the PDF file on your computer.
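
A hard-coded Windows path only works on one machine, and its backslashes have to be written as a raw string (r'...') so Python does not treat them as escape sequences. As an alternative, here is a minimal sketch that builds the path with pathlib. It assumes the PDF should be stored in a Desktop folder inside the current user’s home directory, and the variable name pdf_path is chosen here for illustration.

```python
from pathlib import Path

# Assumption: the PDF should land in the current user's Desktop folder.
# Path.home() resolves to the user's home directory on Windows, macOS, and Linux.
pdf_path = Path.home() / 'Desktop' / 'charts.pdf'

print(pdf_path)

# Check that the target folder exists before trying to write the file
print(pdf_path.parent.exists())
```

The resulting path can be converted with str(pdf_path) wherever a plain string is expected.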

Putting all the components together:

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

data_1 = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
          'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
          }

df_1 = pd.DataFrame(data_1)

data_2 = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
          'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
          }

df_2 = pd.DataFrame(data_2)

with PdfPages(r'C:\Users\Ron\Desktop\charts.pdf') as export_pdf:
    # Page 1: scatter chart
    plt.scatter(df_1['unemployment_rate'], df_1['index_price'], color='green')
    plt.title('Unemployment Rate Vs Index Price', fontsize=14)
    plt.xlabel('Unemployment Rate', fontsize=14)
    plt.ylabel('Index Price', fontsize=14)
    plt.grid(True)
    export_pdf.savefig()  # write the current figure as a new page
    plt.close()  # close it so the next chart starts on a fresh figure

    # Page 2: line chart
    plt.plot(df_2['year'], df_2['unemployment_rate'], color='red', marker='o')
    plt.title('Unemployment Rate Vs Year', fontsize=14)
    plt.xlabel('Year', fontsize=14)
    plt.ylabel('Unemployment Rate', fontsize=14)
    plt.grid(True)
    export_pdf.savefig()
    plt.close()

Once you run the above code, a new PDF file (called charts) will be created at your specified location, with the scatter chart on the first page and the line chart on the second.
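
As the number of charts grows, it may be cleaner to build each chart as its own Figure object and pass it explicitly to savefig. The following is a sketch under stated assumptions rather than a definitive implementation: the helper functions build_scatter and build_line are hypothetical names introduced here, the output goes to a temporary folder, and the Agg backend is selected so the script runs without opening a window. It also calls get_pagecount to confirm how many pages were written.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, so no window is opened
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

df_1 = pd.DataFrame({
    'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
    'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
})

df_2 = pd.DataFrame({
    'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
    'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3],
})


def build_scatter(df):
    """Hypothetical helper: return a Figure holding the scatter chart."""
    fig, ax = plt.subplots()
    ax.scatter(df['unemployment_rate'], df['index_price'], color='green')
    ax.set_title('Unemployment Rate Vs Index Price', fontsize=14)
    ax.set_xlabel('Unemployment Rate', fontsize=14)
    ax.set_ylabel('Index Price', fontsize=14)
    ax.grid(True)
    return fig


def build_line(df):
    """Hypothetical helper: return a Figure holding the line chart."""
    fig, ax = plt.subplots()
    ax.plot(df['year'], df['unemployment_rate'], color='red', marker='o')
    ax.set_title('Unemployment Rate Vs Year', fontsize=14)
    ax.set_xlabel('Year', fontsize=14)
    ax.set_ylabel('Unemployment Rate', fontsize=14)
    ax.grid(True)
    return fig


# Hypothetical output location: a temporary folder, used only for this example
output_path = os.path.join(tempfile.mkdtemp(), 'charts.pdf')

with PdfPages(output_path) as export_pdf:
    for fig in (build_scatter(df_1), build_line(df_2)):
        export_pdf.savefig(fig)  # each figure becomes one page
        plt.close(fig)  # release the figure once it has been written
    page_count = export_pdf.get_pagecount()

print(page_count)
```

Passing the figure explicitly avoids relying on whichever figure happens to be current, which makes the page order easier to reason about when charts are created in separate functions.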
