In this short guide, you’ll see how to export matplotlib charts to a PDF file.
More specifically, you’ll see how to export a scatter chart and a line chart into a single PDF file.
The Dataset Used for the Scatter Chart
Let’s say that you want to display the relationship between two variables:
- unemployment_rate
- index_price
You may then use a scatter diagram to depict such a relationship.
Here is an example of a dataset that you may use:
unemployment_rate | index_price
6.1               | 1500
5.8               | 1520
5.7               | 1525
5.7               | 1523
5.8               | 1515
5.6               | 1540
5.5               | 1545
5.3               | 1560
5.2               | 1555
5.2               | 1565
You can then capture the above data in Python using a pandas DataFrame:
import pandas as pd

data_1 = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
          'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
          }
df_1 = pd.DataFrame(data_1)

print(df_1)
Let’s now review the second dataset.
The Dataset Used for the Line Chart
Similarly, you may want to capture the trend over time for the following variables:
- year
- unemployment_rate
Line charts are useful for such situations. Here is the data that you may use:
year | unemployment_rate
1920 | 9.8
1930 | 12
1940 | 8
1950 | 7.2
1960 | 6.9
1970 | 7
1980 | 6.5
1990 | 6.2
2000 | 5.5
2010 | 6.3
You can then apply this code to create the second DataFrame:
import pandas as pd

data_2 = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
          'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
          }
df_2 = pd.DataFrame(data_2)

print(df_2)
Create the Scatter Chart
To create the scatter chart you’ll need to use the matplotlib package in Python.
The first DataFrame will be used to create the scatter chart:
import pandas as pd
import matplotlib.pyplot as plt

data_1 = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
          'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
          }
df_1 = pd.DataFrame(data_1)

plt.scatter(df_1['unemployment_rate'], df_1['index_price'], color='green')
plt.title('Unemployment Rate Vs Index Price', fontsize=14)
plt.xlabel('Unemployment Rate', fontsize=14)
plt.ylabel('Index Price', fontsize=14)
plt.grid(True)
plt.show()
Create the Line Chart
Next, you’ll need to use the second DataFrame to create the line chart:
import pandas as pd
import matplotlib.pyplot as plt

data_2 = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
          'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
          }
df_2 = pd.DataFrame(data_2)

plt.plot(df_2['year'], df_2['unemployment_rate'], color='red', marker='o')
plt.title('Unemployment Rate Vs Year', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.ylabel('Unemployment Rate', fontsize=14)
plt.grid(True)
plt.show()
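The chart above relies on pyplot's implicit "current figure". As a rough alternative, the sketch below draws the same line chart on an explicit Figure and Axes, which can make it easier to pass the chart around later. The helper name make_line_chart is hypothetical (it is not part of matplotlib), and the Agg backend is an assumption made so that the snippet runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd


def make_line_chart(df):
    """Hypothetical helper: draw the line chart and return its Figure."""
    fig, ax = plt.subplots()
    ax.plot(df['year'], df['unemployment_rate'], color='red', marker='o')
    ax.set_title('Unemployment Rate Vs Year', fontsize=14)
    ax.set_xlabel('Year', fontsize=14)
    ax.set_ylabel('Unemployment Rate', fontsize=14)
    ax.grid(True)
    return fig


data_2 = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
          'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
          }
df_2 = pd.DataFrame(data_2)

fig = make_line_chart(df_2)
```

To view the chart interactively instead, you could drop the matplotlib.use("Agg") line and call plt.show() after creating the figure.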
In the final section of this guide, you’ll see how to combine these pieces to export both matplotlib charts to a PDF.
Export the Matplotlib Charts to a PDF
To export the matplotlib charts to a PDF, you’ll need to import the PdfPages class from matplotlib’s PDF backend as follows:
from matplotlib.backends.backend_pdf import PdfPages
You’ll also need to specify the path where you’d like to export the PDF file. Here is an example of a path (where the file name is “charts” and the file extension is .pdf):
C:\Users\Ron\Desktop\charts.pdf
You’ll need to modify that path to the location where you’d like to store the PDF file on your computer.
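If you'd rather not hard-code a Windows-style path, one option is to build the path with Python's pathlib module. The sketch below assumes that a "Desktop" folder exists under your home directory, which may not be true on every system; the variable name pdf_path is only illustrative.

```python
from pathlib import Path

# Assumption: a "Desktop" folder under the home directory; adjust as needed.
pdf_path = Path.home() / "Desktop" / "charts.pdf"

print(pdf_path)
```

PdfPages accepts a path-like object, so pdf_path could be passed to it in place of the raw string.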
Putting all the components together:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

data_1 = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
          'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
          }
df_1 = pd.DataFrame(data_1)

data_2 = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
          'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
          }
df_2 = pd.DataFrame(data_2)

with PdfPages(r'C:\Users\Ron\Desktop\charts.pdf') as export_pdf:

    plt.scatter(df_1['unemployment_rate'], df_1['index_price'], color='green')
    plt.title('Unemployment Rate Vs Index Price', fontsize=14)
    plt.xlabel('Unemployment Rate', fontsize=14)
    plt.ylabel('Index Price', fontsize=14)
    plt.grid(True)
    export_pdf.savefig()
    plt.close()

    plt.plot(df_2['year'], df_2['unemployment_rate'], color='red', marker='o')
    plt.title('Unemployment Rate Vs Year', fontsize=14)
    plt.xlabel('Year', fontsize=14)
    plt.ylabel('Unemployment Rate', fontsize=14)
    plt.grid(True)
    export_pdf.savefig()
    plt.close()
Once you run the above code, a new PDF file (called charts) will be created at your specified location, with one chart per page.
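As a variation, you could create each chart as its own Figure object and pass it explicitly to savefig, then check how many pages were written. This is only a sketch under a few assumptions: it writes to a temporary folder rather than the Desktop, it uses the non-interactive Agg backend, and the names fig_1, fig_2, out_path and page_count are illustrative.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

df_1 = pd.DataFrame({
    'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
    'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
})
df_2 = pd.DataFrame({
    'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
    'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3],
})

# Scatter chart on its own Figure
fig_1, ax_1 = plt.subplots()
ax_1.scatter(df_1['unemployment_rate'], df_1['index_price'], color='green')
ax_1.set_title('Unemployment Rate Vs Index Price', fontsize=14)
ax_1.set_xlabel('Unemployment Rate', fontsize=14)
ax_1.set_ylabel('Index Price', fontsize=14)
ax_1.grid(True)

# Line chart on its own Figure
fig_2, ax_2 = plt.subplots()
ax_2.plot(df_2['year'], df_2['unemployment_rate'], color='red', marker='o')
ax_2.set_title('Unemployment Rate Vs Year', fontsize=14)
ax_2.set_xlabel('Year', fontsize=14)
ax_2.set_ylabel('Unemployment Rate', fontsize=14)
ax_2.grid(True)

# Assumption: a temporary folder stands in for your real export location
out_path = Path(tempfile.mkdtemp()) / "charts.pdf"

with PdfPages(out_path) as export_pdf:
    for fig in (fig_1, fig_2):
        export_pdf.savefig(fig)  # each call adds one page to the PDF
        plt.close(fig)           # release the figure once it has been saved
    page_count = export_pdf.get_pagecount()

print(page_count, out_path)
```

Passing the figure explicitly avoids depending on which figure happens to be "current", which may help once a script produces more than a couple of charts.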