In this short guide, you’ll see how to export Matplotlib charts to a PDF file.
More specifically, you’ll export a scatter chart and a line chart into a single PDF file, with one chart per page.
The Dataset used for the Scatter Chart
Let’s say that you want to display the relationship between two variables:
- unemployment_rate
- index_price
You may then use a scatter diagram to depict such a relationship.
Here is an example of a dataset that you may use:
unemployment_rate | index_price |
6.1 | 1500 |
5.8 | 1520 |
5.7 | 1525 |
5.7 | 1523 |
5.8 | 1515 |
5.6 | 1540 |
5.5 | 1545 |
5.3 | 1560 |
5.2 | 1555 |
5.2 | 1565 |
You can then capture the above data in Python using a pandas DataFrame:
import pandas as pd
data_1 = {
"unemployment_rate": [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
"index_price": [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
}
df_1 = pd.DataFrame(data_1)
print(df_1)
Let’s now review the second dataset.
The Dataset used for the Line Chart
Similarly, you may want to capture the trend over time for the following variables:
- year
- unemployment_rate
Line charts are useful for such situations. Here is the data that you may use:
year | unemployment_rate |
1920 | 9.8 |
1930 | 12 |
1940 | 8 |
1950 | 7.2 |
1960 | 6.9 |
1970 | 7 |
1980 | 6.5 |
1990 | 6.2 |
2000 | 5.5 |
2010 | 6.3 |
You can then apply this code to create the second DataFrame:
import pandas as pd
data_2 = {
"year": [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
"unemployment_rate": [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3],
}
df_2 = pd.DataFrame(data_2)
print(df_2)
Create the Scatter Chart
Create the scatter chart using the first DataFrame:
import pandas as pd
import matplotlib.pyplot as plt
data_1 = {
"unemployment_rate": [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
"index_price": [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
}
df_1 = pd.DataFrame(data_1)
plt.scatter(df_1["unemployment_rate"], df_1["index_price"], color="green")
plt.title("Unemployment Rate Vs Index Price", fontsize=14)
plt.xlabel("Unemployment Rate", fontsize=14)
plt.ylabel("Index Price", fontsize=14)
plt.grid(True)
plt.show()
Create the Line Chart
Next, create the line chart using the second DataFrame:
import pandas as pd
import matplotlib.pyplot as plt
data_2 = {
"year": [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
"unemployment_rate": [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3],
}
df_2 = pd.DataFrame(data_2)
plt.plot(df_2["year"], df_2["unemployment_rate"], color="red", marker="o")
plt.title("Unemployment Rate Vs Year", fontsize=14)
plt.xlabel("Year", fontsize=14)
plt.ylabel("Unemployment Rate", fontsize=14)
plt.grid(True)
plt.show()
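If only one chart is needed, it can be written to a PDF directly, since plt.savefig picks the output format from the file extension. The sketch below is a minimal illustration under a few assumptions that are not part of the original guide: the file name line_chart.pdf, the use of the system temporary folder as the destination, and the non-interactive Agg backend (so that no chart window has to open).

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

data_2 = {
    "year": [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
    "unemployment_rate": [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3],
}
df_2 = pd.DataFrame(data_2)

# Hypothetical output location: a file in the system temporary folder
single_pdf = Path(tempfile.gettempdir()) / "line_chart.pdf"

plt.plot(df_2["year"], df_2["unemployment_rate"], color="red", marker="o")
plt.title("Unemployment Rate Vs Year", fontsize=14)
plt.xlabel("Year", fontsize=14)
plt.ylabel("Unemployment Rate", fontsize=14)
plt.grid(True)

# The ".pdf" extension tells Matplotlib to write a PDF file
plt.savefig(single_pdf)
plt.close()

print(single_pdf.exists())
```

This shortcut produces a one-page file. To place several charts in the same PDF, a multi-page writer is needed instead.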
In the final section of this guide, you’ll see how to put together all the needed components in order to export the Matplotlib charts to a PDF.
Export the Matplotlib Charts to a PDF
To export the Matplotlib charts to a PDF, you’ll first need to import the PdfPages class from Matplotlib’s PDF backend as follows:
from matplotlib.backends.backend_pdf import PdfPages
You’ll also have to specify the path where you’d like to export the PDF file. Here is an example of a path (where the file name is “charts” and the file extension is .pdf):
C:\Users\Ron\Desktop\charts.pdf
Modify that path to the location where you’d like to store the PDF file on your computer.
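As an alternative to typing a full Windows path, the location can be built with Python’s pathlib module, which works the same way on Windows, macOS, and Linux. This is only a sketch: the choice of the current working directory as the destination is an assumption made for illustration, and the names output_dir and pdf_path are hypothetical.

```python
from pathlib import Path

# Hypothetical destination: the current working directory.
# Any other folder works too, for example Path.home() / "Desktop".
output_dir = Path.cwd()

# Join the folder and the file name into one path
pdf_path = output_dir / "charts.pdf"

print(pdf_path)         # full path to the PDF file
print(pdf_path.suffix)  # the file extension
```

PdfPages accepts a path object like pdf_path in place of a string, so it can be passed in directly.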
Putting all the components together:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
data_1 = {
"unemployment_rate": [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
"index_price": [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
}
df_1 = pd.DataFrame(data_1)
data_2 = {
"year": [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
"unemployment_rate": [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3],
}
df_2 = pd.DataFrame(data_2)
with PdfPages(r"C:\Users\Ron\Desktop\charts.pdf") as export_pdf:
    # Page 1: scatter chart
    plt.scatter(df_1["unemployment_rate"], df_1["index_price"], color="green")
    plt.title("Unemployment Rate Vs Index Price", fontsize=14)
    plt.xlabel("Unemployment Rate", fontsize=14)
    plt.ylabel("Index Price", fontsize=14)
    plt.grid(True)
    export_pdf.savefig()  # add the current figure as a new page
    plt.close()  # close the figure so the next chart starts clean

    # Page 2: line chart
    plt.plot(df_2["year"], df_2["unemployment_rate"], color="red", marker="o")
    plt.title("Unemployment Rate Vs Year", fontsize=14)
    plt.xlabel("Year", fontsize=14)
    plt.ylabel("Unemployment Rate", fontsize=14)
    plt.grid(True)
    export_pdf.savefig()
    plt.close()
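Once you run the above code, a new PDF file (called charts) will be created at your specified location. Each call to savefig adds one page, so the file contains the scatter chart on the first page and the line chart on the second.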
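To confirm that every chart reached the file, the number of pages can be read from the PdfPages object before it closes. The sketch below is one possible variation rather than the guide’s own code: it uses explicit figure objects from plt.subplots, writes to a hypothetical charts_check.pdf file in the system temporary folder, and selects the non-interactive Agg backend. All three are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

df_1 = pd.DataFrame({
    "unemployment_rate": [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
    "index_price": [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
})
df_2 = pd.DataFrame({
    "year": [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
    "unemployment_rate": [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3],
})

# Hypothetical output location: a file in the system temporary folder
pdf_path = Path(tempfile.gettempdir()) / "charts_check.pdf"

with PdfPages(pdf_path) as export_pdf:
    # Page 1: scatter chart drawn on its own figure
    fig_1, ax_1 = plt.subplots()
    ax_1.scatter(df_1["unemployment_rate"], df_1["index_price"], color="green")
    ax_1.set_title("Unemployment Rate Vs Index Price", fontsize=14)
    ax_1.set_xlabel("Unemployment Rate", fontsize=14)
    ax_1.set_ylabel("Index Price", fontsize=14)
    ax_1.grid(True)
    export_pdf.savefig(fig_1)  # pass the figure explicitly
    plt.close(fig_1)

    # Page 2: line chart drawn on its own figure
    fig_2, ax_2 = plt.subplots()
    ax_2.plot(df_2["year"], df_2["unemployment_rate"], color="red", marker="o")
    ax_2.set_title("Unemployment Rate Vs Year", fontsize=14)
    ax_2.set_xlabel("Year", fontsize=14)
    ax_2.set_ylabel("Unemployment Rate", fontsize=14)
    ax_2.grid(True)
    export_pdf.savefig(fig_2)
    plt.close(fig_2)

    # Read the page count while the PDF is still open
    page_count = export_pdf.get_pagecount()

print(page_count)
print(pdf_path.exists())
```

Passing each figure to savefig explicitly avoids relying on whichever figure happens to be current, which can help when a script builds several charts in a row.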