In this guide, you’ll see how to plot a DataFrame using Pandas.
More specifically, you’ll see the complete steps to plot:
- Scatter diagram
- Line chart
- Bar chart
- Pie chart
Plot a Scatter Diagram using Pandas
Scatter plots are used to depict a relationship between two variables. Here are the steps to plot a scatter diagram using Pandas.
Step 1: Prepare the data
To start, prepare the data for your scatter diagram.
For example, the following data will be used to create the scatter diagram. This data captures the relationship between two variables:
unemployment_rate | index_price |
6.1 | 1500 |
5.8 | 1520 |
5.7 | 1525 |
5.7 | 1523 |
5.8 | 1515 |
5.6 | 1540 |
5.5 | 1545 |
5.3 | 1560 |
5.2 | 1555 |
5.2 | 1565 |
Step 2: Create the DataFrame
Once you have your data ready, you can proceed to create the DataFrame in Python. For our example, the DataFrame would look like this:
import pandas as pd

data = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
        'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
        }
df = pd.DataFrame(data)

print(df)
Run the code in Python, and you’ll get the following DataFrame:
unemployment_rate index_price
0 6.1 1500
1 5.8 1520
2 5.7 1525
3 5.7 1523
4 5.8 1515
5 5.6 1540
6 5.5 1545
7 5.3 1560
8 5.2 1555
9 5.2 1565
Step 3: Plot the DataFrame using Pandas
Finally, you can plot the DataFrame by adding the following syntax:
df.plot(x='unemployment_rate', y='index_price', kind='scatter')
Notice that you can specify the type of chart by setting kind='scatter'.
You’ll also need to add the Matplotlib syntax to show the plot (ensure that the Matplotlib package is installed in Python):
- import matplotlib.pyplot as plt
- plt.show()
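As a quick way to confirm that the packages are available before running the plotting code, the sketch below checks whether they can be found in the current environment. The helper name is_installed is made up for this illustration and is not part of Pandas or Matplotlib; it relies only on importlib.util.find_spec from the Python standard library.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the top-level package can be found in this environment."""
    return importlib.util.find_spec(package_name) is not None


# Report the status of the two packages used in this guide.
for name in ("pandas", "matplotlib"):
    if is_installed(name):
        print(f"{name}: installed")
    else:
        print(f"{name}: missing (one common fix is 'pip install {name}')")
```

If either package is reported as missing, installing it with your usual package manager should be enough for the examples in this guide.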
Putting everything together:
import pandas as pd
import matplotlib.pyplot as plt

data = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
        'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
        }
df = pd.DataFrame(data)

df.plot(x='unemployment_rate', y='index_price', kind='scatter')
plt.show()
Once you run the above code, you’ll get the scatter diagram.
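The kind argument is one of two ways to choose the chart type. As a minimal sketch, the same scatter diagram can also be requested through the df.plot.scatter accessor, and because the plotting call returns a Matplotlib Axes object, a title can be attached at the same time. The title text and the variable name ax are illustrative choices, not something taken from the example above.

```python
import pandas as pd

data = {'unemployment_rate': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
        'index_price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
        }
df = pd.DataFrame(data)

# Accessor form of df.plot(x=..., y=..., kind='scatter'); returns a Matplotlib Axes.
ax = df.plot.scatter(
    x='unemployment_rate',
    y='index_price',
    title='Index price vs. unemployment rate',  # illustrative title
)

# Pandas labels the axes with the column names.
print(ax.get_xlabel(), ax.get_ylabel())

# To display the figure, import matplotlib.pyplot as plt and call plt.show().
```

Both forms should produce the same chart, so the choice between them is mostly a matter of taste.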
Plot a Line Chart using Pandas
Line charts are often used to display trends over time. Let’s now see the steps to plot a line chart using Pandas.
Step 1: Prepare the data
To start, prepare your data for the line chart. Here is an example of a dataset:
year | unemployment_rate |
1920 | 9.8 |
1930 | 12 |
1940 | 8 |
1950 | 7.2 |
1960 | 6.9 |
1970 | 7 |
1980 | 6.5 |
1990 | 6.2 |
2000 | 5.5 |
2010 | 6.3 |
Step 2: Create the DataFrame
Now create the DataFrame based on the above data:
import pandas as pd

data = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
        'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
        }
df = pd.DataFrame(data)

print(df)
This is what the DataFrame looks like:
year unemployment_rate
0 1920 9.8
1 1930 12.0
2 1940 8.0
3 1950 7.2
4 1960 6.9
5 1970 7.0
6 1980 6.5
7 1990 6.2
8 2000 5.5
9 2010 6.3
Step 3: Plot the DataFrame using Pandas
Finally, plot the DataFrame by adding the following syntax:
df.plot(x='year', y='unemployment_rate', kind='line')
You’ll notice that kind is now set to 'line' in order to plot the line chart.
Here is the complete Python code:
import pandas as pd
import matplotlib.pyplot as plt

data = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
        'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
        }
df = pd.DataFrame(data)

df.plot(x='year', y='unemployment_rate', kind='line')
plt.show()
And once you run the code, you’ll get the line chart.
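Since a line chart is meant to expose a trend, it can help to mark the individual observations and to look up the extreme value directly. The following is a small sketch along those lines; the marker style and the variable names ax and peak_year are illustrative additions, not part of the original example.

```python
import pandas as pd

data = {'year': [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010],
        'unemployment_rate': [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
        }
df = pd.DataFrame(data)

# Same line chart as above, with a circle drawn at each data point.
ax = df.plot(x='year', y='unemployment_rate', kind='line', marker='o')

# idxmax() gives the row label of the highest rate; use it to look up the year.
peak_year = df.loc[df['unemployment_rate'].idxmax(), 'year']
print(peak_year)  # 1930

# To display the figure, import matplotlib.pyplot as plt and call plt.show().
```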
Plot a Bar Chart using Pandas
Bar charts are used to display categorical data. Let’s now see how to plot a bar chart using Pandas.
Step 1: Prepare your data
As before, you’ll need to prepare your data. Here, the following dataset will be used to create the bar chart:
country | gdp_per_capita |
A | 45000 |
B | 42000 |
C | 52000 |
D | 49000 |
E | 47000 |
Step 2: Create the DataFrame
Create the DataFrame as follows:
import pandas as pd

data = {'country': ['A', 'B', 'C', 'D', 'E'],
        'gdp_per_capita': [45000, 42000, 52000, 49000, 47000]
        }
df = pd.DataFrame(data)

print(df)
You’ll then get this DataFrame:
country gdp_per_capita
0 A 45000
1 B 42000
2 C 52000
3 D 49000
4 E 47000
Step 3: Plot the DataFrame using Pandas
Finally, add the following syntax to the Python code:
df.plot(x='country', y='gdp_per_capita', kind='bar')
In this case, set kind='bar' to plot the bar chart.
And the complete Python code is:
import pandas as pd
import matplotlib.pyplot as plt

data = {'country': ['A', 'B', 'C', 'D', 'E'],
        'gdp_per_capita': [45000, 42000, 52000, 49000, 47000]
        }
df = pd.DataFrame(data)

df.plot(x='country', y='gdp_per_capita', kind='bar')
plt.show()
Run the code and you’ll get the bar chart.
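Bars are often easier to compare when they are ordered by value. As a sketch, the DataFrame can be sorted with sort_values before plotting. The rot=0 argument, which keeps the country labels horizontal, and the variable names df_sorted and ax are choices made for this illustration.

```python
import pandas as pd

data = {'country': ['A', 'B', 'C', 'D', 'E'],
        'gdp_per_capita': [45000, 42000, 52000, 49000, 47000]
        }
df = pd.DataFrame(data)

# Order the rows from highest to lowest GDP per capita.
df_sorted = df.sort_values('gdp_per_capita', ascending=False)
print(df_sorted['country'].tolist())  # ['C', 'D', 'E', 'A', 'B']

# Plot the sorted data; rot=0 keeps the x-axis labels horizontal.
ax = df_sorted.plot(x='country', y='gdp_per_capita', kind='bar', rot=0)

# To display the figure, import matplotlib.pyplot as plt and call plt.show().
```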
Plot a Pie Chart using Pandas
Step 1: Prepare your data
For demonstration purposes, the following data about the status of tasks was prepared:
tasks_pending | 300 |
tasks_ongoing | 500 |
tasks_completed | 700 |
The goal is to create a pie chart based on the above data.
Step 2: Create the DataFrame
You can then create the DataFrame using this code:
import pandas as pd

data = {'tasks': [300, 500, 700]}
df = pd.DataFrame(data, index=['tasks_pending', 'tasks_ongoing', 'tasks_completed'])

print(df)
You’ll now see this DataFrame:
tasks
tasks_pending 300
tasks_ongoing 500
tasks_completed 700
Step 3: Plot the DataFrame using Pandas
Finally, plot the DataFrame by adding the following syntax:
df.plot.pie(y='tasks', figsize=(5, 5), autopct='%1.1f%%', startangle=90)
And here is the complete Python code:
import pandas as pd
import matplotlib.pyplot as plt

data = {'tasks': [300, 500, 700]}
df = pd.DataFrame(data, index=['tasks_pending', 'tasks_ongoing', 'tasks_completed'])

df.plot.pie(y='tasks', figsize=(5, 5), autopct='%1.1f%%', startangle=90)
plt.show()
Once you run the code, you’ll get the pie chart.
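The autopct='%1.1f%%' argument formats each slice’s share of the total with one decimal place. As a rough check on what the labels should read, the sketch below computes those shares directly with Pandas; the variable names shares and labels are illustrative.

```python
import pandas as pd

data = {'tasks': [300, 500, 700]}
df = pd.DataFrame(data, index=['tasks_pending', 'tasks_ongoing', 'tasks_completed'])

# Each slice's share of the total, in percent.
shares = df['tasks'] / df['tasks'].sum() * 100

# Apply the same format string that is passed to autopct.
labels = ['%1.1f%%' % pct for pct in shares]
print(labels)  # ['20.0%', '33.3%', '46.7%']
```

The three values add up to 100%, which is a simple sanity check that the slices cover the whole pie.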
You just reviewed a few examples of plotting DataFrames using Pandas. A good additional source for plotting DataFrames is the Pandas Documentation.