In this short guide, you’ll see two different methods to create a Pandas DataFrame:
- By typing the values in Python itself to create the DataFrame
- By importing the values from a file (such as a CSV file), and then creating the DataFrame in Python based on the values imported
Method 1: typing the values in Python to create a Pandas DataFrame
To create a Pandas DataFrame in Python, you can follow this generic template:
import pandas as pd

data = {'first_column': ['first_value', 'second_value', ...],
        'second_column': ['first_value', 'second_value', ...],
        ....
        }

df = pd.DataFrame(data)

print(df)
Note that you don’t need to use quotes around numeric values (unless you wish to capture those values as strings).
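To illustrate the note about quotes, here is a minimal sketch that stores the same values once as numbers and once as strings. The column names price_as_number and price_as_text are hypothetical and used only for this illustration:

```python
import pandas as pd

# Hypothetical column names, chosen only for this illustration
data = {'price_as_number': [1200, 150, 300],        # no quotes: stored as integers
        'price_as_text': ['1200', '150', '300']}    # quotes: stored as strings

df = pd.DataFrame(data)

# Check which columns pandas treats as numeric
number_is_numeric = pd.api.types.is_numeric_dtype(df['price_as_number'])
text_is_numeric = pd.api.types.is_numeric_dtype(df['price_as_text'])

# Arithmetic works as expected on the numeric column
numeric_total = df['price_as_number'].sum()

print(number_is_numeric)
print(text_is_numeric)
print(numeric_total)
```

Only the unquoted column is treated as numeric, so calculations such as a sum or a maximum should generally be done on values entered without quotes.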
Now let’s see how to apply the above template using a simple example.
To start, let’s say that you have the following data about products, and that you want to capture that data in Python using a Pandas DataFrame:
product_name | price |
laptop | 1200 |
printer | 150 |
tablet | 300 |
desk | 450 |
chair | 200 |
You may then use the code below in order to create the DataFrame for our example:
import pandas as pd

data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],
        'price': [1200, 150, 300, 450, 200]
        }

df = pd.DataFrame(data)

print(df)
Run the code in Python, and you’ll get the following DataFrame:
  product_name  price
0       laptop   1200
1      printer    150
2       tablet    300
3         desk    450
4        chair    200
Notice that each row is represented by a number (also known as the index) starting from 0. Alternatively, you may assign another value/name to represent each row.
For example, in the code below, the argument index=['product_1', 'product_2', 'product_3', 'product_4', 'product_5'] was added:
import pandas as pd

data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],
        'price': [1200, 150, 300, 450, 200]
        }

df = pd.DataFrame(data, index=['product_1', 'product_2', 'product_3', 'product_4', 'product_5'])

print(df)
You’ll now see the newly assigned index in the first column:
           product_name  price
product_1       laptop   1200
product_2      printer    150
product_3       tablet    300
product_4         desk    450
product_5        chair    200
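One reason you may want a named index is that rows can then be looked up by label. As a minimal sketch, assuming the same data and index as above, you could use df.loc to retrieve a row or a single value (the variable names row and tablet_price are just for illustration):

```python
import pandas as pd

data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],
        'price': [1200, 150, 300, 450, 200]
        }

df = pd.DataFrame(data, index=['product_1', 'product_2', 'product_3', 'product_4', 'product_5'])

# Select a whole row by its index label
row = df.loc['product_2']

# Select a single value by index label and column name
tablet_price = df.loc['product_3', 'price']

print(row)
print(tablet_price)
```

Here row holds the printer's name and price, and tablet_price holds the price of the tablet.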
Let’s now review the second method of importing the values into Python to create the DataFrame.
Method 2: importing values from a CSV file to create a Pandas DataFrame
You may use the following template to import a CSV file into Python in order to create your DataFrame:
import pandas as pd

data = pd.read_csv(r'Path where the CSV file is stored\File name.csv')

df = pd.DataFrame(data)

print(df)
Let’s say that you have the following data stored in a CSV file (where the CSV file name is ‘products’):
product_name | price |
laptop | 1200 |
printer | 150 |
tablet | 300 |
desk | 450 |
chair | 200 |
In the Python code below, you’ll need to change the path name to reflect the location where the CSV file is stored on your computer.
For example, let’s suppose that the CSV file is stored under the following path:
‘C:\Users\Ron\Desktop\products.csv’
Here is the full Python code for our example:
import pandas as pd

data = pd.read_csv(r'C:\Users\Ron\Desktop\products.csv')

df = pd.DataFrame(data)

print(df)
As before, you’ll get the same Pandas DataFrame in Python:
  product_name  price
0       laptop   1200
1      printer    150
2       tablet    300
3         desk    450
4        chair    200
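If you'd like to try the import without creating a file first, here is a minimal sketch that feeds the same values to read_csv from an in-memory string. The csv_text variable is an assumption standing in for the contents of products.csv, and it assumes the file is comma-separated with a header row:

```python
import io
import pandas as pd

# In-memory stand-in for products.csv (an assumption so the example runs anywhere)
csv_text = """product_name,price
laptop,1200
printer,150
tablet,300
desk,450
chair,200
"""

# read_csv accepts a file path or a file-like object and returns a DataFrame
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(type(df))
```

Note that read_csv already returns a DataFrame, so wrapping its result in pd.DataFrame is optional.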
You can also create the same DataFrame by importing an Excel file into Python using Pandas.
Find the maximum value in the DataFrame
Once you have your values in the DataFrame, you can perform a wide variety of operations on them. For example, you may calculate statistics using Pandas.
For instance, let’s say that you want to find the maximum price among all the products within the DataFrame.
To get the maximum price for our example, you’ll need to add the following portion to the Python code (and then print the results):
max_price = df['price'].max()
Here is the complete Python code:
import pandas as pd

data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],
        'price': [1200, 150, 300, 450, 200]
        }

df = pd.DataFrame(data)

max_price = df['price'].max()

print(max_price)
Once you run the code, you’ll get the value of 1200, which is indeed the maximum price:
1200
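The same pattern applies to other statistics. As a minimal sketch, assuming the same DataFrame, you could also get the minimum and the average price, and use idxmax to find which product has the maximum price (the variable names below are just for illustration):

```python
import pandas as pd

data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],
        'price': [1200, 150, 300, 450, 200]
        }

df = pd.DataFrame(data)

min_price = df['price'].min()
mean_price = df['price'].mean()

# idxmax returns the index label of the row holding the maximum price
most_expensive = df.loc[df['price'].idxmax(), 'product_name']

print(min_price)
print(mean_price)
print(most_expensive)
```

For this data, the minimum price is 150, the average price is 460.0, and the most expensive product is the laptop.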
You may check the Pandas Documentation to learn more about creating a DataFrame.