How to Create Pandas DataFrame in Python

Need to create a pandas DataFrame in Python?

If so, there are a few methods that you can apply to accomplish this task.

In this tutorial, I’ll show you two different methods to create a pandas DataFrame:

  • Type the values in Python itself to create your DataFrame
  • Import values from a file (such as an Excel file), and then create your DataFrame in Python based on the values imported

Method 1: type values in Python to create pandas DataFrame

To create a pandas DataFrame in Python, you can follow this generic structure:

 

Data = {'First Field Name': ['First value', 'Second value', ...],
        'Second Field Name': ['First value', 'Second value', ...],
        ...
        }

df = DataFrame(Data, columns=['First Field Name', 'Second Field Name', ...])

 

Note that numerical values do not need quotation marks (unless you wish to capture those values as strings).
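To see the difference in practice, here is a minimal sketch that stores the same values with and without quotation marks and asks pandas whether each column is numeric. The names numeric_df and text_df are made up for this illustration only:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Values typed without quotation marks are stored as numbers
numeric_df = pd.DataFrame({'Price': [22000, 25000]})

# The same values typed with quotation marks are stored as strings
text_df = pd.DataFrame({'Price': ['22000', '25000']})

print(is_numeric_dtype(numeric_df['Price']))  # True
print(is_numeric_dtype(text_df['Price']))     # False
```

Only the numeric column can be used directly for calculations such as a maximum or an average.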

Now I’m going to show you how to apply the above structure using a simple example. To start, let’s say that you have the following data about cars, and that you want to capture that data in Python using a pandas DataFrame:

 

Brand            Price
Honda Civic      22,000
Toyota Corolla   25,000
Ford Focus       27,000
Audi A4          35,000

 

This is what the full Python code looks like for our Cars example:

 

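from pandas import DataFrame

# Each key is a column name; each list holds that column's values
Cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]
        }

# columns sets which columns to include and in what order
df = DataFrame(Cars, columns=['Brand', 'Price'])

print(df)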

 

Run the Python code, and you’ll get the following DataFrame:

 

            Brand  Price
0     Honda Civic  22000
1  Toyota Corolla  25000
2      Ford Focus  27000
3         Audi A4  35000
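The columns argument used in the code above is optional when the data comes from a dictionary. It mainly controls which columns are kept and in what order. Here is a short sketch that reuses the same Cars data; the names df_default and df_reordered are only for illustration:

```python
from pandas import DataFrame

Cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]
        }

# Without columns, the DataFrame follows the order of the dictionary keys
df_default = DataFrame(Cars)

# With columns, you choose the order (or a subset) of the columns
df_reordered = DataFrame(Cars, columns=['Price', 'Brand'])

print(list(df_default.columns))    # ['Brand', 'Price']
print(list(df_reordered.columns))  # ['Price', 'Brand']
```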

 

Let’s now review the second method: importing the values into Python to create the DataFrame.

Method 2: import values from an Excel file to create pandas DataFrame

Let’s now assume that you have the following table stored in an Excel file (where the Excel file name is ‘Cars’):

 

Brand            Price
Honda Civic      22,000
Toyota Corolla   25,000
Ford Focus       27,000
Audi A4          35,000

 

You can then use the pandas read_excel function to import the Excel file into Python.

In the Python code below, you’ll need to change the path name to the location where your Excel file is stored on your computer. In my case, the Excel file is saved on my desktop, under the following path:

 ‘C:\Users\Doron E\Desktop\Cars.xlsx’
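A Windows path contains backslashes, which Python normally treats as the start of an escape sequence. That is why such a path is usually written as a raw string (with an r prefix) or with doubled backslashes. The sketch below only inspects the text of the path and does not open any file. PureWindowsPath comes from Python's standard pathlib module and is used here just to pull the path apart:

```python
from pathlib import PureWindowsPath

# Raw string: the backslashes are kept exactly as typed
raw_path = r'C:\Users\Doron E\Desktop\Cars.xlsx'

# Equivalent spelling with each backslash doubled
escaped_path = 'C:\\Users\\Doron E\\Desktop\\Cars.xlsx'

print(raw_path == escaped_path)  # True

# Inspect the pieces of the path without touching the file system
path = PureWindowsPath(raw_path)
print(path.name)    # Cars.xlsx
print(path.suffix)  # .xlsx
```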

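Once you have imported the data into Python, you can assign those values to the DataFrame. Note that read_excel already returns a DataFrame, so the second step in the code simply keeps the Brand and Price columns. Here is the full Python code for our example: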

 

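import pandas as pd

# The r prefix keeps the backslashes in the Windows path as-is
Cars = pd.read_excel(r'C:\Users\Doron E\Desktop\Cars.xlsx')
# Keep only the Brand and Price columns
df = pd.DataFrame(Cars, columns=['Brand', 'Price'])

print(df)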

 

As before, you’ll get the same pandas DataFrame in Python:

 

            Brand  Price
0     Honda Civic  22000
1  Toyota Corolla  25000
2      Ford Focus  27000
3         Audi A4  35000

 

You can create the same DataFrame from a CSV file instead of an Excel file, using read_csv in place of read_excel.
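As a rough sketch of the CSV route, the example below first writes a small Cars.csv file to a temporary folder (so that it runs on any computer) and then imports it with pandas. The temporary folder, the file name, and the variable df_csv are assumptions made for this illustration; in practice you would point read_csv at your own file:

```python
import os
import tempfile

import pandas as pd

# Write a small CSV file to a temporary folder so the example is self-contained
csv_text = (
    "Brand,Price\n"
    "Honda Civic,22000\n"
    "Toyota Corolla,25000\n"
    "Ford Focus,27000\n"
    "Audi A4,35000\n"
)
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, 'Cars.csv')
with open(csv_path, 'w') as f:
    f.write(csv_text)

# read_csv returns a DataFrame directly
df_csv = pd.read_csv(csv_path)

print(df_csv)
```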

Get the maximum value from the DataFrame

Once you have your values in the DataFrame, you can perform a wide variety of operations. For example, you can calculate statistics using pandas.

For instance, let’s say that you want to find the maximum price among all the Cars within the DataFrame.

Of course, you can find this value just by looking at such a small dataset, but the method presented below also works for much larger datasets.

To get the maximum price in our Cars example, you’ll need to add the following portion to the Python code (and then print the results):

 

max1 = df['Price'].max()

 

And the complete Python code is:

 

from pandas import DataFrame

Cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]
        }

df = DataFrame(Cars, columns=['Brand', 'Price'])

# Maximum value in the Price column
max1 = df['Price'].max()

print(max1)

 

Once you run the code, you’ll get the value of 35000 (that is, 35,000), which is indeed the maximum price!
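The same pattern works for other statistics. As a small sketch on the same Cars data, the code below gets the minimum and average price, and uses idxmax to find which brand has the maximum price. The variable names min_price, mean_price and most_expensive are only for illustration:

```python
from pandas import DataFrame

Cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]
        }

df = DataFrame(Cars, columns=['Brand', 'Price'])

min_price = df['Price'].min()
mean_price = df['Price'].mean()

# idxmax returns the row label of the maximum price,
# which is then used to look up the matching brand
most_expensive = df.loc[df['Price'].idxmax(), 'Brand']

print(min_price)       # 22000
print(mean_price)      # 27250.0
print(most_expensive)  # Audi A4
```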

You can check the pandas documentation to learn more about creating a pandas DataFrame.