# How to Get the Descriptive Statistics for Pandas DataFrame

Need to get the descriptive statistics for a pandas DataFrame?

If so, you can use the following template to get the descriptive statistics for a specific column in your DataFrame:

`df['DataFrame Column'].describe()`

Alternatively, you may use this template to get the descriptive statistics for the entire DataFrame:

`df.describe(include='all')`
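To illustrate why `include='all'` matters: by default, calling `describe()` on a DataFrame with mixed column types summarizes only the numeric columns. The following is a minimal sketch, assuming a small made-up DataFrame (the `name` and `score` columns are hypothetical and not part of this guide's data):

```python
import pandas as pd

# Hypothetical data, used only to illustrate the two calls
df = pd.DataFrame({
    'name': ['a', 'b', 'b'],
    'score': [1.0, 2.0, 3.0],
})

numeric_only = df.describe()              # default: numeric columns only
everything = df.describe(include='all')   # numeric and non-numeric columns

print(list(numeric_only.columns))
print(list(everything.columns))
```

Here the default call should report on `score` alone, while `include='all'` should also cover `name`.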

In the next section, I’ll show you the steps to derive the descriptive statistics using an example.

## Steps to Get the Descriptive Statistics for Pandas DataFrame

### Step 1: Collect the Data

To start, you’ll need to collect the data for your DataFrame. For example, I collected the following data about cars:

| Brand | Price | Year |
| --- | --- | --- |
| Honda Civic | 22000 | 2014 |
| Ford Focus | 27000 | 2015 |
| Toyota Corolla | 25000 | 2016 |
| Toyota Corolla | 29000 | 2017 |
| Audi A4 | 35000 | 2018 |

### Step 2: Create the DataFrame

Next, you’ll need to create the DataFrame based on the data collected.

For our example, the code to create the DataFrame is:

```python
from pandas import DataFrame

# Car data collected in Step 1
Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],
        'Price': [22000,27000,25000,29000,35000],
        'Year': [2014,2015,2016,2017,2018]
        }

# Build the DataFrame with an explicit column order
df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])
print(df)
```

Run the code in Python, and you’ll get a DataFrame with the three columns and five rows from Step 1.

### Step 3: Get the Descriptive Statistics for Pandas DataFrame

Once you have your DataFrame ready, you’ll be able to get the descriptive statistics using the template that you saw at the beginning of this guide:

```python
df['DataFrame Column'].describe()
```

Let’s say that you want to get the descriptive statistics for the ‘Price’ field, which contains numerical data. In that case, the syntax that you’ll need to apply is:

```python
df['Price'].describe()
```

So the complete Python code would look like this:

```python
from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],
        'Price': [22000,27000,25000,29000,35000],
        'Year': [2014,2015,2016,2017,2018]
        }

df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])

# Summary statistics for the numeric 'Price' column
stats_numeric = df['Price'].describe()
print(stats_numeric)
```

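Once you run the code, you’ll get the count, mean, standard deviation, minimum, quartiles (25%, 50%, 75%), and maximum for the ‘Price’ field.

You’ll notice that the output contains 6 decimal places. You can append `.astype(int)` to the code to get integer values instead. Keep in mind that `astype(int)` truncates the decimal part rather than rounding it.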
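Since `astype(int)` drops the fractional part, `round()` may be a better fit when a value such as the standard deviation should keep some precision. The following is a minimal sketch comparing the two on the ‘Price’ values from this guide; the variable names (`prices`, `truncated`, `rounded`) are illustrative only:

```python
import pandas as pd

# 'Price' values from the example data
prices = pd.Series([22000, 27000, 25000, 29000, 35000], name='Price')
stats = prices.describe()

truncated = stats.astype(int)   # drops everything after the decimal point
rounded = stats.round(2)        # keeps two decimal places

print(truncated['std'])
print(rounded['std'])
```

Which one to use depends on whether whole numbers are enough for your report.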

This is what the code would look like with `astype(int)` applied:

```python
from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],
        'Price': [22000,27000,25000,29000,35000],
        'Year': [2014,2015,2016,2017,2018]
        }

df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])

# Convert the summary statistics to integers (decimals are truncated)
stats_numeric = df['Price'].describe().astype(int)
print(stats_numeric)
```

Run the code, and you’ll get only integers.

## Descriptive Statistics for Categorical Data

So far, you have seen how to get the descriptive statistics for numerical data. The ‘Price’ field was used for that purpose.

Yet, you can also get the descriptive statistics for categorical data.

For instance, you can get some descriptive statistics for the ‘Brand’ field using this code:

```python
from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],
        'Price': [22000,27000,25000,29000,35000],
        'Year': [2014,2015,2016,2017,2018]
        }

df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])

# Summary statistics for the text-based 'Brand' column
stats_categorical = df['Brand'].describe()
print(stats_categorical)
```
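For a text column, `describe()` reports `count`, `unique`, `top`, and `freq` instead of the mean and quartiles. As a rough sketch of where those four values come from, they can be derived separately with `count()`, `nunique()`, and `value_counts()`. The variable names below are illustrative, and the data is the ‘Brand’ column from this guide:

```python
import pandas as pd

brands = pd.Series(
    ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    name='Brand',
)

counts = brands.value_counts()   # occurrences per brand, most frequent first

count = brands.count()           # number of non-missing values
unique = brands.nunique()        # number of distinct values
top = counts.index[0]            # most frequent value
freq = counts.iloc[0]            # how often the most frequent value occurs

print(count, unique, top, freq)
```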

And this is the result that you’ll get: a count of 5, 4 unique values, a top (most frequent) value of ‘Toyota Corolla’, and a freq of 2.

## Get the Descriptive Statistics for the Entire Pandas DataFrame

Finally, you may apply the following template to get the descriptive statistics for the entire DataFrame:

`df.describe(include='all')`

So the complete Python code would look like this:

```python
from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],
        'Price': [22000,27000,25000,29000,35000],
        'Year': [2014,2015,2016,2017,2018]
        }

df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])

# Summary statistics for every column, numeric and non-numeric
stats = df.describe(include='all')
print(stats)
```

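Run the code, and you’ll get one column of statistics for each DataFrame column. Statistics that don’t apply to a column’s data type (such as the mean of ‘Brand’ or the top value of ‘Price’) are shown as NaN.

## Breaking Down the Descriptive Statistics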

You can further break down the descriptive statistics into the following:

Count:

```python
df['DataFrame Column'].count()
```

Mean:

```python
df['DataFrame Column'].mean()
```

Standard deviation:

```python
df['DataFrame Column'].std()
```

Minimum:

```python
df['DataFrame Column'].min()
```

0.25 Quantile:

```python
df['DataFrame Column'].quantile(q=0.25)
```

0.50 Quantile (Median):

```python
df['DataFrame Column'].quantile(q=0.50)
```

0.75 Quantile:

```python
df['DataFrame Column'].quantile(q=0.75)
```

Maximum:

```python
df['DataFrame Column'].max()
```

For our example, `df['DataFrame Column']` is `df['Price']`.

Therefore, the full Python code for our example would look like this:

```python
from pandas import DataFrame

Cars = {'Brand': ['Honda Civic','Ford Focus','Toyota Corolla','Toyota Corolla','Audi A4'],
        'Price': [22000,27000,25000,29000,35000],
        'Year': [2014,2015,2016,2017,2018]
        }

df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])

# Each statistic reported by describe(), computed individually
count1 = df['Price'].count()
print(f'count: {count1}')

mean1 = df['Price'].mean()
print(f'mean: {mean1}')

std1 = df['Price'].std()
print(f'std: {std1}')

min1 = df['Price'].min()
print(f'min: {min1}')

quantile1 = df['Price'].quantile(q=0.25)
print(f'25%: {quantile1}')

quantile2 = df['Price'].quantile(q=0.50)
print(f'50%: {quantile2}')

quantile3 = df['Price'].quantile(q=0.75)
print(f'75%: {quantile3}')

max1 = df['Price'].max()
print(f'max: {max1}')
```

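Once you run the code in Python, you’ll get the following stats: a count of 5, a mean of 27600.0, a standard deviation of about 4878.52, a minimum of 22000, quantiles of 25000.0, 27000.0, and 29000.0, and a maximum of 35000.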
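As a cross-check, the individual statistics can be compared against `describe()`. The sketch below assumes the same ‘Price’ values and uses `Series.agg` and a list of quantiles to gather several statistics per call; the names `summary`, `quartiles`, and `described` are illustrative:

```python
import pandas as pd

# 'Price' values from the example data
prices = pd.Series([22000, 27000, 25000, 29000, 35000], name='Price')

# Several statistics in a single call
summary = prices.agg(['count', 'mean', 'std', 'min', 'max'])

# All three quartiles in a single call (indexed by 0.25, 0.50, 0.75)
quartiles = prices.quantile([0.25, 0.50, 0.75])

# Reference values from describe()
described = prices.describe()

print(summary)
print(quartiles)
print(described)
```

Both approaches should agree, so the choice mostly comes down to whether you need one number or the whole summary.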