# How to get the Descriptive Statistics for Pandas DataFrame

Need to get the descriptive statistics for a pandas DataFrame?

If so, you can use the following template to get the descriptive statistics for a field (column) of your DataFrame:

`DataFrame.describe(df['DataFrame Field'])`

In the next section, I’ll show you the steps to derive the descriptive statistics using an example.

## Steps to get the Descriptive Statistics for Pandas DataFrame

### Step 1: Collect the data

To start, you’ll need to collect the data for your DataFrame. For example, I collected the following data about cars:

| Brand | Price | Year |
| --- | --- | --- |
| Honda Civic | 22000 | 2014 |
| Ford Focus | 27000 | 2015 |
| Toyota Corolla | 25000 | 2016 |
| Toyota Corolla | 29000 | 2017 |
| Audi A4 | 35000 | 2018 |

### Step 2: Create the DataFrame

Next, you’ll need to create the DataFrame based on the data collected.

For our example, the code to create the DataFrame is:

```python
from pandas import DataFrame

Cars = {
    'Brand': ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    'Price': [22000, 27000, 25000, 29000, 35000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}

df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])
print(df)
```

Run the code in Python, and you’ll get a DataFrame with five rows (indexed 0 to 4) and three columns: Brand, Price, and Year.
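Before computing statistics, it can help to check which columns pandas treats as numeric, since numeric and text columns are summarized differently. The snippet below is a minimal sketch of that check; the variable names `cars` and `numeric_columns` are only illustrative.

```python
import pandas as pd

cars = {
    'Brand': ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    'Price': [22000, 27000, 25000, 29000, 35000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}
df = pd.DataFrame(cars)

# Show the data type that pandas inferred for each column.
print(df.dtypes)

# Keep only the names of the numeric columns (Brand holds text, so it is excluded).
numeric_columns = df.select_dtypes(include='number').columns.tolist()
print(numeric_columns)
```

In this example, ‘Price’ and ‘Year’ are the numeric fields, while ‘Brand’ holds text.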

### Step 3: Get the Descriptive Statistics for Pandas DataFrame

Once you have your DataFrame ready, you’ll be able to get the descriptive statistics using the template that we saw at the beginning of this post:

```python
DataFrame.describe(df['DataFrame Field'])
```

Let’s say that you want to get the descriptive statistics for the ‘Price’ field, which contains numerical data. In that case, the syntax that you’ll need to use is:

```python
DataFrame.describe(df['Price'])
```

And the complete Python code would look like this:

```python
from pandas import DataFrame

Cars = {
    'Brand': ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    'Price': [22000, 27000, 25000, 29000, 35000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}

df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])

stats_numeric = DataFrame.describe(df['Price'])
print(stats_numeric)
```

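Once you run the code, you’ll get the descriptive statistics for the ‘Price’ field: the count (5), mean (27600), standard deviation (about 4878.52), minimum (22000), the 25%, 50%, and 75% quantiles (25000, 27000, and 29000), and maximum (35000).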
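The same statistics can also be obtained by calling `describe()` directly on the column, which is the more common spelling in pandas code. The sketch below shows that form and how individual values might be pulled out of the result; the names `mean_price` and `max_price` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Brand': ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    'Price': [22000, 27000, 25000, 29000, 35000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

# Calling describe() on the column gives the same result as DataFrame.describe(df['Price']).
stats_numeric = df['Price'].describe()
print(stats_numeric)

# The result is a Series indexed by statistic name, so single values can be looked up by label.
mean_price = stats_numeric.loc['mean']
max_price = stats_numeric.loc['max']
print(mean_price, max_price)
```

Looking values up by label is handy when only one or two of the statistics are needed later in a script.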

You’ll notice that the output is displayed with 6 decimal places. You may then add `astype(int)` to the code to get integer values. Note that `astype(int)` truncates the decimal part rather than rounding it.

This is what the code would look like:

```python
from pandas import DataFrame

Cars = {
    'Brand': ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    'Price': [22000, 27000, 25000, 29000, 35000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}

df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])

stats_numeric = DataFrame.describe(df['Price']).astype(int)
print(stats_numeric)
```

Run the code, and you’ll get only integers. For example, the standard deviation is now displayed as 4878.
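If you would rather keep some precision than drop the decimals entirely, rounding is one alternative to casting. The sketch below compares the two approaches on the ‘Price’ statistics; the choice of two decimal places is just an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Brand': ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    'Price': [22000, 27000, 25000, 29000, 35000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

stats = df['Price'].describe()

# astype(int) drops everything after the decimal point.
truncated = stats.astype(int)

# round(2) keeps two decimal places instead.
rounded = stats.round(2)

print(truncated['std'], rounded['std'])
```

Here only the standard deviation differs between the two results, since the other statistics are already whole numbers.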

## Breaking down the Descriptive Statistics

You can further break down the descriptive statistics into the following measures:

| Measure | Python code |
| --- | --- |
| Count | `df['DataFrame Field'].count()` |
| Mean | `df['DataFrame Field'].mean()` |
| Standard deviation | `df['DataFrame Field'].std()` |
| Minimum | `df['DataFrame Field'].min()` |
| 0.25 Quantile | `df['DataFrame Field'].quantile(q=0.25)` |
| 0.50 Quantile (=Median) | `df['DataFrame Field'].quantile(q=0.50)` |
| 0.75 Quantile | `df['DataFrame Field'].quantile(q=0.75)` |
| Maximum | `df['DataFrame Field'].max()` |
| Median | `df['DataFrame Field'].median()` |
| Variance | `df['DataFrame Field'].var()` |
| Skewness | `df['DataFrame Field'].skew()` |
| Kurtosis | `df['DataFrame Field'].kurt()` |

For our example, `df['DataFrame Field']` is `df['Price']`.

Therefore, the full Python code for our example would look like this:

```python
from pandas import DataFrame

Cars = {
    'Brand': ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    'Price': [22000, 27000, 25000, 29000, 35000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}

df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])

count1 = df['Price'].count()
print('count: ' + str(count1))

mean1 = df['Price'].mean()
print('mean: ' + str(mean1))

std1 = df['Price'].std()
print('std: ' + str(std1))

min1 = df['Price'].min()
print('min: ' + str(min1))

quantile1 = df['Price'].quantile(q=0.25)
print('25%: ' + str(quantile1))

quantile2 = df['Price'].quantile(q=0.50)
print('50%: ' + str(quantile2))

quantile3 = df['Price'].quantile(q=0.75)
print('75%: ' + str(quantile3))

max1 = df['Price'].max()
print('max: ' + str(max1))

# median = 0.5 quantile
median1 = df['Price'].median()
print('median: ' + str(median1))

var1 = df['Price'].var()
print('var: ' + str(var1))

skew1 = df['Price'].skew()
print('skew: ' + str(skew1))

kurt1 = df['Price'].kurt()
print('kurt: ' + str(kurt1))
```

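Once you run the code in Python, you’ll get each statistic printed on its own line. The count, mean, std, min, quantile, and max values match the output of `describe` from Step 3.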
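When several of these measures are needed at once, one possible shortcut is to pass their names to `agg`, which calls each method and collects the results into a single Series. This is only a sketch; the list name `measures` and the selection of statistics are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Brand': ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    'Price': [22000, 27000, 25000, 29000, 35000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

# Each string is the name of a Series method from the table of measures.
measures = ['count', 'mean', 'std', 'min', 'median', 'max', 'var', 'skew', 'kurt']

# agg applies every listed method to the column and labels each result by name.
summary = df['Price'].agg(measures)
print(summary)
```

Quantiles other than the median take an argument, so they are still easiest to compute with a separate `quantile(q=...)` call.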

## Descriptive Statistics for Categorical data

So far we have seen how to get the descriptive statistics for numerical data. We used the ‘Price’ field for that purpose.

However, you can also get the descriptive statistics for categorical data.

For instance, you can get some descriptive statistics for the ‘Brand’ field using this code:

```python
from pandas import DataFrame

Cars = {
    'Brand': ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    'Price': [22000, 27000, 25000, 29000, 35000],
    'Year': [2014, 2015, 2016, 2017, 2018],
}

df = DataFrame(Cars, columns=['Brand', 'Price', 'Year'])

stats_categorical = DataFrame.describe(df['Brand'])
print(stats_categorical)
```

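For a text field, the result contains four measures: count (5), unique (4 distinct brands), top (Toyota Corolla, the most frequent value), and freq (2, the number of times that value appears).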
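Two related calls may be useful for categorical fields: `value_counts()` for the full frequency breakdown, and `describe(include='all')` to summarize numeric and text columns in one table. The snippet below is a minimal sketch; the names `brand_counts` and `full_stats` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Brand': ['Honda Civic', 'Ford Focus', 'Toyota Corolla', 'Toyota Corolla', 'Audi A4'],
    'Price': [22000, 27000, 25000, 29000, 35000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

# Count how many times each brand appears, most frequent first.
brand_counts = df['Brand'].value_counts()
print(brand_counts)

# Describe every column at once; measures that do not apply to a column show up as NaN.
full_stats = df.describe(include='all')
print(full_stats)
```

In the combined table, rows such as unique, top, and freq are filled in only for ‘Brand’, while rows such as mean and std are filled in only for ‘Price’ and ‘Year’.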