How to Sort Pandas DataFrame (with examples)

You may use df.sort_values in order to sort Pandas DataFrame.

In this short guide, you’ll see 4 examples of sorting:

  1. A column in an ascending order
  2. A column in a descending order
  3. By multiple columns – Case 1
  4. By multiple columns – Case 2

To start with a simple example, let’s say that you have the following data about cars:

BrandPriceYear
HH220002015
TT250002013
FF270002018
AA350002018

You can then capture that data in Python by creating the following DataFrame:

import pandas as pd
 
data = {'Brand': ['HH', 'TT', 'FF', 'AA'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]
        }
 
df = pd.DataFrame(data)

print(df)

And if you run the above Python code, you’ll get the following DataFrame:

  Brand  Price  Year
0    HH  22000  2015
1    TT  25000  2013
2    FF  27000  2018
3    AA  35000  2018

Next, you’ll see how to sort that DataFrame using 4 different examples.

Example 1: Sort Pandas DataFrame in an ascending order

Let’s say that you want to sort the DataFrame, such that the Brand will be displayed in an ascending order. In that case, you’ll need to add the following syntax to the code:

df.sort_values(by=['Brand'], inplace=True)

Note that unless specified, the values will be sorted in an ascending order by default.

The full Python code would look like this:

import pandas as pd
 
data = {'Brand': ['HH', 'TT', 'FF', 'AA'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]
        }
 
df = pd.DataFrame(data)

# Sort Brand in ascending order
df.sort_values(by=['Brand'], inplace=True)

print(df)

When you run the code, you’ll notice that the Brand will indeed get sorted in an ascending order, where ‘AA’ would be the first record, while ‘TT’ would be the last:

  Brand  Price  Year
3    AA  35000  2018
2    FF  27000  2018
0    HH  22000  2015
1    TT  25000  2013

Example 2: Sort Pandas DataFrame in a descending order

Alternatively, you can sort the Brand column in a descending order. To do that, simply add the condition of ascending=False in the following manner:

df.sort_values(by=['Brand'], inplace=True, ascending=False)

And the complete Python code would be:

import pandas as pd
 
data = {'Brand': ['HH', 'TT', 'FF', 'AA'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]
        }
 
df = pd.DataFrame(data)

# Sort Brand in descending order
df.sort_values(by=['Brand'], inplace=True, ascending=False)

print(df)

You’ll now notice that ‘TT’ would be the first record, while ‘AA’ would be the last (as you would expect to get when applying a descending order for our sample):

  Brand  Price  Year
1    TT  25000  2013
0    HH  22000  2015
2    FF  27000  2018
3    AA  35000  2018

Example 3: Sort by multiple columns – case 1

But what if you want to sort by multiple columns?

In that case, you may use the following template to sort by multiple columns:

df.sort_values(by=['First Column', 'Second Column',...], inplace=True)

Suppose that you want to sort by both the ‘Year’ and the ‘Price.’ Since you have two records where the Year is 2018 (i.e., for the ‘FF’ and the ‘AA’ brands), then sorting by a second column – the ‘Price’ column –  would be useful:

 df.sort_values(by=['Year', 'Price'], inplace=True)

Here is the Python code that you may use:

import pandas as pd
 
data = {'Brand': ['HH', 'TT', 'FF', 'AA'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]
        }
 
df = pd.DataFrame(data)

# Sort by multiple columns: Year and Price
df.sort_values(by=['Year', 'Price'], inplace=True)

print(df)

Notice that all the records are now sorted by both the year and the price in an ascending order, so ‘FF’ would appear before ‘AA’:

  Brand  Price  Year
1    TT  25000  2013
0    HH  22000  2015
2    FF  27000  2018
3    AA  35000  2018

Also note that the ‘Year’ column takes the priority when performing the sorting, as it was placed in the df.sort_values before the ‘Price’ column.

Example 4: Sort by multiple columns – case 2

Finally, let’s sort by the columns of ‘Year’ and ‘Brand’ as follows:

df.sort_values(by=['Year', 'Brand'], inplace=True)

The complete Python code would look like this:

import pandas as pd
 
data = {'Brand': ['HH', 'TT', 'FF', 'AA'],
        'Price': [22000, 25000, 27000, 35000],
        'Year': [2015, 2013, 2018, 2018]
        }
 
df = pd.DataFrame(data)

# Sort by multiple columns: Year and Brand
df.sort_values(by=['Year', 'Brand'], inplace=True)

print(df)

You’ll now see that all the records are sorted by both the year and the brand in an ascending order, so this time ‘AA’ would appear prior to ‘FF’:

  Brand  Price  Year
1    TT  25000  2013
0    HH  22000  2015
3    AA  35000  2018
2    FF  27000  2018

You may want to check the Pandas Documentation to learn more about sorting values in Pandas DataFrame.

Leave a Comment