How to Filter Pandas DataFrame Based on Index

Here is the syntax that you can use to filter Pandas DataFrame based on the index:

df = df.filter(like = 'index to keep', axis=0)

Let’s review an example to see how to apply the above syntax in practice.

The Example

Suppose that you created the DataFrame below:

import pandas as pd

data = {'Product': ['Computer','Printer','Monitor','Desk','Phone','Tablet','Scanner'],
          'Price': [900,200,300,450,150,250,150]
        }

df = pd.DataFrame(data, columns = ['Product','Price'])

print(df)

As you can see in red, the index values are located on the left, starting from 0 and ending at 6:

How to Filter Pandas DataFrame Based on Index

Filter Pandas DataFrame Based on the Index

Let’s say that you want to select the row with the index of 2 (for the ‘Monitor’ product) while filtering out all the other rows. In that case, simply add the following syntax to the original code:

df = df.filter(like = '2', axis=0)

So the complete Python code to keep the row with the index of 2 is:

import pandas as pd

data = {'Product': ['Computer','Printer','Monitor','Desk','Phone','Tablet','Scanner'],
          'Price': [900,200,300,450,150,250,150]
        }

df = pd.DataFrame(data, columns = ['Product','Price'])

df = df.filter(like = '2', axis=0)

print(df)

Run the code, and you’ll notice that only the row with the index of 2 is kept, while all the other rows are filtered out:

Row in a DataFrame

Filter Pandas DataFrame for a Non-numeric Index

Say that you assigned a non-numeric index to the DataFrame:

index = ['Item_A','Item_B','Item_C','Item_D','Item_E','Item_F','Item_G']

The new DataFrame would look as follows:

import pandas as pd

data = {'Product': ['Computer','Printer','Monitor','Desk','Phone','Tablet','Scanner'],
          'Price': [900,200,300,450,150,250,150]
        }

df = pd.DataFrame(data, columns = ['Product','Price'], index = ['Item_A','Item_B','Item_C','Item_D','Item_E','Item_F','Item_G'])

print(df)

You’ll now see the non-numeric index on the left side:

Filter Pandas DataFrame Based on Index

For illustration purposes, let’s keep the row with the index of ‘Item_C‘ while filtering out all the other items:

import pandas as pd

data = {'Product': ['Computer','Printer','Monitor','Desk','Phone','Tablet','Scanner'],
          'Price': [900,200,300,450,150,250,150]
        }

df = pd.DataFrame(data, columns = ['Product','Price'], index = ['Item_A','Item_B','Item_C','Item_D','Item_E','Item_F','Item_G'])

df = df.filter(like = 'Item_C', axis=0)

print(df)

You’ll now get only the row with the index of Item_C:

Select a row

Alternatively, you could select the same row with the index of ‘Item_C’ by setting like = ‘C’ (as the letter ‘C’ doesn’t appear in any of the other index values):

import pandas as pd

data = {'Product': ['Computer','Printer','Monitor','Desk','Phone','Tablet','Scanner'],
          'Price': [900,200,300,450,150,250,150]
        }

df = pd.DataFrame(data, columns = ['Product','Price'], index = ['Item_A','Item_B','Item_C','Item_D','Item_E','Item_F','Item_G'])

df = df.filter(like = 'C', axis=0)

print(df)

You’ll get the same row with the index of Item_C:

Select a row in a dataframe

Select Multiple Index values

What if you’d like to select multiple index values?

For example, let’s modify the index values to the following:

index = ['Item_AA','Item_BB','Item_CC','Item_CD','Item_CCC','Item_CA','Item_CB']

So the new DataFrame would look like this:

import pandas as pd

data = {'Product': ['Computer','Printer','Monitor','Desk','Phone','Tablet','Scanner'],
          'Price': [900,200,300,450,150,250,150]
        }

df = pd.DataFrame(data, columns = ['Product','Price'], index = ['Item_AA','Item_BB','Item_CC','Item_CD','Item_CCC','Item_CA','Item_CB'])

print(df)

Run the code, and you’ll see the new index on the left side of the DataFrame:

How to Filter Pandas DataFrame Based on Index

If, for example, you set like = ‘Item_C’ then you’ll get multiple items (as the string of Item_C appears in 5 locations):

import pandas as pd

data = {'Product': ['Computer','Printer','Monitor','Desk','Phone','Tablet','Scanner'],
          'Price': [900,200,300,450,150,250,150]
        }

df = pd.DataFrame(data, columns = ['Product','Price'], index = ['Item_AA','Item_BB','Item_CC','Item_CD','Item_CCC','Item_CA','Item_CB'])

df = df.filter(like = 'Item_C', axis=0)

print(df)

As you can see, there are 5 index values where the string of ‘Item_C’ can be found:

Filter Pandas DataFrame Based on Index

Optionally, let’s set like = ‘CC’ as follows:

import pandas as pd

data = {'Product': ['Computer','Printer','Monitor','Desk','Phone','Tablet','Scanner'],
          'Price': [900,200,300,450,150,250,150]
        }

df = pd.DataFrame(data, columns = ['Product','Price'], index = ['Item_AA','Item_BB','Item_CC','Item_CD','Item_CCC','Item_CA','Item_CB'])

df = df.filter(like = 'CC', axis=0)

print(df)

In that case, only two items (which contain ‘CC’) will be selected:

How to Filter Pandas DataFrame Based on Index

You can learn more about df.filter by checking the Pandas Documentation.