To filter Pandas DataFrame based on the index:
df_filter = df.filter(items=[index to keep], axis=0)
Example
Suppose that you created the DataFrame below:
import pandas as pd
data = {
"Product": ["Computer", "Printer", "Monitor", "Desk", "Phone", "Tablet", "Scanner"],
"Price": [900, 200, 300, 450, 150, 250, 150],
}
df = pd.DataFrame(data)
print(df)
As you can see in yellow, the index values are located on the left, starting from 0 and ending at 6:
Product Price
0 Computer 900
1 Printer 200
2 Monitor 300
3 Desk 450
4 Phone 150
5 Tablet 250
6 Scanner 150
Example 1: Filter Pandas DataFrame Based on the Index
To select the row with the index of 2 (for the “Monitor’” product) while filtering out all the other rows:
df_filter = df.filter(items=[2], axis=0)
So the complete Python code is:
import pandas as pd
data = {
"Product": ["Computer", "Printer", "Monitor", "Desk", "Phone", "Tablet", "Scanner"],
"Price": [900, 200, 300, 450, 150, 250, 150],
}
df = pd.DataFrame(data)
df_filter = df.filter(items=[2], axis=0)
print(df_filter)
Notice that only the row with the index of 2 is kept, while all the other rows are filtered out:
Product Price
2 Monitor 300
What if you’d like to keep multiple rows based on the index values?
For example, to keep the “Monitor” (with an index of 2) and the “Tablet” (with an index of 5), set items = [2, 5]:
import pandas as pd
data = {
"Product": ["Computer", "Printer", "Monitor", "Desk", "Phone", "Tablet", "Scanner"],
"Price": [900, 200, 300, 450, 150, 250, 150],
}
df = pd.DataFrame(data)
df_filter = df.filter(items=[2, 5], axis=0)
print(df_filter)
You’ll now get the two items:
Product Price
2 Monitor 300
5 Tablet 250
Example 2: Filter Pandas DataFrame for a Non-numeric Index
Next, assign a non-numeric index to the DataFrame:
index=["Item_A", "Item_B", "Item_C", "Item_D", "Item_E", "Item_F", "Item_G"]
The new DataFrame would look as follows:
import pandas as pd
data = {
"Product": ["Computer", "Printer", "Monitor", "Desk", "Phone", "Tablet", "Scanner"],
"Price": [900, 200, 300, 450, 150, 250, 150],
}
df = pd.DataFrame(
data, index=["Item_A", "Item_B", "Item_C", "Item_D", "Item_E", "Item_F", "Item_G"]
)
print(df)
You’ll now see the non-numeric index on the left side:
Product Price
Item_A Computer 900
Item_B Printer 200
Item_C Monitor 300
Item_D Desk 450
Item_E Phone 150
Item_F Tablet 250
Item_G Scanner 150
To keep the row with the index of “Item_C” while filtering out all the other items:
import pandas as pd
data = {
"Product": ["Computer", "Printer", "Monitor", "Desk", "Phone", "Tablet", "Scanner"],
"Price": [900, 200, 300, 450, 150, 250, 150],
}
df = pd.DataFrame(
data, index=["Item_A", "Item_B", "Item_C", "Item_D", "Item_E", "Item_F", "Item_G"]
)
df_filter = df.filter(items=["Item_C"], axis=0)
print(df_filter)
As you can see, only the row with the index of “Item_C” is kept:
Product Price
Item_C Monitor 300
Alternatively, you can select the same row with the index of “Item_C” by setting like = “C” (as the letter “C” doesn’t appear in any of the other index values):
import pandas as pd
data = {
"Product": ["Computer", "Printer", "Monitor", "Desk", "Phone", "Tablet", "Scanner"],
"Price": [900, 200, 300, 450, 150, 250, 150],
}
df = pd.DataFrame(
data, index=["Item_A", "Item_B", "Item_C", "Item_D", "Item_E", "Item_F", "Item_G"]
)
df_filter = df.filter(like="C", axis=0)
print(df_filter)
The result is the same row with the index of “Item_C”:
Product Price
Item_C Monitor 300
You can learn more about df.filter by checking the Pandas Documentation.