How to Set Column as Index in a pandas DataFrame
In this tutorial, you will learn how to set a column as index in a DataFrame.
TLDR solution
# single column as index
df.set_index('column', inplace=True)
# multiple columns as index
df.set_index(['column_a', 'column_b'], inplace=True)
Step-by-Step Example
Suppose you have a DataFrame of caught fish:
import pandas as pd
data = {'fish': ['salmon', 'pufferfish', 'shark'],
        'boat_id': [1, 2, 9],
        'count': [100, 10, 1],
        }
df = pd.DataFrame(data)
print(df)
         fish  boat_id  count
0      salmon        1    100
1  pufferfish        2     10
2       shark        9      1
Set a Column as Index
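You can set boat_id as the index using the set_index method.
By default, set_index returns a new DataFrame and leaves the original unchanged. To modify the existing DataFrame instead, set the inplace option to True.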
df.set_index('boat_id', inplace=True)
print(df)
               fish  count
boat_id
1            salmon    100
2        pufferfish     10
9             shark      1
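If you prefer to keep the original DataFrame untouched, a minimal sketch of the non-inplace variant might look like the following. The variable name indexed is an assumption chosen for illustration; the data is the same fish example as above.

```python
import pandas as pd

data = {'fish': ['salmon', 'pufferfish', 'shark'],
        'boat_id': [1, 2, 9],
        'count': [100, 10, 1],
        }
df = pd.DataFrame(data)

# Without inplace=True, set_index returns a new DataFrame
# and leaves df as it was.
indexed = df.set_index('boat_id')  # 'indexed' is an illustrative name

print(list(df.columns))         # df still has boat_id as a regular column
print(indexed.index.name)       # the new DataFrame is indexed by boat_id
print(indexed.loc[9, 'fish'])   # label-based lookup using a boat_id value
```

Assigning the result to a new name keeps both versions available, which can be handy when you still need boat_id as a regular column elsewhere.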
Set Multiple Columns as Index
To set multiple columns as the index of a DataFrame, pass a list of column names:
import pandas as pd
data = {'fish': ['salmon', 'pufferfish', 'shark'],
        'boat_id': [1, 2, 9],
        'count': [100, 10, 1],
        }
df = pd.DataFrame(data)
df.set_index(['boat_id', 'fish'], inplace=True)
print(df)
                    count
boat_id fish
1       salmon        100
2       pufferfish     10
9       shark           1
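To see what a multi-column index gives you, here is a short sketch that checks the index type and looks up one row by its (boat_id, fish) pair. The variable name row_count is an assumption for illustration only.

```python
import pandas as pd

data = {'fish': ['salmon', 'pufferfish', 'shark'],
        'boat_id': [1, 2, 9],
        'count': [100, 10, 1],
        }
df = pd.DataFrame(data)
df.set_index(['boat_id', 'fish'], inplace=True)

# Setting several columns as index produces a MultiIndex
print(isinstance(df.index, pd.MultiIndex))
print(list(df.index.names))

# Look up a single value with a tuple containing one label per index level
row_count = df.loc[(2, 'pufferfish'), 'count']  # illustrative name
print(row_count)
```

The tuple passed to loc must list the labels in the same order as the columns were given to set_index.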
That's it! You just learned how to set columns as index in a pandas DataFrame.