How to Set Column as Index in Pandas DataFrame

Depending on your needs, you may use either of the two approaches below to set a column as the index in a Pandas DataFrame:

(1) Set a single column as Index:

df.set_index('column')

(2) Set multiple columns as MultiIndex:

df.set_index(['column_1','column_2',...])

Next, you’ll see the steps to apply the above approaches using simple examples.

Steps to Set Column as Index in Pandas DataFrame

Step 1: Create the DataFrame

To start with a simple example, let’s say that you’d like to create a DataFrame given the following data:

Product   Brand   Price
AAA       A       200
BBB       B       700
CCC       C       400
DDD       D       1200
EEE       E       900

You may then run the code below to create the DataFrame:

import pandas as pd

data = {'Product': ['AAA','BBB','CCC','DDD','EEE'],
          'Brand': ['A','B','C','D','E'],
          'Price': [200,700,400,1200,900]
        }

df = pd.DataFrame(data, columns = ['Product','Brand','Price'])

print(df)

You’ll now get the following DataFrame:

  Product   Brand   Price
0     AAA       A     200
1     BBB       B     700
2     CCC       C     400
3     DDD       D    1200
4     EEE       E     900

As you can see, the current index (the unlabeled leftmost column) contains sequential numeric values starting from zero. Next, you’ll see how to change that default index.
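If you’d like to confirm the default index in code rather than by eye, you may inspect df.index. The snippet below is a minimal sketch that rebuilds the same DataFrame; the variable name index_values is only an illustrative choice:

```python
import pandas as pd

data = {'Product': ['AAA','BBB','CCC','DDD','EEE'],
        'Brand': ['A','B','C','D','E'],
        'Price': [200,700,400,1200,900]
        }

df = pd.DataFrame(data)

# When no index is specified, pandas assigns a RangeIndex that counts up from zero
print(df.index)

# Converting the index to a list shows the individual labels
index_values = list(df.index)
print(index_values)
```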

Step 2: Set a single column as Index in Pandas DataFrame

You may use the following approach in order to set a single column as the index in the DataFrame:

df.set_index('column')

For example, let’s say that you’d like to set the ‘Product’ column as the index. In that case, you may run the code below:

import pandas as pd

data = {'Product': ['AAA','BBB','CCC','DDD','EEE'],
          'Brand': ['A','B','C','D','E'],
          'Price': [200,700,400,1200,900]
        }

df = pd.DataFrame(data, columns = ['Product','Brand','Price'])

df = df.set_index('Product')

print(df)

As you can see, the ‘Product’ column is now the index:

        Brand  Price
Product             
AAA         A    200
BBB         B    700
CCC         C    400
DDD         D   1200
EEE         E    900
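Note that the code above assigns the result back to df. That is because set_index returns a new DataFrame and leaves the original unchanged by default. The sketch below illustrates this, along with the drop=False option (which keeps the column in addition to using it as the index) and a label-based lookup with .loc. The names indexed, kept and ccc_price are illustrative only:

```python
import pandas as pd

data = {'Product': ['AAA','BBB','CCC','DDD','EEE'],
        'Brand': ['A','B','C','D','E'],
        'Price': [200,700,400,1200,900]
        }

df = pd.DataFrame(data)

# set_index returns a new DataFrame; df itself is not modified
indexed = df.set_index('Product')
print(list(df.columns))       # 'Product' is still a regular column in df
print(list(indexed.columns))  # 'Product' has moved to the index in indexed

# drop=False keeps 'Product' as a column while also using it as the index
kept = df.set_index('Product', drop=False)
print(list(kept.columns))

# Once 'Product' is the index, rows can be selected by label with .loc
ccc_price = indexed.loc['CCC', 'Price']
print(ccc_price)
```

Assigning the result to a new name, as done here, may be convenient when you still need the original DataFrame later on.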

Step 3 (optional): Set multiple columns as MultiIndex

Alternatively, you may use this approach to set multiple columns as the MultiIndex:

df.set_index(['column_1','column_2',...])

For instance, let’s say that you’d like to set both the ‘Product’ and the ‘Brand’ columns as the MultiIndex. In that case, you may run this code:

import pandas as pd

data = {'Product': ['AAA','BBB','CCC','DDD','EEE'],
          'Brand': ['A','B','C','D','E'],
          'Price': [200,700,400,1200,900]
        }

df = pd.DataFrame(data, columns = ['Product','Brand','Price'])

df = df.set_index(['Product','Brand'])

print(df)

As you may observe, both the ‘Product’ and the ‘Brand’ columns became the new MultiIndex:

               Price
Product Brand       
AAA     A        200
BBB     B        700
CCC     C        400
DDD     D       1200
EEE     E        900
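With a MultiIndex in place, a row is identified by a tuple holding one label per index level. The sketch below shows one way to look up a single value with such a tuple, and how reset_index may be used to turn the index levels back into regular columns. The names price and restored are illustrative only:

```python
import pandas as pd

data = {'Product': ['AAA','BBB','CCC','DDD','EEE'],
        'Brand': ['A','B','C','D','E'],
        'Price': [200,700,400,1200,900]
        }

df = pd.DataFrame(data)
df = df.set_index(['Product','Brand'])

# Each index level keeps the name of the column it came from
print(list(df.index.names))

# Select one value by passing a (Product, Brand) tuple as the row label
price = df.loc[('DDD','D'), 'Price']
print(price)

# reset_index moves the index levels back into regular columns
restored = df.reset_index()
print(restored)
```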

You may also want to check the Pandas Documentation for further information about df.set_index.
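One option described there is verify_integrity. By default, set_index accepts a column that contains repeated values, which results in a non-unique index. The sketch below uses a small hypothetical DataFrame (dup, which is not part of the example above) to show how verify_integrity=True raises a ValueError in that situation:

```python
import pandas as pd

# Hypothetical data with a repeated 'Product' value (not from the example above)
dup = pd.DataFrame({'Product': ['AAA','AAA','BBB'],
                    'Price': [200,250,700]})

# By default, duplicate values are accepted and the index is not unique
allowed = dup.set_index('Product')
print(allowed.index.is_unique)

# With verify_integrity=True, duplicates in the new index raise a ValueError
try:
    dup.set_index('Product', verify_integrity=True)
    raised = False
except ValueError as error:
    raised = True
    print(error)
```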