Depending on your needs, you may use any of the following templates to replace values in a Pandas DataFrame:
(1) Replace a single value with a new value for an individual DataFrame column:
df['column name'] = df['column name'].replace(['old value'],'new value')
(2) Replace multiple values with a new value for an individual DataFrame column:
df['column name'] = df['column name'].replace(['1st old value','2nd old value',...],'new value')
(3) Replace multiple values with multiple new values for an individual DataFrame column:
df['column name'] = df['column name'].replace(['1st old value','2nd old value',...],['1st new value','2nd new value',...])
(4) Replace a single value with a new value for an entire DataFrame:
df = df.replace(['old value'],'new value')
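Each template assigns the result back (to the column or to df) because replace returns a new object by default rather than changing the original. The short sketch below illustrates this on a hypothetical one-column DataFrame; the 'color' column and its values are made up for illustration only:

```python
import pandas as pd

# Hypothetical one-column DataFrame, used only for illustration
df = pd.DataFrame({'color': ['Blue', 'Red', 'Blue']})

# replace returns a new Series; the DataFrame itself is not modified here
replaced = df['color'].replace(['Blue'], 'Green')
before_assignment = df['color'].tolist()   # still contains 'Blue'

# Assigning the result back, as the templates do, updates the column
df['color'] = replaced
after_assignment = df['color'].tolist()    # 'Blue' is now 'Green'

print(before_assignment)
print(after_assignment)
```

If you skip the assignment, the DataFrame keeps its original values.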
In the next section, you’ll see how to apply the above templates in practice.
Steps to Replace Values in Pandas DataFrame
Step 1: Gather your Data
To begin, gather your data with the values that you’d like to replace.
For example, I gathered the following data about different colors:
first_set | second_set |
Green | Yellow |
Green | Yellow |
Green | Yellow |
Blue | White |
Blue | White |
Red | Blue |
Red | Blue |
Red | Blue |
You’ll later see how to replace some of the colors in the above table.
Step 2: Create the DataFrame
Next, create the DataFrame based on the data that was captured in step 1:
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

print(df)
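Run the code in Python, and you’ll see the following DataFrame:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3      Blue      White
4      Blue      White
5       Red       Blue
6       Red       Blue
7       Red       Blue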
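Before replacing anything, it may help to check which distinct values each column actually holds. The sketch below is an optional extra rather than one of the templates; it uses value_counts to count the occurrences of each color:

```python
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

# Count how often each distinct value occurs in each column
first_counts = df['first_set'].value_counts().to_dict()
second_counts = df['second_set'].value_counts().to_dict()

print(first_counts)
print(second_counts)
```

Note that ‘Blue’ appears in both columns, which matters when you choose between a single-column replacement and a replacement across the entire DataFrame.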
Step 3: Replace Values in Pandas DataFrame
Let’s now replace all the ‘Blue’ values with ‘Green’ values under the ‘first_set’ column.
You may then use the following template to accomplish this goal:
df['column name'] = df['column name'].replace(['old value'],'new value')
And this is the complete Python code for our example:
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

df['first_set'] = df['first_set'].replace(['Blue'],'Green')

print(df)
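Run the code, and you’ll notice that all the ‘Blue’ values got replaced with ‘Green’ values under the first column:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3     Green      White
4     Green      White
5       Red       Blue
6       Red       Blue
7       Red       Blue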
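If you prefer to confirm the change programmatically rather than by reading the printed table, one possible check is to count the remaining ‘Blue’ cells in each column. This is a minimal sketch; the variable names blue_in_first and blue_in_second are only illustrative:

```python
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

df['first_set'] = df['first_set'].replace(['Blue'],'Green')

# Number of 'Blue' cells left in each column after the replacement
blue_in_first = int((df['first_set'] == 'Blue').sum())
blue_in_second = int((df['second_set'] == 'Blue').sum())

print(blue_in_first)    # the column that was replaced
print(blue_in_second)   # the column that was left untouched
```

The ‘Blue’ values under ‘second_set’ remain in place, since the replacement was applied to the ‘first_set’ column only.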
But what if you want to replace multiple values with a new value for an individual DataFrame column?
If that’s the case, you may use this template:
df['column name'] = df['column name'].replace(['1st old value','2nd old value',...],'new value')
Let’s say that you’d like to replace the ‘Blue’ and ‘Red’ colors with a ‘Green’ color under the ‘first_set’ column.
This is the syntax that you may use in Python:
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

df['first_set'] = df['first_set'].replace(['Blue','Red'],'Green')

print(df)
You’ll now notice that both the ‘Blue’ and ‘Red’ colors got replaced with a ‘Green’ color under the first column:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3     Green      White
4     Green      White
5     Green       Blue
6     Green       Blue
7     Green       Blue
Suppose that you want to replace multiple values with multiple new values for an individual DataFrame column.
In that case, you may use this template:
df['column name'] = df['column name'].replace(['1st old value','2nd old value',...],['1st new value','2nd new value',...])
Let’s say that you want to replace:
- The ‘Blue’ color with a ‘Green’ color; and
- The ‘Red’ color with a ‘White’ color
You can then apply this code in Python:
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

df['first_set'] = df['first_set'].replace(['Blue','Red'],['Green','White'])

print(df)
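You’ll notice that the ‘Blue’ became ‘Green’ and the ‘Red’ became ‘White’ under the first column:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3     Green      White
4     Green      White
5     White       Blue
6     White       Blue
7     White       Blue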
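When pairing several old values with several new values, you may find a dictionary easier to read than two parallel lists, since each old value sits next to its replacement. The sketch below is an alternative spelling, not one of the templates above; it relies on replace accepting a mapping of old values to new values, and compares the two forms:

```python
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

# Two parallel lists: old values, then new values in the same order
list_form = df['first_set'].replace(['Blue','Red'],['Green','White'])

# The same pairing written as a dictionary of old value -> new value
dict_form = df['first_set'].replace({'Blue': 'Green', 'Red': 'White'})

# True when both Series hold identical values
same = list_form.equals(dict_form)

print(dict_form.tolist())
print(same)
```

With the list form, the two lists must have the same length, so that each old value lines up with exactly one new value.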
So far you have seen how to replace the values under an individual DataFrame column.
But what if you’d like to replace a value across the entire DataFrame?
In that case, you may use the following template:
df = df.replace(['old value'],'new value')
For example, you may run the code below in order to replace the ‘Blue’ color with a ‘Green’ color throughout the entire DataFrame:
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

df = df.replace(['Blue'],'Green')

print(df)
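Once you run the code, you’ll see that ‘Blue’ became ‘Green’ across all the columns in the DataFrame:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3     Green      White
4     Green      White
5       Red      Green
6       Red      Green
7       Red      Green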
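To see the effect of a replacement across the entire DataFrame in numbers, you may count the matching cells over all columns before and after the call. This is a minimal sketch; the names blue_before, blue_after and green_after are only illustrative:

```python
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

# Count the 'Blue' cells over every column before the replacement
blue_before = int((df == 'Blue').sum().sum())

df = df.replace(['Blue'],'Green')

# Count again after the replacement, along with the 'Green' cells
blue_after = int((df == 'Blue').sum().sum())
green_after = int((df == 'Green').sum().sum())

print(blue_before, blue_after, green_after)
```

Here the comparison df == 'Blue' produces a DataFrame of True/False values, and the two sum calls add them up first per column and then overall.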
And if you decide, for example, to replace two colors, such as ‘Blue’ and ‘Red’, with ‘Green’, you may use this syntax:
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

df = df.replace(['Blue','Red'],'Green')

print(df)
Both the ‘Blue’ and ‘Red’ colors would now get replaced with ‘Green’ across the entire DataFrame:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3     Green      White
4     Green      White
5     Green      Green
6     Green      Green
7     Green      Green
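If you need different rules for different columns in a single call, one option is to pass a nested dictionary to replace, where each outer key is a column name and each inner dictionary maps old values to new values. The sketch below goes beyond the four templates above, and the choice of replacement colors is only an assumption for illustration:

```python
import pandas as pd

colors = {'first_set': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
          'second_set': ['Yellow','Yellow','Yellow','White','White','Blue','Blue','Blue']
          }

df = pd.DataFrame(colors, columns=['first_set','second_set'])

# Outer keys are column names; inner dictionaries map old value -> new value
df = df.replace({'first_set': {'Blue': 'Green'},
                 'second_set': {'Blue': 'Yellow'}})

first_values = df['first_set'].tolist()
second_values = df['second_set'].tolist()

print(df)
```

In this sketch, ‘Blue’ becomes ‘Green’ under ‘first_set’ but ‘Yellow’ under ‘second_set’, while all the other colors stay as they were.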