Depending on your needs, you may use any of the following approaches to replace values in a Pandas DataFrame:
(1) Replace a single value with a new value for an individual DataFrame column:
df['column name'] = df['column name'].replace(['old value'], 'new value')
(2) Replace multiple values with a new value for an individual DataFrame column:
df['column name'] = df['column name'].replace(['1st old value', '2nd old value', ...], 'new value')
(3) Replace multiple values with multiple new values for an individual DataFrame column:
df['column name'] = df['column name'].replace(['1st old value', '2nd old value', ...], ['1st new value', '2nd new value', ...])
(4) Replace a single value with a new value for an entire DataFrame:
df = df.replace(['old value'], 'new value')
In the next section, you’ll see how to apply the above templates in practice.
Steps to Replace Values in Pandas DataFrame
Step 1: Gather your Data
To begin, gather your data with the values that you’d like to replace.
For example, let’s gather the following data about different colors:
| first_set | second_set |
| Green     | Yellow     |
| Green     | Yellow     |
| Green     | Yellow     |
| Blue      | White      |
| Blue      | White      |
| Red       | Blue       |
| Red       | Blue       |
| Red       | Blue       |
You’ll later see how to replace some of the colors in the above table.
Step 2: Create the DataFrame
Next, create the DataFrame based on the data that was captured in step 1:
import pandas as pd

colors = {'first_set': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
          'second_set': ['Yellow', 'Yellow', 'Yellow', 'White', 'White', 'Blue', 'Blue', 'Blue']
          }

df = pd.DataFrame(colors)
print(df)
Run the code in Python, and you’ll see the following DataFrame:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3      Blue      White
4      Blue      White
5       Red       Blue
6       Red       Blue
7       Red       Blue
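Before replacing anything, it can help to check which distinct values a column actually holds. The snippet below is a minimal sketch of one way to do that, using value_counts() on the 'first_set' column; the variable name counts is only illustrative.

```python
import pandas as pd

colors = {'first_set': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
          'second_set': ['Yellow', 'Yellow', 'Yellow', 'White', 'White', 'Blue', 'Blue', 'Blue']
          }

df = pd.DataFrame(colors)

# Count how many times each distinct value appears in the column
counts = df['first_set'].value_counts()
print(counts)
```

The counts give you the exact spellings of the values that are candidates for replacement.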
Step 3: Replace Values in Pandas DataFrame
Let’s now replace all the ‘Blue’ values with ‘Green’ under the ‘first_set’ column.
You may then use the following template to accomplish this goal:
df['column name'] = df['column name'].replace(['old value'], 'new value')
And this is the complete Python code for our example:
import pandas as pd

colors = {'first_set': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
          'second_set': ['Yellow', 'Yellow', 'Yellow', 'White', 'White', 'Blue', 'Blue', 'Blue']
          }

df = pd.DataFrame(colors)

# Replace 'Blue' with 'Green' in the 'first_set' column only
df['first_set'] = df['first_set'].replace(['Blue'], 'Green')
print(df)
Run the code, and you’ll notice that all the ‘Blue’ values were replaced with ‘Green’ under the ‘first_set’ column:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3     Green      White
4     Green      White
5       Red       Blue
6       Red       Blue
7       Red       Blue
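Note that the template assigns the result back to the column. With the default arguments, replace() returns a new object rather than changing the original, so the assignment is what makes the change stick. A minimal sketch on a shortened column (the variable name replaced is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'first_set': ['Green', 'Blue', 'Red']})

# Call replace() without assigning the result back to the column
replaced = df['first_set'].replace(['Blue'], 'Green')

print(df['first_set'].tolist())   # the original column is unchanged
print(replaced.tolist())          # the returned Series holds the replacement
```

If you skip the assignment, the DataFrame keeps its old values.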
But what if you want to replace multiple values with a new value for an individual DataFrame column?
If that’s the case, you may use this template:
df['column name'] = df['column name'].replace(['1st old value', '2nd old value', ...], 'new value')
Let’s say that you’d like to replace the ‘Blue’ and ‘Red’ colors with a ‘Green’ color under the ‘first_set’ column.
This is the syntax that you may use in Python:
import pandas as pd

colors = {'first_set': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
          'second_set': ['Yellow', 'Yellow', 'Yellow', 'White', 'White', 'Blue', 'Blue', 'Blue']
          }

df = pd.DataFrame(colors)

# Replace both 'Blue' and 'Red' with 'Green' in the 'first_set' column
df['first_set'] = df['first_set'].replace(['Blue', 'Red'], 'Green')
print(df)
You’ll now notice that both the ‘Blue’ and ‘Red’ colors were replaced with ‘Green’ under the ‘first_set’ column:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3     Green      White
4     Green      White
5     Green       Blue
6     Green       Blue
7     Green       Blue
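Two details of the list-based template are worth keeping in mind: by default, replace() matches whole cell values rather than substrings, and any old value that does not appear in the column is simply skipped. The sketch below illustrates both on a made-up Series; the values ‘Light Blue’ and ‘Purple’ are assumptions added for the demonstration and are not part of the colors data.

```python
import pandas as pd

# Hypothetical values chosen to show exact matching
first_set = pd.Series(['Blue', 'Light Blue', 'Red'])

# 'Purple' is not in the Series, and 'Light Blue' is not an exact match for 'Blue'
result = first_set.replace(['Blue', 'Purple'], 'Green')
print(result.tolist())
```

Only the cell that equals ‘Blue’ exactly is changed; the other two values are left alone.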
Suppose that you want to replace multiple values with multiple new values for an individual DataFrame column.
In that case, you may use this template:
df['column name'] = df['column name'].replace(['1st old value', '2nd old value', ...], ['1st new value', '2nd new value', ...])
Let’s say that you want to replace:
- The ‘Blue’ color with a ‘Green’ color; and
- The ‘Red’ color with a ‘White’ color
You can then apply this code in Python:
import pandas as pd

colors = {'first_set': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
          'second_set': ['Yellow', 'Yellow', 'Yellow', 'White', 'White', 'Blue', 'Blue', 'Blue']
          }

df = pd.DataFrame(colors)

# Replace 'Blue' with 'Green' and 'Red' with 'White' in the 'first_set' column
df['first_set'] = df['first_set'].replace(['Blue', 'Red'], ['Green', 'White'])
print(df)
You’ll notice that ‘Blue’ became ‘Green’ and ‘Red’ became ‘White’ under the ‘first_set’ column:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3     Green      White
4     Green      White
5     White       Blue
6     White       Blue
7     White       Blue
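As an alternative to two parallel lists, replace() also accepts a dictionary that maps each old value to its new value. This is not one of the templates above, but some readers find it easier to keep the pairs aligned this way. A minimal sketch on a shortened column (the names from_lists and from_dict are only illustrative):

```python
import pandas as pd

first_set = pd.Series(['Green', 'Blue', 'Red'])

# Two parallel lists: the old values and their replacements, in the same order
from_lists = first_set.replace(['Blue', 'Red'], ['Green', 'White'])

# A dictionary mapping each old value to its new value
from_dict = first_set.replace({'Blue': 'Green', 'Red': 'White'})

print(from_dict.tolist())
print(from_lists.equals(from_dict))   # both forms give the same result here
```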
So far you have seen how to replace values under an individual DataFrame column.
But what if you’d like to replace a value across the entire DataFrame?
In that case, you may use the following template:
df = df.replace(['old value'], 'new value')
For example, you may run the code below in order to replace the ‘Blue’ color with a ‘Green’ color throughout the entire DataFrame:
import pandas as pd

colors = {'first_set': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
          'second_set': ['Yellow', 'Yellow', 'Yellow', 'White', 'White', 'Blue', 'Blue', 'Blue']
          }

df = pd.DataFrame(colors)

# Replace 'Blue' with 'Green' in every column of the DataFrame
df = df.replace(['Blue'], 'Green')
print(df)
Once you run the code, you’ll see that ‘Blue’ became ‘Green’ across all the columns in the DataFrame:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3     Green      White
4     Green      White
5       Red      Green
6       Red      Green
7       Red      Green
And if you decide, for example, to replace two colors, such as ‘Blue’ and ‘Red’, with ‘Green’, then you may use this syntax:
import pandas as pd

colors = {'first_set': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
          'second_set': ['Yellow', 'Yellow', 'Yellow', 'White', 'White', 'Blue', 'Blue', 'Blue']
          }

df = pd.DataFrame(colors)

# Replace both 'Blue' and 'Red' with 'Green' in every column of the DataFrame
df = df.replace(['Blue', 'Red'], 'Green')
print(df)
Both the ‘Blue’ and ‘Red’ colors would be replaced with ‘Green’ across the entire DataFrame:
  first_set second_set
0     Green     Yellow
1     Green     Yellow
2     Green     Yellow
3     Green      White
4     Green      White
5     Green      Green
6     Green      Green
7     Green      Green
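If you call replace() on the whole DataFrame but only want certain columns affected, one option is a nested dictionary of the form {column: {old value: new value}}. This goes slightly beyond the templates above, so treat it as a sketch rather than the only way to do it; the variable name result is only illustrative.

```python
import pandas as pd

colors = {'first_set': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
          'second_set': ['Yellow', 'Yellow', 'Yellow', 'White', 'White', 'Blue', 'Blue', 'Blue']
          }

df = pd.DataFrame(colors)

# Replace 'Blue' with 'Green' only in 'second_set'; 'first_set' is not touched
result = df.replace({'second_set': {'Blue': 'Green'}})
print(result)
```

Here the ‘Blue’ values under ‘first_set’ stay as they are, while those under ‘second_set’ become ‘Green’.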