How to Replace Values in a DataFrame in R

Here is the syntax to replace values in a DataFrame in R:

(1) Replace a value across the entire DataFrame:

df[df == "Old Value"] <- "New Value"

(2) Replace a value under a single DataFrame column:

df["Column Name"][df["Column Name"] == "Old Value"] <- "New Value"

Next, you’ll see 4 scenarios that will describe how to:

  1. Replace a value across the entire DataFrame
  2. Replace multiple values
  3. Replace a value under a single DataFrame column
  4. Deal with factors to avoid the “invalid factor level” warning

Scenario 1: Replace a value across the entire DataFrame in R

To start with a simple example, let’s create a DataFrame in R that contains 4 columns:

df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue")
                 )
print(df)

Run the code, and you’ll get the following DataFrame:

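  Group_A Group_B Group_C Group_D
1      11     444    Blue  Yellow
2      11     444    Blue  Yellow
3      11      55    Blue  Yellow
4     222      55   Green   White
5     222      55   Green   White
6     222      55     Red    Blue
7      33      11     Red    Blue
8      33      11     Red    Blue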

Suppose that you’d like to replace 11 with 77 across the entire DataFrame.

In that case, you’ll need to add the following syntax to the code:

df[df == 11] <- 77

So the complete code to perform the replacement is:

df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue")
                 )

df[df == 11] <- 77

print(df)

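As you can see, 11 was replaced with 77 across the entire DataFrame:

  Group_A Group_B Group_C Group_D
1      77     444    Blue  Yellow
2      77     444    Blue  Yellow
3      77      55    Blue  Yellow
4     222      55   Green   White
5     222      55   Green   White
6     222      55     Red    Blue
7      33      77     Red    Blue
8      33      77     Red    Blue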
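Before overwriting anything, it can help to preview which cells the comparison actually matches. The sketch below is one way to do that, assuming the same DataFrame as above; the variable name matches is only illustrative.

```r
# Same DataFrame as in Scenario 1
df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue")
                 )

# Comparing a DataFrame with a value returns a logical matrix of the same shape
matches <- df == 11

# Number of cells that will be replaced (3 in Group_A plus 2 in Group_B)
print(sum(matches))

# Row and column positions of the matching cells
print(which(matches, arr.ind = TRUE))

# Reuse the same logical matrix to perform the replacement
df[matches] <- 77

print(df)
```

Storing the logical matrix once means the preview and the replacement are guaranteed to use the same condition.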

Scenario 2: Replace multiple values in a DataFrame in R

At times, you may need to replace multiple values in your DataFrame.

For example, let’s replace:

  • 11 with 77
  • 33 with 77

In that case, you can combine the two conditions with the element-wise OR operator (|) to perform the replacement as follows:

df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue")
                 )

df[df == 11 | df == 33] <- 77

print(df)

Here is the result:

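  Group_A Group_B Group_C Group_D
1      77     444    Blue  Yellow
2      77     444    Blue  Yellow
3      77      55    Blue  Yellow
4     222      55   Green   White
5     222      55   Green   White
6     222      55     Red    Blue
7      77      77     Red    Blue
8      77      77     Red    Blue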
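When the list of old values grows, chaining many conditions with | becomes hard to read. A possible alternative, shown here only as a sketch, is to test membership with %in% and apply the replacement to the numeric columns. The names old_values and num_cols are illustrative and not part of the original example.

```r
# Same DataFrame as in Scenario 2
df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue")
                 )

# Values to be replaced (illustrative name)
old_values <- c(11, 33)

# Restrict the replacement to numeric columns so the text columns are left untouched
num_cols <- vapply(df, is.numeric, logical(1))

# For each numeric column, replace every element found in old_values with 77
df[num_cols] <- lapply(df[num_cols], function(col) replace(col, col %in% old_values, 77))

print(df)
```

Here the set of old values lives in one vector, so adding another value to replace only requires extending old_values.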

Scenario 3: Replace a value under a single DataFrame column

What if you’d like to replace a value under a single DataFrame column?

For instance, let’s replace 11 with 77 under the ‘Group_B’ column.

To accomplish this goal, you’ll need to apply the following syntax:

df["Group_B"][df["Group_B"] == 11] <- 77

Therefore, the complete code to execute the replacement is:

df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue")
                 )

df["Group_B"][df["Group_B"] == 11] <- 77

print(df)

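The value 11 will be replaced with 77 only under the ‘Group_B’ column:

  Group_A Group_B Group_C Group_D
1      11     444    Blue  Yellow
2      11     444    Blue  Yellow
3      11      55    Blue  Yellow
4     222      55   Green   White
5     222      55   Green   White
6     222      55     Red    Blue
7      33      77     Red    Blue
8      33      77     Red    Blue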
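The same single-column replacement can also be written with the $ operator, which some readers find easier to scan. This is an equivalent sketch of the step above, not a different technique:

```r
# Same DataFrame as in Scenario 3
df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue")
                 )

# Select the column as a vector, then assign to the elements equal to 11
df$Group_B[df$Group_B == 11] <- 77

print(df)
```

Only Group_B is modified; the 11 values in Group_A stay as they are.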

Scenario 4: Dealing with factors

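So far, you have seen how to replace numeric values.

The last two columns contain strings. In versions of R before 4.0.0, data.frame() converts strings to factors by default (stringsAsFactors = TRUE), so these columns are stored as factors rather than character vectors. From R 4.0.0 onward, they stay character vectors unless you request factors explicitly.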
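To check how the string columns are stored in your own session, you can inspect the class of each column. The sketch below sets stringsAsFactors explicitly in both directions so that the result does not depend on the R version; df_factor and df_char are illustrative names.

```r
# Strings stored as factors (the default before R 4.0.0)
df_factor <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                        Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue"),
                        stringsAsFactors = TRUE)

# Strings stored as character vectors (the default from R 4.0.0 onward)
df_char <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                      Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue"),
                      stringsAsFactors = FALSE)

# Class of each column in both DataFrames
print(sapply(df_factor, class))
print(sapply(df_char, class))

# A factor only accepts values from its fixed set of levels
print(levels(df_factor$Group_D))
```

Because ‘Green’ is not among the levels of Group_D in df_factor, assigning it to that column is what triggers the “invalid factor level” warning.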

Let’s say that you’d like to replace the ‘Blue’ color with ‘Green’ using the code below:

df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue")
                 )

df[df == "Blue"] <- "Green"

print(df)

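If you run the above code while the string columns are factors (the default before R 4.0.0), you’ll get the following warning message, because ‘Green’ is not an existing level of the ‘Group_D’ column:

Warning message in `[<-.factor`(`*tmp*`, thisvar, value = "Green"):
"invalid factor level, NA generated"

To avoid this message, add stringsAsFactors = FALSE as the last argument of data.frame(), so that the strings are kept as character vectors: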

df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue")
                 ,stringsAsFactors = FALSE
                 )

df[df == "Blue"] <- "Green"

print(df)

You’ll now be able to replace all the ‘Blue’ values with ‘Green’ values without getting the previous warning message:

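  Group_A Group_B Group_C Group_D
1      11     444   Green  Yellow
2      11     444   Green  Yellow
3      11      55   Green  Yellow
4     222      55   Green   White
5     222      55   Green   White
6     222      55     Red   Green
7      33      11     Red   Green
8      33      11     Red   Green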

Similarly, you can replace the ‘Blue’ values with ‘Green’ values under a single DataFrame column, such as the ‘Group_D’ column:

df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue")
                 ,stringsAsFactors = FALSE
                 )

df["Group_D"][df["Group_D"] == "Blue"] <- "Green"

print(df)

Here is the result:

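  Group_A Group_B Group_C Group_D
1      11     444    Blue  Yellow
2      11     444    Blue  Yellow
3      11      55    Blue  Yellow
4     222      55   Green   White
5     222      55   Green   White
6     222      55     Red   Green
7      33      11     Red   Green
8      33      11     Red   Green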
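If you need to keep a column as a factor, for example because the DataFrame was created elsewhere, one possible approach is to rename the factor level instead of assigning a new string. The sketch below assumes the ‘Group_D’ column is a factor, which is forced here with stringsAsFactors = TRUE so the example behaves the same on any R version.

```r
# Same data, with the string columns explicitly stored as factors
df <- data.frame(Group_A =  c(11,11,11,222,222,222,33,33),
                 Group_B =  c(444,444,55,55,55,55,11,11),
                 Group_C =  c("Blue","Blue","Blue","Green","Green","Red","Red","Red"),
                 Group_D =  c("Yellow","Yellow","Yellow","White","White","Blue","Blue","Blue"),
                 stringsAsFactors = TRUE
                 )

# Rename the 'Blue' level of Group_D to 'Green'; every row using that level follows
levels(df$Group_D)[levels(df$Group_D) == "Blue"] <- "Green"

print(df)
print(levels(df$Group_D))
```

Renaming the level changes all matching rows at once and leaves the column as a factor, so no NA values are generated.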