Concatenate Column Values in Pandas DataFrame

Here are 2 ways to concatenate column values in Pandas DataFrame:

(1) For strings only:

df["new_column_name"] = df["1st_column_name"] + df["2nd_column_name"] + ...

(2) For a mix of strings and integers (or just integers):

df["new_column_name"] = df["1st_column_name"].map(str) + df["2nd_column_name"].map(str) + ...

Examples

Example 1: Concatenating values under a single DataFrame

Assume that you have the following dataset which contains 3 columns:

daymonthyear
1Jun2019
2Jul2020
3Aug2021
4Sep2022
5Oct2023

The goal is to concatenate the column values as captured below:

day-month-year

To begin, create a DataFrame to capture the above values in Python:

import pandas as pd

data = {
"day": [1, 2, 3, 4, 5],
"month": ["Jun", "Jul", "Aug", "Sep", "Oct"],
"year": [2019, 2020, 2021, 2022, 2023],
}

df = pd.DataFrame(data)

print(df)

Run the code, and you’ll get the following DataFrame:

   day  month  year
0    1    Jun  2019
1    2    Jul  2020
2    3    Aug  2021
3    4    Sep  2022
4    5    Oct  2023

Next, add the following syntax to perform the concatenation (using “-” as a separator):

df["full_date"] = df["day"].map(str) + "-" + df["month"].map(str) + "-" + df["year"].map(str)

So the complete Python code would look like this:

import pandas as pd

data = {
"day": [1, 2, 3, 4, 5],
"month": ["Jun", "Jul", "Aug", "Sep", "Oct"],
"year": [2019, 2020, 2021, 2022, 2023],
}

df = pd.DataFrame(data)

df["full_date"] = df["day"].map(str) + "-" + df["month"].map(str) + "-" + df["year"].map(str)

print(df)

Run the code, and you’ll get the concatenated full date (as highlighted in yellow):

   day month  year   full_date
0    1   Jun  2019  1-Jun-2019
1    2   Jul  2020  2-Jul-2020
2    3   Aug  2021  3-Aug-2021
3    4   Sep  2022  4-Sep-2022
4    5   Oct  2023  5-Oct-2023

Example 2: Concatenating column values from two separate DataFrames

In the previous example, you saw how to create the first DataFrame based on this data:

daymonthyear
1Jun2019
2Jul2020
3Aug2021
4Sep2022
5Oct2023

Now create a second DataFrame based on this data:

unemployment_rateinterest_rate
5.51.75
51.5
5.21.25
5.11.5
4.92

The goal is to concatenate the values from the two DataFrames as follows:

day-month-year: unemployment_rate;  interest_rate

To accomplish this goal, you may apply the following code:

import pandas as pd

data1 = {
"day": [1, 2, 3, 4, 5],
"month": ["Jun", "Jul", "Aug", "Sep", "Oct"],
"year": [2019, 2020, 2021, 2022, 2023],
}

df1 = pd.DataFrame(data1)

data2 = {
"unemployment_rate": [5.5, 5, 5.2, 5.1, 4.9],
"interest_rate": [1.75, 1.5, 1.25, 1.5, 2],
}

df2 = pd.DataFrame(data2)

combined_values = (
df1["day"].map(str)
+ "-"
+ df1["month"].map(str)
+ "-"
+ df1["year"].map(str)
+ ": "
+ "Unemployment: "
+ df2["unemployment_rate"].map(str)
+ "; "
+ "Interest: "
+ df2["interest_rate"].map(str)
)

print(combined_values)

The result:

0    1-Jun-2019: Unemployment: 5.5; Interest: 1.75
1     2-Jul-2020: Unemployment: 5.0; Interest: 1.5
2    3-Aug-2021: Unemployment: 5.2; Interest: 1.25
3     4-Sep-2022: Unemployment: 5.1; Interest: 1.5
4     5-Oct-2023: Unemployment: 4.9; Interest: 2.0

Example 3: Concatenating values, and then finding the Max

In the last example, you’ll see how to concatenate 2 DataFrames (which contain only numeric values), and then find the maximum value:

import pandas as pd

data1 = {"set1": [55, 22, 11, 77, 33]}
df1 = pd.DataFrame(data1)

data2 = {"set2": [23, 45, 21, 73, 48]}
df2 = pd.DataFrame(data2)

concatenated = df1["set1"].map(str) + df2["set2"].map(str)

combined = pd.DataFrame(concatenated)
max_value = combined.max()

print(max_value)

And the result that you’ll get is 7773, which is indeed the maximum value:

7773

Leave a Comment