Here are 2 ways to concatenate column values in Pandas DataFrame:
(1) For strings only:
df["new_column_name"] = df["1st_column_name"] + df["2nd_column_name"] + ...
(2) For a mix of strings and integers (or just integers):
df["new_column_name"] = df["1st_column_name"].map(str) + df["2nd_column_name"].map(str) + ...
Examples
Example 1: Concatenating values under a single DataFrame
Assume that you have the following dataset which contains 3 columns:
day | month | year |
1 | Jun | 2019 |
2 | Jul | 2020 |
3 | Aug | 2021 |
4 | Sep | 2022 |
5 | Oct | 2023 |
The goal is to concatenate the column values as captured below:
day-month-year
To begin, create a DataFrame to capture the above values in Python:
import pandas as pd
data = {
"day": [1, 2, 3, 4, 5],
"month": ["Jun", "Jul", "Aug", "Sep", "Oct"],
"year": [2019, 2020, 2021, 2022, 2023],
}
df = pd.DataFrame(data)
print(df)
Run the code, and you’ll get the following DataFrame:
day month year
0 1 Jun 2019
1 2 Jul 2020
2 3 Aug 2021
3 4 Sep 2022
4 5 Oct 2023
Next, add the following syntax to perform the concatenation (using “-” as a separator):
df["full_date"] = df["day"].map(str) + "-" + df["month"].map(str) + "-" + df["year"].map(str)
So the complete Python code would look like this:
import pandas as pd
data = {
"day": [1, 2, 3, 4, 5],
"month": ["Jun", "Jul", "Aug", "Sep", "Oct"],
"year": [2019, 2020, 2021, 2022, 2023],
}
df = pd.DataFrame(data)
df["full_date"] = df["day"].map(str) + "-" + df["month"].map(str) + "-" + df["year"].map(str)
print(df)
Run the code, and you’ll get the concatenated full date (as highlighted in yellow):
day month year full_date
0 1 Jun 2019 1-Jun-2019
1 2 Jul 2020 2-Jul-2020
2 3 Aug 2021 3-Aug-2021
3 4 Sep 2022 4-Sep-2022
4 5 Oct 2023 5-Oct-2023
Example 2: Concatenating column values from two separate DataFrames
In the previous example, you saw how to create the first DataFrame based on this data:
day | month | year |
1 | Jun | 2019 |
2 | Jul | 2020 |
3 | Aug | 2021 |
4 | Sep | 2022 |
5 | Oct | 2023 |
Now create a second DataFrame based on this data:
unemployment_rate | interest_rate |
5.5 | 1.75 |
5 | 1.5 |
5.2 | 1.25 |
5.1 | 1.5 |
4.9 | 2 |
The goal is to concatenate the values from the two DataFrames as follows:
day-month-year: unemployment_rate; interest_rate
To accomplish this goal, you may apply the following code:
import pandas as pd
data1 = {
"day": [1, 2, 3, 4, 5],
"month": ["Jun", "Jul", "Aug", "Sep", "Oct"],
"year": [2019, 2020, 2021, 2022, 2023],
}
df1 = pd.DataFrame(data1)
data2 = {
"unemployment_rate": [5.5, 5, 5.2, 5.1, 4.9],
"interest_rate": [1.75, 1.5, 1.25, 1.5, 2],
}
df2 = pd.DataFrame(data2)
combined_values = (
df1["day"].map(str)
+ "-"
+ df1["month"].map(str)
+ "-"
+ df1["year"].map(str)
+ ": "
+ "Unemployment: "
+ df2["unemployment_rate"].map(str)
+ "; "
+ "Interest: "
+ df2["interest_rate"].map(str)
)
print(combined_values)
The result:
0 1-Jun-2019: Unemployment: 5.5; Interest: 1.75
1 2-Jul-2020: Unemployment: 5.0; Interest: 1.5
2 3-Aug-2021: Unemployment: 5.2; Interest: 1.25
3 4-Sep-2022: Unemployment: 5.1; Interest: 1.5
4 5-Oct-2023: Unemployment: 4.9; Interest: 2.0
Example 3: Concatenating values, and then finding the Max
In the last example, you’ll see how to concatenate 2 DataFrames (which contain only numeric values), and then find the maximum value:
import pandas as pd
data1 = {"set1": [55, 22, 11, 77, 33]}
df1 = pd.DataFrame(data1)
data2 = {"set2": [23, 45, 21, 73, 48]}
df2 = pd.DataFrame(data2)
concatenated = df1["set1"].map(str) + df2["set2"].map(str)
combined = pd.DataFrame(concatenated)
max_value = combined.max()
print(max_value)
And the result that you’ll get is 7773, which is indeed the maximum value:
7773