How to Concatenate Column Values in Pandas DataFrame

In this short guide, you’ll see how to concatenate column values in a Pandas DataFrame.

To start, you may use this template to concatenate your column values (for strings only):

df['new_column_name'] = df['1st_column_name'] + df['2nd_column_name'] + ...

Notice that the plus symbol (‘+’) is used to perform the concatenation.

Note that this template works only when every column holds strings. If your dataset contains a combination of integers and strings, for example, the template raises a TypeError. Depending on your pandas and NumPy versions, the message may look like this:

TypeError: ufunc 'add' did not contain a loop with signature matching types

You can bypass this error by mapping the values to strings using the following syntax:

df['new_column_name'] = df['1st_column_name'].map(str) + df['2nd_column_name'].map(str) + ...
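
As a minimal sketch of the two templates above, the snippet below first triggers the error and then applies the string conversion. The column names `id`, `code`, `label` and `label_alt` and the sample values are made up for illustration. The `.astype(str)` call is shown as an alternative to `.map(str)` that gives the same result here.

```python
import pandas as pd

# Hypothetical sample data: one integer column and one string column
df = pd.DataFrame({'id': [1, 2, 3], 'code': ['A', 'B', 'C']})

try:
    # Adding an integer column to a string column is not defined
    df['label'] = df['id'] + df['code']
except TypeError as err:
    print('TypeError:', err)

# Converting the integer column to strings first turns '+' into string concatenation
df['label'] = df['id'].map(str) + df['code']

# .astype(str) performs the same conversion
df['label_alt'] = df['id'].astype(str) + df['code']

print(df)
```

Only the non-string column needs converting, although applying the conversion to every column (as in the template) is harmless.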

Next, you’ll see the following 3 examples that demonstrate how to concatenate column values in Pandas DataFrame:

  • Example 1: Concatenating values under a single DataFrame
  • Example 2: Concatenating column values from two separate DataFrames
  • Example 3: Concatenating values, and then finding the maximum value

Example 1: Concatenating values under a single DataFrame

Let’s say that you have the following dataset which contains 3 columns:

day month year
1 Jun 2016
2 Jul 2017
3 Aug 2018
4 Sep 2019
5 Oct 2020

The goal is to concatenate the column values as captured below:

day-month-year

To begin, you’ll need to create a DataFrame to capture the above values in Python:

import pandas as pd 

data = {'day': [1, 2, 3, 4, 5], 
        'month': ['Jun', 'Jul', 'Aug', 'Sep', 'Oct'], 
        'year': [2016, 2017, 2018, 2019, 2020]
        } 

df = pd.DataFrame(data)
print(df)

This is what the DataFrame would look like:

   day month  year
0    1   Jun  2016
1    2   Jul  2017
2    3   Aug  2018
3    4   Sep  2019
4    5   Oct  2020

Next, apply the following syntax to perform the concatenation (using '-' as a separator):

df['full_date'] = df['day'].map(str) + '-' + df['month'].map(str) + '-' + df['year'].map(str)

So your complete Python code would look like this:

import pandas as pd 

data = {'day': [1, 2, 3, 4, 5], 
        'month': ['Jun', 'Jul', 'Aug', 'Sep', 'Oct'], 
        'year': [2016, 2017, 2018, 2019, 2020]
        } 

df = pd.DataFrame(data) 

df['full_date'] = df['day'].map(str) + '-' + df['month'].map(str) + '-' + df['year'].map(str)
print(df)

Run the code, and you’ll get the concatenated full date in the new full_date column:

   day month  year   full_date
0    1   Jun  2016  1-Jun-2016
1    2   Jul  2017  2-Jul-2017
2    3   Aug  2018  3-Aug-2018
3    4   Sep  2019  4-Sep-2019
4    5   Oct  2020  5-Oct-2020
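
If you prefer not to repeat the separator between every pair of columns, one possible alternative (not part of the approach above) is the `Series.str.cat` method, which joins several Series with a single `sep` argument. The sketch below assumes the same dataset as in this example:

```python
import pandas as pd

df = pd.DataFrame({'day': [1, 2, 3, 4, 5],
                   'month': ['Jun', 'Jul', 'Aug', 'Sep', 'Oct'],
                   'year': [2016, 2017, 2018, 2019, 2020]})

# str.cat joins the calling Series with the listed Series,
# placing the separator between each pair of values.
# The numeric columns still have to be converted to strings first.
df['full_date'] = df['day'].astype(str).str.cat(
    [df['month'], df['year'].astype(str)], sep='-')

print(df)
```

The resulting full_date column should match the one produced with the plus symbol.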

Example 2: Concatenating column values from two separate DataFrames

Now you’ll see how to concatenate the column values from two separate DataFrames.

In the previous example, you saw how to create the first DataFrame based on this data:

day month year
1 Jun 2016
2 Jul 2017
3 Aug 2018
4 Sep 2019
5 Oct 2020

Let’s now create a second DataFrame based on the data below:

unemployment_rate interest_rate
5.5 1.75
5 1.5
5.2 1.25
5.1 1.5
4.9 2

The goal is to concatenate the values from the two DataFrames as follows:

day-month-year: Unemployment: unemployment_rate; Interest: interest_rate

To accomplish this goal, you may apply the following Python code:

import pandas as pd

data1 = {'day': [1, 2, 3, 4, 5],
         'month': ['Jun', 'Jul', 'Aug', 'Sep', 'Oct'],
         'year': [2016, 2017, 2018, 2019, 2020]
         }

df1 = pd.DataFrame(data1)

data2 = {'unemployment_rate': [5.5, 5, 5.2, 5.1, 4.9],
         'interest_rate': [1.75, 1.5, 1.25, 1.5, 2]
         }

df2 = pd.DataFrame(data2)

# Parentheses let the expression span several lines without backslashes
combined_values = (df1['day'].map(str) + '-' + df1['month'].map(str) + '-' + df1['year'].map(str) + ': '
                   + 'Unemployment: ' + df2['unemployment_rate'].map(str) + '; '
                   + 'Interest: ' + df2['interest_rate'].map(str))

print(combined_values)

And once you run the Python code, you’ll get the following result:

0    1-Jun-2016: Unemployment: 5.5; Interest: 1.75
1     2-Jul-2017: Unemployment: 5.0; Interest: 1.5
2    3-Aug-2018: Unemployment: 5.2; Interest: 1.25
3     4-Sep-2019: Unemployment: 5.1; Interest: 1.5
4     5-Oct-2020: Unemployment: 4.9; Interest: 2.0
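
One point to keep in mind when combining columns from two DataFrames: pandas pairs the values by index label, not by row position. The example above works because both DataFrames share the default index 0 to 4. The sketch below uses made-up data (the `month` and `rate` columns and the shifted index are assumptions for illustration) to show what happens when the indexes differ, and how `reset_index(drop=True)` restores a positional match:

```python
import pandas as pd

df1 = pd.DataFrame({'month': ['Jun', 'Jul', 'Aug']})

# Hypothetical second DataFrame whose index starts at 1 instead of 0
df2 = pd.DataFrame({'rate': [5.5, 5.0, 5.2]}, index=[1, 2, 3])

# Labels 0 and 3 exist in only one DataFrame, so those rows come out as missing values
misaligned = df1['month'] + ': ' + df2['rate'].map(str)

# Resetting the index of df2 makes the rows line up by position again
aligned = df1['month'] + ': ' + df2['rate'].reset_index(drop=True).map(str)

print(misaligned)
print(aligned)
```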

Example 3: Concatenating values, and then finding the maximum value

In the last example, you’ll see how to concatenate the 2 DataFrames below (which contain only numeric values), and then find the maximum value.

The purpose of this exercise is to demonstrate that you can apply different arithmetic/statistical operations after you have concatenated the values from 2 separate DataFrames.

The 1st DataFrame would contain this set of numbers:

data1 = {'set1': [55, 22, 11, 77, 33]} 
df1 = pd.DataFrame(data1) 

While the 2nd DataFrame would contain this set of numbers:

data2 = {'set2': [23, 45, 21, 73, 48]} 
df2 = pd.DataFrame(data2)

You can then concatenate these 2 DataFrames, and then find the maximum value by using the code below:

import pandas as pd 

data1 = {'set1': [55, 22, 11, 77, 33]} 
df1 = pd.DataFrame(data1) 

data2 = {'set2': [23, 45, 21, 73, 48]} 
df2 = pd.DataFrame(data2)

# Join the values as strings, then convert back to integers so that
# max() compares numbers rather than text
concatenated = (df1['set1'].map(str) + df2['set2'].map(str)).astype(int)

max_value = concatenated.max()

print(max_value)

And the result that you’ll get is 7773, which is indeed the maximum value:

7773
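
One subtlety is worth checking whenever you take the maximum of concatenated values: values joined with map(str) are strings, and strings are compared character by character rather than by magnitude. The sketch below uses made-up numbers (not the sets above) where the two kinds of comparison disagree, and uses `pd.to_numeric` to convert the strings back to numbers:

```python
import pandas as pd

# Hypothetical data chosen so that text order and numeric order differ
df1 = pd.DataFrame({'set1': [9, 10]})
df2 = pd.DataFrame({'set2': [1, 2]})

# Produces the strings '91' and '102'
concatenated = df1['set1'].map(str) + df2['set2'].map(str)

# As text, '91' ranks above '102' because '9' comes after '1'
string_max = concatenated.max()

# As numbers, 102 is larger than 91
numeric_max = pd.to_numeric(concatenated).max()

print(string_max, numeric_max)
```

When all concatenated values have the same number of digits, as with the two sets above, both comparisons pick the same value.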

To learn more about Pandas DataFrame, you may check the Pandas Documentation.