How to Concatenate Column Values in Pandas DataFrame

In this short guide, you’ll see how to concatenate column values in a Pandas DataFrame.

To start, you may use this template to concatenate your column values (for strings only):

df['New Column Name'] = df['1st Column Name'] + df['2nd Column Name'] + ...

Notice that the plus symbol (‘+’) is used to perform the concatenation.
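As a minimal illustration of the template with string-only columns, the sketch below uses made-up column names and values (First Name, Last Name) that are not part of this guide’s dataset:

```python
import pandas as pd

# Hypothetical sample data: both columns already hold strings
df = pd.DataFrame({'First Name': ['Ada', 'Alan'],
                   'Last Name': ['Lovelace', 'Turing']})

# The + operator joins the two columns row by row; ' ' is an optional separator
df['Full Name'] = df['First Name'] + ' ' + df['Last Name']

print(df['Full Name'].tolist())
```

Each row of the new column holds the first name, a space, and the last name.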

Also note that if your dataset mixes data types (for example, integers and strings), the above template raises a TypeError. The exact wording depends on your pandas and NumPy versions, but it may look like this:

TypeError: ufunc 'add' did not contain a loop with signature matching types

You can bypass this error by mapping the values to strings using the following syntax:

df['New Column Name'] = df['1st Column Name'].map(str) + df['2nd Column Name'].map(str) + ...
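To make the mapping approach concrete, here is a small sketch with one integer column and one string column. The column names (Item ID, Size, Label) and the values are assumptions made purely for illustration:

```python
import pandas as pd

# Hypothetical sample data: one integer column and one string column
df = pd.DataFrame({'Item ID': [101, 102], 'Size': ['S', 'XL']})

# map(str) converts each value to its string form before joining
df['Label'] = df['Item ID'].map(str) + '-' + df['Size'].map(str)

print(df['Label'].tolist())
```

Mapping the string column as well is harmless, and it keeps the expression uniform when you are not sure which columns hold text.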

Next, you’ll see the following 3 examples that demonstrate how to concatenate column values in Pandas DataFrame:

  • Example 1: Concatenating values under a single DataFrame
  • Example 2: Concatenating column values from two separate DataFrames
  • Example 3: Concatenating values, and then finding the maximum value

Example 1: Concatenating values under a single DataFrame

Let’s say that you have the following dataset which contains 3 columns:

Day   Month   Year
1     Jun     2016
2     Jul     2017
3     Aug     2018
4     Sep     2019
5     Oct     2020

The goal is to concatenate the column values as captured below:

Day-Month-Year

To begin, you’ll need to create a DataFrame to capture the above values in Python. You may use the following code to create the DataFrame:

import pandas as pd

data = {'Day': [1, 2, 3, 4, 5],
        'Month': ['Jun', 'Jul', 'Aug', 'Sep', 'Oct'],
        'Year': [2016, 2017, 2018, 2019, 2020]}

df = pd.DataFrame(data, columns=['Day', 'Month', 'Year'])
print(df)

This is what the DataFrame looks like:

   Day Month  Year
0    1   Jun  2016
1    2   Jul  2017
2    3   Aug  2018
3    4   Sep  2019
4    5   Oct  2020

Next, apply the following syntax to perform the concatenation (using '-' as a separator):

df['Full Date'] = df['Day'].map(str) + '-' + df['Month'].map(str) + '-' + df['Year'].map(str)

So your complete Python code would look like this:

import pandas as pd

data = {'Day': [1, 2, 3, 4, 5],
        'Month': ['Jun', 'Jul', 'Aug', 'Sep', 'Oct'],
        'Year': [2016, 2017, 2018, 2019, 2020]}

df = pd.DataFrame(data, columns=['Day', 'Month', 'Year'])

df['Full Date'] = df['Day'].map(str) + '-' + df['Month'].map(str) + '-' + df['Year'].map(str)
print(df)

Run the code, and you’ll get the concatenated full date in the new 'Full Date' column:

   Day Month  Year   Full Date
0    1   Jun  2016  1-Jun-2016
1    2   Jul  2017  2-Jul-2017
2    3   Aug  2018  3-Aug-2018
3    4   Sep  2019  4-Sep-2019
4    5   Oct  2020  5-Oct-2020
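As an alternative to chaining the plus symbol, pandas also offers the str.cat method, which takes the separator once instead of repeating it between every pair of columns. The sketch below is one possible way to build the same kind of date string on a shortened copy of the data. It uses astype(str) in place of map(str), which is a substitution made here for illustration rather than the approach used in this guide:

```python
import pandas as pd

df = pd.DataFrame({'Day': [1, 2, 3],
                   'Month': ['Jun', 'Jul', 'Aug'],
                   'Year': [2016, 2017, 2018]})

# astype(str) plays the same role as map(str) here;
# str.cat joins the calling column with the others, inserting sep between them
df['Full Date'] = df['Day'].astype(str).str.cat(
    [df['Month'], df['Year'].astype(str)], sep='-')

print(df['Full Date'].tolist())
```

This form can be easier to read when many columns share the same separator.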

Example 2: Concatenating column values from two separate DataFrames

Now you’ll see how to concatenate the column values from two separate DataFrames.

In the previous example, you saw how to create the first DataFrame based on this data:

Day   Month   Year
1     Jun     2016
2     Jul     2017
3     Aug     2018
4     Sep     2019
5     Oct     2020

Let’s now create a second DataFrame based on the data below:

Unemployment Rate   Interest Rate
5.5                 1.75
5                   1.5
5.2                 1.25
5.1                 1.5
4.9                 2

The goal is to concatenate the values from the two DataFrames as follows:

Day-Month-Year: Unemployment: Unemployment Rate; Interest: Interest Rate

To accomplish this goal, you may apply the following Python code:

import pandas as pd

data1 = {'Day': [1, 2, 3, 4, 5],
         'Month': ['Jun', 'Jul', 'Aug', 'Sep', 'Oct'],
         'Year': [2016, 2017, 2018, 2019, 2020]}

df1 = pd.DataFrame(data1, columns=['Day', 'Month', 'Year'])

data2 = {'Unemployment Rate': [5.5, 5, 5.2, 5.1, 4.9],
         'Interest Rate': [1.75, 1.5, 1.25, 1.5, 2]}

df2 = pd.DataFrame(data2, columns=['Unemployment Rate', 'Interest Rate'])

# Rows are matched by index label, so df1 and df2 must share the same index
combined_values = (
    df1['Day'].map(str) + '-' + df1['Month'].map(str) + '-' + df1['Year'].map(str)
    + ': ' + 'Unemployment: ' + df2['Unemployment Rate'].map(str)
    + '; ' + 'Interest: ' + df2['Interest Rate'].map(str)
)
print(combined_values)

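And once you run the Python code, you’ll get a Series of combined strings. The first value, for example, is:

1-Jun-2016: Unemployment: 5.5; Interest: 1.75

Note that whole numbers in the rate columns are stored as floats, so they appear as 5.0 and 2.0 in the result.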
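One detail to keep in mind when combining columns from two DataFrames: pandas pairs the rows by index label, not by position. The sketch below uses two small made-up DataFrames (left and right are hypothetical names) to show what can happen when the labels differ, and one possible way to restore position-based pairing:

```python
import pandas as pd

left = pd.DataFrame({'Month': ['Jun', 'Jul', 'Aug']})

# Hypothetical second DataFrame whose index labels start at 1 instead of 0
right = pd.DataFrame({'Rate': ['5.5', '5.0', '5.2']}, index=[1, 2, 3])

# Only labels 1 and 2 exist on both sides; labels 0 and 3 yield missing values
misaligned = left['Month'] + ': ' + right['Rate']

# Resetting the index restores position-by-position pairing
aligned = left['Month'] + ': ' + right['Rate'].reset_index(drop=True)

print(misaligned.isna().sum())
print(aligned.tolist())
```

Both DataFrames in Example 2 use the default index (0 to 4), so their rows line up without any extra step.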

Example 3: Concatenating values, and then finding the maximum value

In the last example, you’ll see how to concatenate the 2 DataFrames below (which would contain only numeric values), and then find the maximum value.

The purpose of this exercise is to demonstrate that you can apply different arithmetic/statistical operations after you have concatenated values from 2 separate DataFrames.

The 1st DataFrame would contain this set of numbers:

data1 = {'Set1': [55,22,11,77,33]} 
df1 = pd.DataFrame(data1, columns= ['Set1']) 

While the 2nd DataFrame would contain this set of numbers:

data2 = {'Set2': [23,45,21,73,48]} 
df2 = pd.DataFrame(data2, columns= ['Set2'])

You can then concatenate these 2 DataFrames, and then find the maximum value by using the code below:

import pandas as pd

data1 = {'Set1': [55, 22, 11, 77, 33]}
df1 = pd.DataFrame(data1, columns=['Set1'])

data2 = {'Set2': [23, 45, 21, 73, 48]}
df2 = pd.DataFrame(data2, columns=['Set2'])

# The joined values are strings, so convert them back to integers
# to make max() compare them numerically
concatenated = (df1['Set1'].map(str) + df2['Set2'].map(str)).astype(int)

combined = pd.DataFrame(concatenated, columns=['Combined Values'])
max1 = combined['Combined Values'].max()

print(max1)

And the result that you’ll get is 7773, which is indeed the largest of the concatenated values (5523, 2245, 1121, 7773 and 3348).
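A caveat worth knowing: values joined with map(str) are strings, and strings are compared character by character rather than by numeric size. When the joined values have different lengths, the two comparisons can disagree. The sketch below uses made-up numbers, chosen only to show the difference, and converts the joined strings with astype(int) before taking the maximum:

```python
import pandas as pd

# Hypothetical values chosen so the two comparisons disagree
left = pd.Series([9, 10])
right = pd.Series([1, 5])

joined = left.map(str) + right.map(str)   # '91' and '105'

text_max = joined.max()                   # compares character by character
numeric_max = joined.astype(int).max()    # compares as numbers

print(text_max, numeric_max)
```

Here the text comparison picks '91' because '9' sorts after '1', while the numeric comparison picks 105.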

To learn more about Pandas DataFrame, you may check the Pandas Documentation.