How to Union Pandas DataFrames using Concat

You can union Pandas DataFrames using concat:

pd.concat([df1, df2])

To concatenate additional DataFrames, add them to the list inside the square brackets.
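
As a minimal sketch of passing more than two DataFrames, the example below stacks three small ones in a single call. The names df_a, df_b, df_c and their contents are hypothetical and only serve as an illustration:

```python
import pandas as pd

# Three small, hypothetical DataFrames that share the same column names
df_a = pd.DataFrame({"product": ["computer"], "price": [1200]})
df_b = pd.DataFrame({"product": ["keyboard"], "price": [120]})
df_c = pd.DataFrame({"product": ["webcam"], "price": [80]})

# Pass all the DataFrames in one list; the rows are stacked in list order
combined = pd.concat([df_a, df_b, df_c])

print(combined)
```

The order of the list determines the order of the rows in the result.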

Steps to Union Pandas DataFrames using Concat

Step 1: Create the first DataFrame

For example, let’s say that you have the following DataFrame about products:

import pandas as pd

data1 = {
    "product": ["computer", "tablet", "monitor", "printer"],
    "brand": ["AA", "BB", "CC", "DD"],
    "price": [1200, 350, 500, 150],
}

df1 = pd.DataFrame(data1)

print(df1)

Run the code in Python and you’ll get:

    product brand  price
0  computer    AA   1200
1    tablet    BB    350
2   monitor    CC    500
3   printer    DD    150

Step 2: Create the second DataFrame

Next, create the second DataFrame with data about additional products:

import pandas as pd

data2 = {
    "product": ["keyboard", "mouse", "speakers", "scanner"],
    "brand": ["EE", "FF", "GG", "HH"],
    "price": [120, 50, 200, 180],
}

df2 = pd.DataFrame(data2)

print(df2)

Run the code, and you’ll see:

    product brand  price
0  keyboard    EE    120
1     mouse    FF     50
2  speakers    GG    200
3   scanner    HH    180

Step 3: Union Pandas DataFrames using Concat

Finally, to union the two Pandas DataFrames, you may use:

pd.concat([df1, df2])

Here is the complete Python code to union the Pandas DataFrames using concat. Note that the column names must match across all the DataFrames; otherwise, the result will contain NaN values in any column that one of the DataFrames lacks:

import pandas as pd

data1 = {
    "product": ["computer", "tablet", "monitor", "printer"],
    "brand": ["AA", "BB", "CC", "DD"],
    "price": [1200, 350, 500, 150],
}

df1 = pd.DataFrame(data1)


data2 = {
    "product": ["keyboard", "mouse", "speakers", "scanner"],
    "brand": ["EE", "FF", "GG", "HH"],
    "price": [120, 50, 200, 180],
}

df2 = pd.DataFrame(data2)

union_dfs = pd.concat([df1, df2])
print(union_dfs)

Once you run the code, you’ll get the concatenated DataFrames:

    product brand  price
0  computer    AA   1200
1    tablet    BB    350
2   monitor    CC    500
3   printer    DD    150
0  keyboard    EE    120
1     mouse    FF     50
2  speakers    GG    200
3   scanner    HH    180
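
To illustrate why the column names need to match, here is a small sketch in which one DataFrame uses a different column name. The DataFrames df_left and df_right, and the "cost" column, are hypothetical and not part of the example above:

```python
import pandas as pd

# Hypothetical DataFrames: the second one uses "cost" instead of "price"
df_left = pd.DataFrame({"product": ["computer", "tablet"], "price": [1200, 350]})
df_right = pd.DataFrame({"product": ["keyboard", "mouse"], "cost": [120, 50]})

# concat keeps every column it finds and fills the missing cells with NaN
mismatched = pd.concat([df_left, df_right])
print(mismatched)

# Renaming the column first lines the values up in a single "price" column
aligned = pd.concat([df_left, df_right.rename(columns={"cost": "price"})])
print(aligned)
```

In the first result, both "price" and "cost" appear as columns, each with NaN in the rows that came from the other DataFrame. After the rename, all four prices land in one column.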

Notice that in the concatenated output of df1 and df2, the index values repeat themselves (from 0 to 3 for the first DataFrame, and then from 0 to 3 again for the second DataFrame). This happens because concat keeps the original index of each DataFrame by default.

You may instead assign new incremental index values when you concatenate the two DataFrames.

To do so, set ignore_index=True as an argument of pd.concat:

import pandas as pd

data1 = {
    "product": ["computer", "tablet", "monitor", "printer"],
    "brand": ["AA", "BB", "CC", "DD"],
    "price": [1200, 350, 500, 150],
}

df1 = pd.DataFrame(data1)


data2 = {
    "product": ["keyboard", "mouse", "speakers", "scanner"],
    "brand": ["EE", "FF", "GG", "HH"],
    "price": [120, 50, 200, 180],
}

df2 = pd.DataFrame(data2)

union_dfs = pd.concat([df1, df2], ignore_index=True)
print(union_dfs)

And the result:

    product brand  price
0  computer    AA   1200
1    tablet    BB    350
2   monitor    CC    500
3   printer    DD    150
4  keyboard    EE    120
5     mouse    FF     50
6  speakers    GG    200
7   scanner    HH    180
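
Keep in mind that concat keeps every row from every DataFrame, including rows that appear more than once (similar to UNION ALL in SQL). If you need a union that contains only distinct rows, one possible approach is to chain drop_duplicates after concat. The sketch below uses hypothetical DataFrames, df_x and df_y, that share one identical row:

```python
import pandas as pd

# Hypothetical DataFrames that both contain the same "tablet" row
df_x = pd.DataFrame({"product": ["computer", "tablet"], "brand": ["AA", "BB"]})
df_y = pd.DataFrame({"product": ["tablet", "mouse"], "brand": ["BB", "FF"]})

# concat alone keeps both copies of the "tablet" row
union_all = pd.concat([df_x, df_y], ignore_index=True)

# drop_duplicates removes rows that are identical across all columns,
# and reset_index(drop=True) renumbers the remaining rows from 0
union_distinct = union_all.drop_duplicates().reset_index(drop=True)

print(union_distinct)
```

By default, drop_duplicates keeps the first occurrence of each repeated row, so the order of the remaining rows is preserved.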

For additional information about concatenating DataFrames, please visit the pandas.concat documentation.

You may also want to check the following guide that explains how to concatenate column values using Pandas.
