You can union Pandas DataFrames using concat:
pd.concat([df1, df2])
You may concatenate additional DataFrames by adding them within the brackets.
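As a minimal sketch of passing more than two DataFrames, the snippet below concatenates three small frames. The names df_a, df_b, df_c and the single-row sample data are hypothetical placeholders chosen only for illustration.

```python
import pandas as pd

# Three small, hypothetical DataFrames that share the same column names
df_a = pd.DataFrame({'name': ['Jon'], 'country': ['US']})
df_b = pd.DataFrame({'name': ['Maria'], 'country': ['Canada']})
df_c = pd.DataFrame({'name': ['Bruce'], 'country': ['Italy']})

# Any number of DataFrames can be placed inside the list
combined = pd.concat([df_a, df_b, df_c])

print(combined)
```

The rows appear in the same order as the DataFrames in the list, and each row keeps its original index label.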
In the next section, you’ll see an example with the steps to union Pandas DataFrames using concat.
Steps to Union Pandas DataFrames using Concat
Step 1: Create the first DataFrame
For example, let’s say that you have the following data about your customers:
clientFirstName | clientLastName | country |
Jon | Smith | US |
Maria | Lam | Canada |
Bruce | Jones | Italy |
Lili | Chang | China |
You can then create a DataFrame to capture the above data in Python:
import pandas as pd

clients1 = {'clientFirstName': ['Jon', 'Maria', 'Bruce', 'Lili'],
            'clientLastName': ['Smith', 'Lam', 'Jones', 'Chang'],
            'country': ['US', 'Canada', 'Italy', 'China']
            }

df1 = pd.DataFrame(clients1, columns=['clientFirstName', 'clientLastName', 'country'])

print(df1)
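Run the code in Python and you would get:
  clientFirstName clientLastName country
0             Jon          Smith      US
1           Maria            Lam  Canada
2           Bruce          Jones   Italy
3            Lili          Chang   China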
Step 2: Create the second DataFrame
Now suppose that you received additional data about new customers:
clientFirstName | clientLastName | country |
Bill | Jackson | UK |
Jack | Green | Germany |
Elizabeth | Gross | Brazil |
Jenny | Sing | Japan |
You can then create the second DataFrame as follows:
import pandas as pd

clients2 = {'clientFirstName': ['Bill', 'Jack', 'Elizabeth', 'Jenny'],
            'clientLastName': ['Jackson', 'Green', 'Gross', 'Sing'],
            'country': ['UK', 'Germany', 'Brazil', 'Japan']
            }

df2 = pd.DataFrame(clients2, columns=['clientFirstName', 'clientLastName', 'country'])

print(df2)
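Run the code, and you’ll see:
  clientFirstName clientLastName  country
0            Bill        Jackson       UK
1            Jack          Green  Germany
2       Elizabeth          Gross   Brazil
3           Jenny           Sing    Japan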
Your goal is to union those two DataFrames. You can use Pandas concat to accomplish this.
Step 3: Union Pandas DataFrames using Concat
Finally, to union the two Pandas DataFrames together, you can apply the generic syntax that you saw at the beginning of this guide:
pd.concat([df1, df2])
And here is the complete Python code to union Pandas DataFrames using concat:
import pandas as pd

clients1 = {'clientFirstName': ['Jon', 'Maria', 'Bruce', 'Lili'],
            'clientLastName': ['Smith', 'Lam', 'Jones', 'Chang'],
            'country': ['US', 'Canada', 'Italy', 'China']
            }

df1 = pd.DataFrame(clients1, columns=['clientFirstName', 'clientLastName', 'country'])

clients2 = {'clientFirstName': ['Bill', 'Jack', 'Elizabeth', 'Jenny'],
            'clientLastName': ['Jackson', 'Green', 'Gross', 'Sing'],
            'country': ['UK', 'Germany', 'Brazil', 'Japan']
            }

df2 = pd.DataFrame(clients2, columns=['clientFirstName', 'clientLastName', 'country'])

union = pd.concat([df1, df2])

print(union)
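Once you run the code, you’ll get the concatenated DataFrames:
  clientFirstName clientLastName  country
0             Jon          Smith       US
1           Maria            Lam   Canada
2           Bruce          Jones    Italy
3            Lili          Chang    China
0            Bill        Jackson       UK
1            Jack          Green  Germany
2       Elizabeth          Gross   Brazil
3           Jenny           Sing    Japan
Notice that the index values keep repeating themselves (from 0 to 3 for the first DataFrame, and then from 0 to 3 for the second DataFrame).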
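To see why repeated index values can matter, here is a small sketch using shortened, two-row versions of the DataFrames (trimmed for brevity, so the data is illustrative rather than the full example). It checks whether the index is unique and shows that a label-based lookup returns more than one row.

```python
import pandas as pd

# Shortened versions of the two client DataFrames (illustrative data only)
df1 = pd.DataFrame({'clientFirstName': ['Jon', 'Maria'],
                    'country': ['US', 'Canada']})
df2 = pd.DataFrame({'clientFirstName': ['Bill', 'Jack'],
                    'country': ['UK', 'Germany']})

union = pd.concat([df1, df2])

# Each DataFrame contributes its own 0..n-1 labels, so the labels repeat
labels = union.index.tolist()
has_unique_index = union.index.is_unique

# Selecting by label 0 returns one row from each original DataFrame
rows_labeled_0 = union.loc[0]

print(labels)
print(has_unique_index)
print(rows_labeled_0)
```

With duplicated labels, union.loc[0] returns a two-row DataFrame rather than a single row, which can surprise code that expects one match per label.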
You may then choose to assign the index values in an incremental manner once you have concatenated the two DataFrames.
To do so, simply set ignore_index=True within the pd.concat parentheses:
import pandas as pd

clients1 = {'clientFirstName': ['Jon', 'Maria', 'Bruce', 'Lili'],
            'clientLastName': ['Smith', 'Lam', 'Jones', 'Chang'],
            'country': ['US', 'Canada', 'Italy', 'China']
            }

df1 = pd.DataFrame(clients1, columns=['clientFirstName', 'clientLastName', 'country'])

clients2 = {'clientFirstName': ['Bill', 'Jack', 'Elizabeth', 'Jenny'],
            'clientLastName': ['Jackson', 'Green', 'Gross', 'Sing'],
            'country': ['UK', 'Germany', 'Brazil', 'Japan']
            }

df2 = pd.DataFrame(clients2, columns=['clientFirstName', 'clientLastName', 'country'])

union = pd.concat([df1, df2], ignore_index=True)

print(union)
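And the result:
  clientFirstName clientLastName  country
0             Jon          Smith       US
1           Maria            Lam   Canada
2           Bruce          Jones    Italy
3            Lili          Chang    China
4            Bill        Jackson       UK
5            Jack          Green  Germany
6       Elizabeth          Gross   Brazil
7           Jenny           Sing    Japan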
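That’s it! The same method works when you have more than 2 DataFrames. Note that concat aligns the DataFrames by column name, so you’ll need to keep the same column names across all the DataFrames. Any column that is missing from one of the DataFrames is filled with NaN values for that DataFrame’s rows.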
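The following sketch illustrates the column-name point with a deliberately mismatched column. The capitalized 'Country' name is a hypothetical mistake introduced for this example, and renaming the column before concatenating is one possible fix among several.

```python
import pandas as pd

df1 = pd.DataFrame({'clientFirstName': ['Jon'], 'country': ['US']})

# Hypothetical mismatch: 'Country' (capital C) instead of 'country'
df2 = pd.DataFrame({'clientFirstName': ['Bill'], 'Country': ['UK']})

# The two spellings are treated as different columns, so each DataFrame's
# rows get NaN in the column they do not have
mismatched = pd.concat([df1, df2], ignore_index=True)
print(mismatched)

# One possible fix: rename the column so the names match before concatenating
df2_fixed = df2.rename(columns={'Country': 'country'})
aligned = pd.concat([df1, df2_fixed], ignore_index=True)
print(aligned)
```

In the mismatched result there are three columns instead of two, while the aligned result keeps the original two columns with no missing values.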
For additional information about concatenating DataFrames, please visit the Pandas.concat documentation.
You may also want to check the following tutorial that explains how to concatenate column values using Pandas.