How to Compare Values Between Two pandas DataFrames
In this tutorial, you will learn how to compare values between two DataFrames.
Step-by-Step Example
Let's say you have data on the fish caught by two fishing boats:
import pandas as pd
boat1 = {
    'fish': ['salmon', 'pufferfish', 'shark'],
    'count': [99, 33, 11]
}
boat2 = {
    'fish': ['salmon', 'pufferfish'],
    'count': [88, 22]
}
df1 = pd.DataFrame(boat1)
df2 = pd.DataFrame(boat2)
print(df1)
print(df2)
df1:
         fish  count
0      salmon     99
1  pufferfish     33
2       shark     11
df2:
         fish  count
0      salmon     88
1  pufferfish     22
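To compare them, first merge the two DataFrames on the 'fish' column. An inner merge keeps only the fish that appear in both DataFrames, so shark, which only boat 1 caught, is dropped: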
df = pd.merge(df1, df2, how='inner', on='fish',
              suffixes=('_boat1', '_boat2'))
print(df)
         fish  count_boat1  count_boat2
0      salmon           99           88
1  pufferfish           33           22
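The merged result has no row for shark, because only one boat caught it. If you also want to see such rows, one option is an outer merge with indicator=True. The sketch below rebuilds the same two DataFrames so it runs on its own; the names df_all and only_one_boat are made up for this illustration and are not part of the original example.

```python
import pandas as pd

df1 = pd.DataFrame({'fish': ['salmon', 'pufferfish', 'shark'],
                    'count': [99, 33, 11]})
df2 = pd.DataFrame({'fish': ['salmon', 'pufferfish'],
                    'count': [88, 22]})

# how='outer' keeps every fish from either boat.
# indicator=True adds a '_merge' column with the values
# 'both', 'left_only' or 'right_only' for each row.
df_all = pd.merge(df1, df2, how='outer', on='fish',
                  suffixes=('_boat1', '_boat2'), indicator=True)
print(df_all)

# Rows that were found in only one of the two DataFrames
# (hypothetical helper variable for this sketch).
only_one_boat = df_all[df_all['_merge'] != 'both']
print(only_one_boat)
```

A count that is missing on one side shows up as NaN, so the count columns are converted to floats in the outer-merged result.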
You can then compute the difference in counts for each fish:
df['count_diff'] = df['count_boat1'] - df['count_boat2']
print(df)
         fish  count_boat1  count_boat2  count_diff
0      salmon           99           88          11
1  pufferfish           33           22          11
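As an alternative sketch, you may be able to skip the merge by using 'fish' as the index and subtracting the two count columns directly, since pandas aligns Series on their index labels. The names s1, s2 and diff are made up for this illustration, and treating a fish that is missing from one boat as a count of 0 is an assumption that may not suit your data.

```python
import pandas as pd

df1 = pd.DataFrame({'fish': ['salmon', 'pufferfish', 'shark'],
                    'count': [99, 33, 11]})
df2 = pd.DataFrame({'fish': ['salmon', 'pufferfish'],
                    'count': [88, 22]})

# Use the fish name as the index so the subtraction lines up by fish,
# not by row position.
s1 = df1.set_index('fish')['count']
s2 = df2.set_index('fish')['count']

# Series.sub aligns on the index; fill_value=0 (an assumption) treats
# a fish that one boat did not catch as a count of 0 instead of NaN.
diff = s1.sub(s2, fill_value=0)
print(diff)
```

Without fill_value, the entry for shark would be NaN, which can be preferable when a missing row means "unknown" rather than "zero".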
That's it! You just learned how to compare two DataFrames.