Here are 4 ways to find all columns that contain NaN values in a Pandas DataFrame:
(1) Use isna() to find all columns with NaN values:
df.isna().any()
(2) Use isnull() to find all columns with NaN values:
df.isnull().any()
(3) Use isna() to select all columns with NaN values:
df[df.columns[df.isna().any()]]
(4) Use isnull() to select all columns with NaN values:
df[df.columns[df.isnull().any()]]
Steps to Find all Columns with NaN Values in Pandas DataFrame
Step 1: Create a DataFrame
For example, let’s create a DataFrame with 4 columns:
import pandas as pd
import numpy as np

data = {'Column_A': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan],
        'Column_B': [11, 22, 33, 44, 55, 66, 77, 88, 99],
        'Column_C': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, 'f'],
        'Column_D': ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii']
        }

df = pd.DataFrame(data)

print(df)
Notice that some of the columns in the DataFrame contain NaN values:
   Column_A  Column_B Column_C Column_D
0       1.0        11        a       aa
1       2.0        22        b       bb
2       3.0        33      NaN       cc
3       4.0        44      NaN       dd
4       5.0        55        c       ee
5       NaN        66        d       ff
6       6.0        77        e       gg
7       7.0        88      NaN       hh
8       NaN        99        f       ii
Step 2: Find all Columns with NaN Values in Pandas DataFrame
You can use isna() to find all the columns with the NaN values:
df.isna().any()
For our example:
import pandas as pd
import numpy as np

data = {'Column_A': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan],
        'Column_B': [11, 22, 33, 44, 55, 66, 77, 88, 99],
        'Column_C': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, 'f'],
        'Column_D': ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii']
        }

df = pd.DataFrame(data)

# One boolean per column: True if the column holds at least one NaN
nan_values = df.isna().any()

print(nan_values)
As you can see, the outcome is ‘True’ for both ‘Column_A’ and ‘Column_C’, which means that those two columns contain NaN values:
Column_A     True
Column_B    False
Column_C     True
Column_D    False
dtype: bool
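If you only need the names of those columns rather than the full boolean Series, one possible approach is to filter the Series by its own values and read its index. This is a minimal sketch built on the same example data; the variable names has_nan and nan_columns are illustrative and not part of the steps above.

```python
import numpy as np
import pandas as pd

data = {
    'Column_A': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan],
    'Column_B': [11, 22, 33, 44, 55, 66, 77, 88, 99],
    'Column_C': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, 'f'],
    'Column_D': ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii'],
}
df = pd.DataFrame(data)

# Boolean Series indexed by column name (illustrative variable name)
has_nan = df.isna().any()

# Keep only the True entries and read their labels as a plain list
nan_columns = has_nan[has_nan].index.tolist()

print(nan_columns)  # ['Column_A', 'Column_C']
```

A plain list like this is convenient when the column names need to be passed to another function or logged.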
Alternatively, you’ll get the same results using isnull():
df.isnull().any()
Here is the complete code:
import pandas as pd
import numpy as np

data = {'Column_A': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan],
        'Column_B': [11, 22, 33, 44, 55, 66, 77, 88, 99],
        'Column_C': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, 'f'],
        'Column_D': ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii']
        }

df = pd.DataFrame(data)

# Same check as before, this time using isnull()
nan_values = df.isnull().any()

print(nan_values)
As before, both ‘Column_A’ and ‘Column_C’ contain NaN values:
Column_A     True
Column_B    False
Column_C     True
Column_D    False
dtype: bool
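The two results match because pandas documents isnull() as an alias of isna(). As a quick sanity check on the example data, you could compare the two boolean masks directly. This is only a sketch; same_mask is an illustrative variable name.

```python
import numpy as np
import pandas as pd

data = {
    'Column_A': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan],
    'Column_B': [11, 22, 33, 44, 55, 66, 77, 88, 99],
    'Column_C': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, 'f'],
    'Column_D': ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii'],
}
df = pd.DataFrame(data)

# Compare the cell-by-cell masks produced by the two methods
same_mask = df.isna().equals(df.isnull())

print(same_mask)  # True
```

Since both methods behave the same way, the choice between them is mostly a matter of style.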
Select all Columns with NaN Values in Pandas DataFrame
What if you’d like to select all the columns with the NaN values?
In that case, you can build a boolean mask with isna().any(), use it to pick the matching column names from df.columns, and then index the DataFrame with those names:
df[df.columns[df.isna().any()]]
The complete Python code would then be:
import pandas as pd
import numpy as np

data = {'Column_A': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan],
        'Column_B': [11, 22, 33, 44, 55, 66, 77, 88, 99],
        'Column_C': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, 'f'],
        'Column_D': ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii']
        }

df = pd.DataFrame(data)

# Keep only the columns that hold at least one NaN
nan_values = df[df.columns[df.isna().any()]]

print(nan_values)
You’ll now get the two complete columns that contain NaN values:
   Column_A Column_C
0       1.0        a
1       2.0        b
2       3.0      NaN
3       4.0      NaN
4       5.0        c
5       NaN        d
6       6.0        e
7       7.0      NaN
8       NaN        f
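The same selection can also be written with .loc, passing the boolean Series as the column indexer. The sketch below assumes the same example data; nan_mask and selected are illustrative names, and this form is offered as an alternative rather than a replacement for the approach above.

```python
import numpy as np
import pandas as pd

data = {
    'Column_A': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan],
    'Column_B': [11, 22, 33, 44, 55, 66, 77, 88, 99],
    'Column_C': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, 'f'],
    'Column_D': ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii'],
}
df = pd.DataFrame(data)

# True for every column that holds at least one NaN
nan_mask = df.isna().any()

# Select all rows (:) and only the columns where the mask is True
selected = df.loc[:, nan_mask]

print(selected)
```

Both forms should return the same two columns; .loc simply skips the intermediate lookup in df.columns.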
Optionally, you can use isnull() to get the same results:
import pandas as pd
import numpy as np

data = {'Column_A': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan],
        'Column_B': [11, 22, 33, 44, 55, 66, 77, 88, 99],
        'Column_C': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, 'f'],
        'Column_D': ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii']
        }

df = pd.DataFrame(data)

# Same selection as before, this time using isnull()
nan_values = df[df.columns[df.isnull().any()]]

print(nan_values)
Run the code, and you’ll get the same two columns with the NaN values:
   Column_A Column_C
0       1.0        a
1       2.0        b
2       3.0      NaN
3       4.0      NaN
4       5.0        c
5       NaN        d
6       6.0        e
7       7.0      NaN
8       NaN        f
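Once those columns are selected, a natural follow-up is to count how many NaN values each one holds. One way to sketch this, assuming the same example data, is to chain isna() with sum(), since each True is counted as 1. The names nan_columns_df and nan_counts are illustrative.

```python
import numpy as np
import pandas as pd

data = {
    'Column_A': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan],
    'Column_B': [11, 22, 33, 44, 55, 66, 77, 88, 99],
    'Column_C': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, 'f'],
    'Column_D': ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii'],
}
df = pd.DataFrame(data)

# Select only the columns that hold at least one NaN
nan_columns_df = df[df.columns[df.isna().any()]]

# Count the NaN values in each selected column
nan_counts = nan_columns_df.isna().sum()

print(nan_counts)
```

For this data, the counts should be 2 for Column_A and 3 for Column_C, matching the NaN cells visible in the printed DataFrame.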
You can visit the Pandas Documentation to learn more about isna.