Here are 4 ways to select all rows with NaN values in a Pandas DataFrame:
(1) Using isna() to select all rows with NaN under a single DataFrame column:
df[df['column name'].isna()]
(2) Using isnull() to select all rows with NaN under a single DataFrame column:
df[df['column name'].isnull()]
(3) Using isna() to select all rows with NaN under an entire DataFrame:
df[df.isna().any(axis=1)]
(4) Using isnull() to select all rows with NaN under an entire DataFrame:
df[df.isnull().any(axis=1)]
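As a quick sanity check, the sketch below runs all four patterns on a tiny DataFrame and confirms that the isna() and isnull() variants return identical rows. The column names 'a' and 'b' and their values are hypothetical, chosen only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names 'a' and 'b' are illustrative only
df = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': ['x', 'y', np.nan]})

# (1) and (2): rows where column 'a' is NaN
single_isna = df[df['a'].isna()]
single_isnull = df[df['a'].isnull()]

# (3) and (4): rows where at least one column is NaN
any_isna = df[df.isna().any(axis=1)]
any_isnull = df[df.isnull().any(axis=1)]

print(single_isna.index.tolist())        # [1]
print(any_isna.index.tolist())           # [1, 2]
print(single_isna.equals(single_isnull)) # True
print(any_isna.equals(any_isnull))       # True
```

In pandas, isnull() is an alias of isna(), so the choice between them is a matter of style.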
Next, you’ll see a few examples with the steps to apply the above syntax in practice.
Steps to select all rows with NaN values in a Pandas DataFrame
Step 1: Create a DataFrame
To start with a simple example, let’s create a DataFrame with two sets of values:
- Numeric values with NaN
- String/text values with NaN
Here is the code to create the DataFrame in Python:
import pandas as pd
import numpy as np

data = {
    'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i'],
}

df = pd.DataFrame(data, columns=['first_set', 'second_set'])

print(df)
As you can see, there are two columns that contain NaN values:
    first_set second_set
0         1.0          a
1         2.0          b
2         3.0        NaN
3         4.0        NaN
4         5.0          c
5         NaN          d
6         6.0          e
7         7.0        NaN
8         NaN        NaN
9         NaN          f
10        8.0          g
11        9.0        NaN
12       10.0          h
13        NaN          i
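To confirm how many NaN values each column holds before selecting rows, one option is to sum the boolean result of isna() per column. This is a minimal sketch that rebuilds the same DataFrame; the variable name nan_counts is an assumption introduced here:

```python
import numpy as np
import pandas as pd

data = {
    'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i'],
}
df = pd.DataFrame(data, columns=['first_set', 'second_set'])

# isna() marks each cell True/False; sum() counts the True values per column
nan_counts = df.isna().sum()

print(nan_counts['first_set'])   # 4
print(nan_counts['second_set'])  # 5
```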
The goal is to select all rows with NaN values under the 'first_set' column. Later, you’ll also see how to get the rows with NaN values anywhere in the DataFrame.
Step 2: Select all rows with NaN under a single DataFrame column
You may use the isna() approach to select the NaNs:
df[df['column name'].isna()]
Here is the complete code for our example:
import pandas as pd
import numpy as np

data = {
    'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i'],
}

df = pd.DataFrame(data, columns=['first_set', 'second_set'])

# Keep only the rows where 'first_set' is NaN
nan_values = df[df['first_set'].isna()]

print(nan_values)
You’ll now see all the rows with NaN values under the 'first_set' column:
    first_set second_set
5         NaN          d
8         NaN        NaN
9         NaN          f
13        NaN          i
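It may help to see the selection in two steps: isna() first produces a boolean Series (a mask), and indexing the DataFrame with that mask keeps the rows marked True. The sketch below illustrates this; the names mask and nan_rows are introduced here for illustration, and df.loc[mask] is shown as an equivalent, more explicit spelling:

```python
import numpy as np
import pandas as pd

data = {
    'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i'],
}
df = pd.DataFrame(data, columns=['first_set', 'second_set'])

# Step 1: build a boolean mask, True where 'first_set' is NaN
mask = df['first_set'].isna()
print(mask[mask].index.tolist())  # [5, 8, 9, 13]

# Step 2: use the mask to filter rows; df[mask] and df.loc[mask] are equivalent here
nan_rows = df.loc[mask]
print(nan_rows.equals(df[mask]))  # True
```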
Since isnull() is an alias of isna(), you’ll get the same results using isnull():
df[df['column name'].isnull()]
And here is the complete code:
import pandas as pd
import numpy as np

data = {
    'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i'],
}

df = pd.DataFrame(data, columns=['first_set', 'second_set'])

# Keep only the rows where 'first_set' is NaN
nan_values = df[df['first_set'].isnull()]

print(nan_values)
As before, you’ll get the rows with the NaNs under the 'first_set' column:
    first_set second_set
5         NaN          d
8         NaN        NaN
9         NaN          f
13        NaN          i
Step 3: Select all rows with NaN under the entire DataFrame
To find all rows with NaN under the entire DataFrame, you may apply this syntax:
df[df.isna().any(axis=1)]
For our example:
import pandas as pd
import numpy as np

data = {
    'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i'],
}

df = pd.DataFrame(data, columns=['first_set', 'second_set'])

# Keep the rows where at least one column is NaN
nan_values = df[df.isna().any(axis=1)]

print(nan_values)
Once you run the code, you’ll get all the rows that contain a NaN in any column (i.e., under the 'first_set' column, the 'second_set' column, or both):
    first_set second_set
2         3.0        NaN
3         4.0        NaN
5         NaN          d
7         7.0        NaN
8         NaN        NaN
9         NaN          f
11        9.0        NaN
13        NaN          i
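The any(axis=1) call is what turns the cell-level True/False values into one value per row: a row is kept if at least one of its cells is NaN. For contrast, the sketch below swaps in all(axis=1), which keeps only rows where every cell is NaN. The variable names any_nan and all_nan are introduced here for illustration:

```python
import numpy as np
import pandas as pd

data = {
    'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i'],
}
df = pd.DataFrame(data, columns=['first_set', 'second_set'])

# any(axis=1): True for a row if at least one column in that row is NaN
any_nan = df[df.isna().any(axis=1)]
print(any_nan.index.tolist())  # [2, 3, 5, 7, 8, 9, 11, 13]

# all(axis=1): True for a row only if every column in that row is NaN
all_nan = df[df.isna().all(axis=1)]
print(all_nan.index.tolist())  # [8]
```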
Alternatively, you’ll get the same results using isnull():
import pandas as pd
import numpy as np

data = {
    'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i'],
}

df = pd.DataFrame(data, columns=['first_set', 'second_set'])

# Keep the rows where at least one column is NaN
nan_values = df[df.isnull().any(axis=1)]

print(nan_values)
Run the code in Python, and you’ll get the following:
    first_set second_set
2         3.0        NaN
3         4.0        NaN
5         NaN          d
7         7.0        NaN
8         NaN        NaN
9         NaN          f
11        9.0        NaN
13        NaN          i
Additional resources:
For additional information, please refer to the Pandas Documentation.