Select all Rows with NaN Values in Pandas DataFrame

Here are 4 ways to select all rows with NaN values in a Pandas DataFrame:

(1) Using isna() to select all rows with NaN under a single DataFrame column:

df[df['column name'].isna()]

(2) Using isnull() to select all rows with NaN under a single DataFrame column:

df[df['column name'].isnull()]

(3) Using isna() to select all rows with NaN under an entire DataFrame:

df[df.isna().any(axis=1)]

(4) Using isnull() to select all rows with NaN under an entire DataFrame:

df[df.isnull().any(axis=1)]
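In pandas, isnull() is an alias of isna(), so each pair of approaches above should produce the same selection. The short sketch below checks this on a small, made-up DataFrame (the column names 'x' and 'y' are assumptions used only for illustration):

```python
import pandas as pd
import numpy as np

# Hypothetical DataFrame, used only for this illustration
df = pd.DataFrame({'x': [1.0, np.nan, 3.0],
                   'y': ['a', 'b', np.nan]})

# Both methods return a Boolean mask of the same shape as the DataFrame
mask_isna = df.isna()
mask_isnull = df.isnull()

# Compare the two masks element by element
same_mask = mask_isna.equals(mask_isnull)

print(same_mask)
```

Since the two masks match, the choice between isna() and isnull() is mostly a matter of preference.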

Steps to select all rows with NaN values in Pandas DataFrame

Step 1: Create a DataFrame

To start with a simple example, let’s create a DataFrame with two sets of values:

  • Numeric values with NaN
  • String/text values with NaN

Here is the code to create the DataFrame in Python:

import pandas as pd
import numpy as np

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i']
        }

df = pd.DataFrame(data)

print(df)

As you can see, there are two columns that contain NaN values:

    first_set   second_set
0         1.0            a
1         2.0            b
2         3.0          NaN
3         4.0          NaN
4         5.0            c
5         NaN            d
6         6.0            e
7         7.0          NaN
8         NaN          NaN
9         NaN            f
10        8.0            g
11        9.0          NaN
12       10.0            h
13        NaN            i
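For larger DataFrames, where scanning the printed output is not practical, one possible check is to count the NaN values per column. This is a minimal sketch that reuses the same example data; the variable name nan_per_column is just an illustrative choice:

```python
import pandas as pd
import numpy as np

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i']
        }

df = pd.DataFrame(data)

# isna() marks each NaN cell as True; sum() then counts the True values per column
nan_per_column = df.isna().sum()

# Here 'first_set' has 4 NaN values and 'second_set' has 5
print(nan_per_column)
```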

The goal is to select all rows with NaN values under the ‘first_set’ column. Later, you’ll also see how to select the rows with NaN values anywhere in the DataFrame.

Step 2: Select all rows with NaN under a single DataFrame column

You may use isna() to select the rows with NaN under a single column:

df[df['column name'].isna()]

Here is the complete code for our example:

import pandas as pd
import numpy as np

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i']
        }

df = pd.DataFrame(data)

nan_values = df[df['first_set'].isna()]

print(nan_values)

You’ll now see all the rows with NaN values under the ‘first_set’ column:

    first_set   second_set
5         NaN            d
8         NaN          NaN
9         NaN            f
13        NaN            i

You’ll get the same results using isnull(), which is an alias of isna():

df[df['column name'].isnull()]

And here is the complete code:

import pandas as pd
import numpy as np

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i']
        }

df = pd.DataFrame(data)

nan_values = df[df['first_set'].isnull()]

print(nan_values)

As before, you’ll get the rows with NaN values under the ‘first_set’ column:

    first_set   second_set
5         NaN            d
8         NaN          NaN
9         NaN            f
13        NaN            i
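If you need the opposite selection (the rows without NaN under the column), one option is notna(), which returns the inverse mask of isna(). The sketch below reuses the example data; the names nan_rows and non_nan_rows are illustrative:

```python
import pandas as pd
import numpy as np

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i']
        }

df = pd.DataFrame(data)

# Rows where 'first_set' is NaN
nan_rows = df[df['first_set'].isna()]

# Rows where 'first_set' is not NaN (the inverse selection)
non_nan_rows = df[df['first_set'].notna()]

# The two selections together cover all 14 rows: 4 with NaN and 10 without
print(len(nan_rows), len(non_nan_rows))
```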

Step 3: Select all rows with NaN under the entire DataFrame

To find all rows with NaN under the entire DataFrame, you may apply this syntax:

df[df.isna().any(axis=1)]

For our example:

import pandas as pd
import numpy as np

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i']
        }

df = pd.DataFrame(data)

nan_values = df[df.isna().any(axis=1)]

print(nan_values)

Once you run the code, you’ll get all the rows that contain at least one NaN anywhere in the DataFrame (i.e., under the ‘first_set’ column, the ‘second_set’ column, or both):

    first_set   second_set
2         3.0          NaN
3         4.0          NaN
5         NaN            d
7         7.0          NaN
8         NaN          NaN
9         NaN            f
11        9.0          NaN
13        NaN            i

Alternatively, you’ll get the same results using isnull():

import pandas as pd
import numpy as np

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i']
        }

df = pd.DataFrame(data)

nan_values = df[df.isnull().any(axis=1)]

print(nan_values)

Run the code in Python, and you’ll get the following:

    first_set   second_set
2         3.0          NaN
3         4.0          NaN
5         NaN            d
7         7.0          NaN
8         NaN          NaN
9         NaN            f
11        9.0          NaN
13        NaN            i
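Two related variations may be useful here. Replacing any(axis=1) with all(axis=1) keeps only the rows where every column is NaN, and dropna() returns the complement of the any(axis=1) selection (the rows with no NaN at all). The following is a sketch on the same example data, with illustrative variable names:

```python
import pandas as pd
import numpy as np

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f', 'g', np.nan, 'h', 'i']
        }

df = pd.DataFrame(data)

# Rows with NaN in at least one column (as in the examples above)
any_nan_rows = df[df.isna().any(axis=1)]

# Rows with NaN in every column (only index 8 in this example)
all_nan_rows = df[df.isna().all(axis=1)]

# Rows with no NaN at all; dropna() drops every row that has at least one NaN
complete_rows = df.dropna()

print(all_nan_rows)
print(complete_rows)
```

In this example, the 8 rows with at least one NaN and the 6 complete rows together account for all 14 rows of the DataFrame.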

Additional resources:

For additional information, please refer to the Pandas Documentation.