Check for NaN in Pandas DataFrame

Here are 4 ways to check for NaN in a Pandas DataFrame:

(1) Check for NaN under a single DataFrame column:

df['column name'].isnull().values.any()

(2) Count the NaN under a single DataFrame column:

df['column name'].isnull().sum()

(3) Check for NaN under an entire DataFrame:

df.isnull().values.any()

(4) Count the NaN under an entire DataFrame:

df.isnull().sum().sum()
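As a quick orientation, the four templates above can be run side by side on a tiny DataFrame. This is only a minimal sketch: the column names 'a' and 'b' and their values are made up for illustration and are not part of the examples that follow.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column DataFrame, used only to illustrate the four templates
df = pd.DataFrame({'a': [1.0, np.nan, 3.0],
                   'b': [np.nan, np.nan, 6.0]})

has_nan_in_column = df['a'].isnull().values.any()  # (1) any NaN in column 'a'?
nan_count_in_column = df['a'].isnull().sum()       # (2) number of NaN in column 'a'
has_nan_in_df = df.isnull().values.any()           # (3) any NaN in the DataFrame?
nan_count_in_df = df.isnull().sum().sum()          # (4) number of NaN in the DataFrame

print(has_nan_in_column, nan_count_in_column, has_nan_in_df, nan_count_in_df)
# True 1 True 3
```

Templates (1) and (3) return a boolean, while (2) and (4) return an integer count.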

Examples of checking for NaN in Pandas DataFrame

(1) Check for NaN under a single DataFrame column

In the following example, we’ll create a DataFrame with a set of numbers and 3 NaN values:

import pandas as pd
import numpy as np

data = {'set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan]}

df = pd.DataFrame(data)

print(df)

You’ll now see the DataFrame with the 3 NaN values:

    set_of_numbers
0              1.0
1              2.0
2              3.0
3              4.0
4              5.0
5              NaN
6              6.0
7              7.0
8              NaN
9              8.0
10             9.0
11            10.0
12             NaN

You can then use the following template in order to check for NaN under a single DataFrame column:

df['column name'].isnull().values.any()

For our example (where the desired column name is ‘set_of_numbers’):

import pandas as pd
import numpy as np

data = {'set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan]}

df = pd.DataFrame(data)

check_for_nan = df['set_of_numbers'].isnull().values.any()

print(check_for_nan)

Run the code, and you’ll get ‘True’, which confirms the existence of NaN values under the DataFrame column:

True
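The same check can probably be written a little more briefly. The sketch below assumes a reasonably recent pandas version, where isna() is an alias of isnull(), any() can be called on the boolean Series directly, and a Series exposes the hasnans property. The variable name col is only illustrative.

```python
import numpy as np
import pandas as pd

data = {'set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan]}
df = pd.DataFrame(data)

col = df['set_of_numbers']

# isna() is an alias of isnull(); any() works on the boolean Series itself
print(col.isna().any())  # True

# hasnans is a Series property that reports whether any NaN is present
print(col.hasnans)       # True
```

All of these variants should give the same answer as the .isnull().values.any() template, so the choice is mostly a matter of readability.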

And if you want to get the actual breakdown of the instances where NaN values exist, then you may remove .values.any() from the code:

import pandas as pd
import numpy as np

data = {'set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan]}

df = pd.DataFrame(data)

check_for_nan = df['set_of_numbers'].isnull()

print(check_for_nan)

You’ll now see the 3 instances of the NaN values:

0     False
1     False
2     False
3     False
4     False
5      True
6     False
7     False
8      True
9     False
10    False
11    False
12     True

Here is another approach where you can get all the instances where a NaN value exists:

import pandas as pd
import numpy as np

data = {'set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan]}

df = pd.DataFrame(data)

df.loc[df['set_of_numbers'].isnull(), 'value_is_NaN'] = 'Yes'
df.loc[df['set_of_numbers'].notnull(), 'value_is_NaN'] = 'No'

print(df)

You’ll now see a new column (called ‘value_is_NaN’), which indicates all the instances where a NaN value exists:

    set_of_numbers  value_is_NaN
0              1.0            No
1              2.0            No
2              3.0            No
3              4.0            No
4              5.0            No
5              NaN           Yes
6              6.0            No
7              7.0            No
8              NaN           Yes
9              8.0            No
10             9.0            No
11            10.0            No
12             NaN           Yes
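If the goal is simply to label or isolate the NaN rows, a more compact variant is possible. The following is a sketch rather than the approach used above: it swaps the two .loc assignments for a single np.where call, and it uses the boolean mask to keep only the rows that hold a NaN. The name nan_rows is just an illustrative choice.

```python
import numpy as np
import pandas as pd

data = {'set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan]}
df = pd.DataFrame(data)

# Label every row in one pass: 'Yes' where the value is NaN, 'No' otherwise
df['value_is_NaN'] = np.where(df['set_of_numbers'].isnull(), 'Yes', 'No')

# Use the same boolean mask to keep only the rows that hold a NaN
nan_rows = df[df['set_of_numbers'].isnull()]

print(nan_rows.index.tolist())  # [5, 8, 12]
```

The index values 5, 8 and 12 match the rows marked ‘Yes’ in the table above.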

(2) Count the NaN under a single DataFrame column

You can apply this syntax in order to count the NaN values under a single DataFrame column:

df['column name'].isnull().sum()

For our example:

import pandas as pd
import numpy as np

data = {'set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan]}

df = pd.DataFrame(data)

count_nan = df['set_of_numbers'].isnull().sum()

print('Count of NaN: ' + str(count_nan))

You’ll then get the count of 3 NaN values:

Count of NaN: 3

And here is another approach to get the count:

import pandas as pd
import numpy as np

data = {'set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan]}

df = pd.DataFrame(data)

df.loc[df['set_of_numbers'].isnull(), 'value_is_NaN'] = 'Yes'
df.loc[df['set_of_numbers'].notnull(), 'value_is_NaN'] = 'No'

count_nan = df.loc[df['value_is_NaN'] == 'Yes'].count()

print(count_nan)

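As before, you’ll get the count of 3 instances of NaN values. Because count() tallies the non-NaN entries of every column in the filtered rows, the output also lists ‘set_of_numbers’ with a count of 0:

set_of_numbers    0
value_is_NaN      3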
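When only a single number is needed, the count can also be obtained as a scalar. The sketch below shows two ways that should be equivalent here: summing a boolean mask, and subtracting the non-NaN count from the number of rows. The variable names count_yes and count_by_difference are illustrative.

```python
import numpy as np
import pandas as pd

data = {'set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan]}
df = pd.DataFrame(data)

df.loc[df['set_of_numbers'].isnull(), 'value_is_NaN'] = 'Yes'
df.loc[df['set_of_numbers'].notnull(), 'value_is_NaN'] = 'No'

# Summing a boolean mask counts its True entries
count_yes = (df['value_is_NaN'] == 'Yes').sum()

# count() ignores NaN, so the gap to the total row count is the NaN count
count_by_difference = len(df) - df['set_of_numbers'].count()

print(count_yes, count_by_difference)  # 3 3
```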

(3) Check for NaN under an entire DataFrame

Now let’s add a second column to the original DataFrame. This column includes another set of numbers with NaN values:

import pandas as pd
import numpy as np

data = {'first_set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan],
        'second_set_of_numbers': [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19]}

df = pd.DataFrame(data)

print(df)

Run the code, and you’ll get 8 instances of NaN values across the entire DataFrame:

    first_set_of_numbers  second_set_of_numbers
0                    1.0                   11.0
1                    2.0                   12.0
2                    3.0                    NaN
3                    4.0                   13.0
4                    5.0                   14.0
5                    NaN                    NaN
6                    6.0                   15.0
7                    7.0                   16.0
8                    NaN                    NaN
9                    8.0                    NaN
10                   9.0                   17.0
11                  10.0                    NaN
12                   NaN                   19.0

You can then apply this syntax in order to verify the existence of NaN values under the entire DataFrame:

df.isnull().values.any()

For our example:

import pandas as pd
import numpy as np

data = {'first_set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan],
        'second_set_of_numbers': [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19]}

df = pd.DataFrame(data)

check_nan_in_df = df.isnull().values.any()

print(check_nan_in_df)

Once you run the code, you’ll get ‘True’, which confirms the existence of NaN values in the DataFrame:

True

You can get a further breakdown by removing .values.any() from the code:

import pandas as pd
import numpy as np

data = {'first_set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan],
        'second_set_of_numbers': [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19]}

df = pd.DataFrame(data)

check_nan_in_df = df.isnull()

print(check_nan_in_df)

Here is the result of the breakdown:

    first_set_of_numbers  second_set_of_numbers
0                  False                  False
1                  False                  False
2                  False                   True
3                  False                  False
4                  False                  False
5                   True                   True
6                  False                  False
7                  False                  False
8                   True                   True
9                  False                   True
10                 False                  False
11                 False                   True
12                  True                  False
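The True/False breakdown can become hard to scan on larger DataFrames. As a sketch of one way to summarize it, any() can be applied per column to see which columns contain NaN, and per row (axis=1) to keep only the rows with at least one NaN. The names columns_with_nan and rows_with_nan are illustrative.

```python
import numpy as np
import pandas as pd

data = {'first_set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan],
        'second_set_of_numbers': [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19]}
df = pd.DataFrame(data)

# One True/False per column: does the column contain at least one NaN?
columns_with_nan = df.isnull().any()

# Keep only the rows in which at least one column is NaN
rows_with_nan = df[df.isnull().any(axis=1)]

print(columns_with_nan.tolist())     # [True, True]
print(rows_with_nan.index.tolist())  # [2, 5, 8, 9, 11, 12]
```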

(4) Count the NaN under an entire DataFrame

You may now use this template to count the NaN values under the entire DataFrame:

df.isnull().sum().sum()

Here is the code for our example:

import pandas as pd
import numpy as np

data = {'first_set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan],
        'second_set_of_numbers': [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19]}

df = pd.DataFrame(data)

count_nan_in_df = df.isnull().sum().sum()

print('Count of NaN: ' + str(count_nan_in_df))

You’ll then get the total count of 8:

Count of NaN: 8

And if you want to get the count of NaN by column, then you may use the following code:

import pandas as pd
import numpy as np

data = {'first_set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan],
        'second_set_of_numbers': [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19]}

df = pd.DataFrame(data)

count_nan_in_df = df.isnull().sum()

print(count_nan_in_df)

And here is the result:

first_set_of_numbers     3
second_set_of_numbers    5
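The same idea extends in two directions that are often useful: counting NaN per row instead of per column, and expressing the per-column count as a share of all rows. The sketch below assumes the same two-column DataFrame. The names nan_per_row and nan_share are illustrative.

```python
import numpy as np
import pandas as pd

data = {'first_set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan],
        'second_set_of_numbers': [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19]}
df = pd.DataFrame(data)

# axis=1 sums across the columns, giving the number of NaN in each row
nan_per_row = df.isnull().sum(axis=1)

# The mean of a boolean column is the fraction of True values, i.e. the NaN share
nan_share = df.isnull().mean()

print(nan_per_row.max())            # 2
print(nan_share.round(3).tolist())  # [0.231, 0.385]
```

Here 3 of the 13 rows are NaN in the first column and 5 of 13 in the second, which gives the shares shown.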

You just saw how to check for NaN in Pandas DataFrame. Alternatively you may: