How to Count NaN Values in a Pandas DataFrame

You can use the following syntax to count NaN values in a Pandas DataFrame:

(1) Count NaN values under a single DataFrame column:

df['column name'].isna().sum()

(2) Count NaN values under an entire DataFrame:

df.isna().sum().sum()

(3) Count NaN values across a single DataFrame row:

df.loc[[index value]].isna().sum().sum()

Let’s see how to apply each of the above cases using a practical example.

The Example

Suppose you created the following DataFrame that contains NaN values:

import pandas as pd
import numpy as np

data = {'first_set': [1,2,3,4,5,np.nan,6,7,np.nan,np.nan],
        'second_set': ['a','b',np.nan,np.nan,'c','d','e',np.nan,np.nan,'f'],
        'third_set':['aa',np.nan,'bb','cc',np.nan,np.nan,'dd',np.nan,np.nan,'ee']
        }

df = pd.DataFrame(data,columns=['first_set','second_set','third_set'])

print(df)

You’ll get this DataFrame with the NaNs:

   first_set   second_set   third_set
0        1.0            a          aa
1        2.0            b         NaN
2        3.0          NaN          bb
3        4.0          NaN          cc
4        5.0            c         NaN
5        NaN            d         NaN
6        6.0            e          dd
7        7.0          NaN         NaN
8        NaN          NaN         NaN
9        NaN            f          ee

Next, you’ll see how to count the NaN values in the above DataFrame for the following 3 scenarios:

  1. Under a single DataFrame column
  2. Under the entire DataFrame
  3. Across a single DataFrame row

(1) Count NaN values under a single DataFrame column

You can use the following template to count the NaN values under a single DataFrame column:

df['column name'].isna().sum()

For example, let’s get the count of NaNs under the ‘first_set’ column:

import pandas as pd
import numpy as np

data = {'first_set': [1,2,3,4,5,np.nan,6,7,np.nan,np.nan],
        'second_set': ['a','b',np.nan,np.nan,'c','d','e',np.nan,np.nan,'f'],
        'third_set':['aa',np.nan,'bb','cc',np.nan,np.nan,'dd',np.nan,np.nan,'ee']
        }

df = pd.DataFrame(data,columns=['first_set','second_set','third_set'])

count_nan = df['first_set'].isna().sum()

print('Count of NaN: ' + str(count_nan))

As you can see, there are 3 NaN values under the ‘first_set’ column:

Count of NaN: 3
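As a quick cross-check, the same count can be reached a second way. The sketch below assumes only the ‘first_set’ column from the example; the variable name count_nan_alt is introduced here purely for illustration. It relies on the fact that count() returns the number of non-missing values in a column.

```python
import numpy as np
import pandas as pd

# Only the 'first_set' column from the example is needed here
df = pd.DataFrame({'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan]})

# isna() marks each missing value as True; sum() counts the True values
count_nan = df['first_set'].isna().sum()

# count() returns the number of non-missing values,
# so subtracting it from the number of rows gives the NaN count
count_nan_alt = len(df) - df['first_set'].count()

print('Count of NaN: ' + str(count_nan))
print('Count of NaN (alternative): ' + str(count_nan_alt))
```

Both approaches should agree; the isna().sum() form is usually easier to read.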

(2) Count NaN values under the entire DataFrame

What if you’d like to count the NaN values under an entire Pandas DataFrame?

In that case, you may use the following syntax to get the total count of NaNs:

df.isna().sum().sum()

For our example:

import pandas as pd
import numpy as np

data = {'first_set': [1,2,3,4,5,np.nan,6,7,np.nan,np.nan],
        'second_set': ['a','b',np.nan,np.nan,'c','d','e',np.nan,np.nan,'f'],
        'third_set':['aa',np.nan,'bb','cc',np.nan,np.nan,'dd',np.nan,np.nan,'ee']
        }

df = pd.DataFrame(data,columns=['first_set','second_set','third_set'])

count_nan = df.isna().sum().sum()

print('Count of NaN: ' + str(count_nan))

As you may observe, the total count of NaNs under the entire DataFrame is 12:

Count of NaN: 12
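To see why sum() is called twice, it may help to look at the intermediate result. In the sketch below, the first sum() produces one NaN count per column, and the second sum() adds those counts together. The variable names nan_per_column and total_nan are illustrative only.

```python
import numpy as np
import pandas as pd

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f'],
        'third_set': ['aa', np.nan, 'bb', 'cc', np.nan, np.nan, 'dd', np.nan, np.nan, 'ee']
        }

df = pd.DataFrame(data, columns=['first_set', 'second_set', 'third_set'])

# First sum(): a Series with the NaN count of each column
nan_per_column = df.isna().sum()

# Second sum(): adds the per-column counts into a single total
total_nan = nan_per_column.sum()

print(nan_per_column)
print('Count of NaN: ' + str(total_nan))
```

The per-column counts here are 3, 4 and 5, which add up to the total of 12.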

(3) Count NaN values across a single DataFrame row

You can use the template below to count the NaNs across a single DataFrame row:

df.loc[[index value]].isna().sum().sum()

You’ll need to specify the index value of the row you want to check.

The index values are located on the left side of the DataFrame (starting from 0):

   first_set   second_set   third_set
0        1.0            a          aa
1        2.0            b         NaN
2        3.0          NaN          bb
3        4.0          NaN          cc
4        5.0            c         NaN
5        NaN            d         NaN
6        6.0            e          dd
7        7.0          NaN         NaN
8        NaN          NaN         NaN
9        NaN            f          ee

Let’s say that you want to count the NaN values across the row with the index of 7:

   first_set   second_set   third_set
7        7.0          NaN         NaN

You can then use the following syntax to achieve this goal:

df.loc[[7]].isna().sum().sum()

So the complete Python code would be:

import pandas as pd
import numpy as np

data = {'first_set': [1,2,3,4,5,np.nan,6,7,np.nan,np.nan],
        'second_set': ['a','b',np.nan,np.nan,'c','d','e',np.nan,np.nan,'f'],
        'third_set':['aa',np.nan,'bb','cc',np.nan,np.nan,'dd',np.nan,np.nan,'ee']
        }

df = pd.DataFrame(data,columns=['first_set','second_set','third_set'])

count_nan = df.loc[[7]].isna().sum().sum()

print('Count of NaN: ' + str(count_nan))

You’ll notice that the count of NaNs across the row with the index of 7 is 2:

Count of NaN: 2
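If you need the count for every row rather than just one, a possible variation is to sum across the columns with axis=1. The sketch below is one way to do it, not the only one; the names nan_per_row and row_7_count are illustrative.

```python
import numpy as np
import pandas as pd

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f'],
        'third_set': ['aa', np.nan, 'bb', 'cc', np.nan, np.nan, 'dd', np.nan, np.nan, 'ee']
        }

df = pd.DataFrame(data, columns=['first_set', 'second_set', 'third_set'])

# axis=1 sums across the columns, giving one NaN count per row
nan_per_row = df.isna().sum(axis=1)

# Look up the count for the row with the index of 7
row_7_count = nan_per_row.loc[7]

print(nan_per_row)
print('Count of NaN: ' + str(row_7_count))
```

Selecting a single row with df.loc[7] (single brackets) returns a Series, so df.loc[7].isna().sum() should give the same count with just one sum().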

What if you used another index (rather than the default numeric index)?

For example, let’s change the index to the following:

index=['row_0','row_1','row_2','row_3','row_4','row_5','row_6','row_7','row_8','row_9']

Here is the code to create the DataFrame with the new index:

import pandas as pd
import numpy as np

data = {'first_set': [1,2,3,4,5,np.nan,6,7,np.nan,np.nan],
        'second_set': ['a','b',np.nan,np.nan,'c','d','e',np.nan,np.nan,'f'],
        'third_set':['aa',np.nan,'bb','cc',np.nan,np.nan,'dd',np.nan,np.nan,'ee']
        }

df = pd.DataFrame(data,columns=['first_set','second_set','third_set'], index=['row_0','row_1','row_2','row_3','row_4','row_5','row_6','row_7','row_8','row_9'])

print(df)

You’ll now get the DataFrame with the new index on the left:

       first_set   second_set   third_set
row_0        1.0            a          aa
row_1        2.0            b         NaN
row_2        3.0          NaN          bb
row_3        4.0          NaN          cc
row_4        5.0            c         NaN
row_5        NaN            d         NaN
row_6        6.0            e          dd
row_7        7.0          NaN         NaN
row_8        NaN          NaN         NaN
row_9        NaN            f          ee

Suppose that you want to count the NaNs across the row with the index of ‘row_7’.

In that case, you’ll need to modify the code to include the new index value:

count_nan = df.loc[['row_7']].isna().sum().sum()

So the complete Python code is:

import pandas as pd
import numpy as np

data = {'first_set': [1,2,3,4,5,np.nan,6,7,np.nan,np.nan],
        'second_set': ['a','b',np.nan,np.nan,'c','d','e',np.nan,np.nan,'f'],
        'third_set':['aa',np.nan,'bb','cc',np.nan,np.nan,'dd',np.nan,np.nan,'ee']
        }

df = pd.DataFrame(data,columns=['first_set','second_set','third_set'], index=['row_0','row_1','row_2','row_3','row_4','row_5','row_6','row_7','row_8','row_9'])

count_nan = df.loc[['row_7']].isna().sum().sum()

print('Count of NaN: ' + str(count_nan))

You’ll now get the count of NaNs associated with the row that has the index of ‘row_7’:

Count of NaN: 2
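With a custom index, loc selects rows by label, while iloc selects them by integer position. The sketch below shows both on the relabeled DataFrame. The list comprehension used to build the index is a shorthand for the same ten labels, and the names by_label and by_position are illustrative.

```python
import numpy as np
import pandas as pd

data = {'first_set': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan],
        'second_set': ['a', 'b', np.nan, np.nan, 'c', 'd', 'e', np.nan, np.nan, 'f'],
        'third_set': ['aa', np.nan, 'bb', 'cc', np.nan, np.nan, 'dd', np.nan, np.nan, 'ee']
        }

# Builds the labels 'row_0' through 'row_9'
index = ['row_' + str(i) for i in range(10)]

df = pd.DataFrame(data, columns=['first_set', 'second_set', 'third_set'], index=index)

# Label-based selection: the row labeled 'row_7'
by_label = df.loc[['row_7']].isna().sum().sum()

# Position-based selection: the eighth row (position 7), whatever its label
by_position = df.iloc[[7]].isna().sum().sum()

print('Count of NaN (by label): ' + str(by_label))
print('Count of NaN (by position): ' + str(by_position))
```

Here both calls point at the same row, so the counts match. They would differ if the rows were reordered, since iloc follows position and loc follows the label.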

Additional Resources

You may check the Pandas Documentation for additional information about isna.