How to Drop Rows with NaN Values in Pandas DataFrame

To drop rows that contain any NaN (null) values in a Pandas DataFrame:

df.dropna()

To drop rows where all the values are NaN:

df.dropna(how="all")

Steps to Drop Rows with NaN Values in Pandas DataFrame

Step 1: Create a DataFrame with NaN Values

Create a DataFrame with NaN values:

import pandas as pd
import numpy as np

data = {"col_a": [1, 2, np.nan, 4],
        "col_b": [5, np.nan, np.nan, 8],
        "col_c": [9, 10, 11, 12]
        }

df = pd.DataFrame(data)

print(df)

As can be observed, the second and third rows (index 1 and 2) contain NaN values:

   col_a  col_b  col_c
0    1.0    5.0      9
1    2.0    NaN     10
2    NaN    NaN     11
3    4.0    8.0     12
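
Before dropping anything, it can help to confirm which rows contain NaN values. The following is a minimal sketch that assumes the same DataFrame as above; the variable name rows_with_nan is introduced here only for illustration:

```python
import pandas as pd
import numpy as np

data = {"col_a": [1, 2, np.nan, 4],
        "col_b": [5, np.nan, np.nan, 8],
        "col_c": [9, 10, 11, 12]
        }

df = pd.DataFrame(data)

# True for each row that has at least one NaN value
rows_with_nan = df.isna().any(axis=1)

print(rows_with_nan.tolist())
```

The rows flagged as True are the ones that df.dropna() would remove by default.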

Step 2: Drop the Rows with the NaN Values in Pandas DataFrame

Use df.dropna() to drop all the rows with the NaN values in the DataFrame:

import pandas as pd
import numpy as np

data = {"col_a": [1, 2, np.nan, 4],
        "col_b": [5, np.nan, np.nan, 8],
        "col_c": [9, 10, 11, 12]
        }

df = pd.DataFrame(data)

df_dropped = df.dropna()

print(df_dropped)

The result is two rows without any NaN values:

   col_a  col_b  col_c
0    1.0    5.0      9
3    4.0    8.0     12

Notice that those two rows no longer have a sequential index: the remaining labels are 0 and 3. You can reset the index so that it starts from 0 and increases sequentially.
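
One detail worth keeping in mind: by default, df.dropna() returns a new DataFrame and leaves the original untouched, which is why the result is assigned to df_dropped. A short sketch to check this, assuming the same data as in Step 2:

```python
import pandas as pd
import numpy as np

data = {"col_a": [1, 2, np.nan, 4],
        "col_b": [5, np.nan, np.nan, 8],
        "col_c": [9, 10, 11, 12]
        }

df = pd.DataFrame(data)

df_dropped = df.dropna()

# The original DataFrame still has all four rows
print(len(df))

# The new DataFrame keeps only the two rows without NaN values
print(len(df_dropped))

# The surviving rows keep their original index labels
print(df_dropped.index.tolist())
```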

Step 3 (Optional): Reset the Index

The general syntax to reset an index in Pandas DataFrame:

df.reset_index(drop=True)

The complete script to drop the rows with the NaN values, and then reset the index:

import pandas as pd
import numpy as np

data = {"col_a": [1, 2, np.nan, 4],
        "col_b": [5, np.nan, np.nan, 8],
        "col_c": [9, 10, 11, 12]
        }

df = pd.DataFrame(data)

df_dropped = df.dropna()

df_reset = df_dropped.reset_index(drop=True)

print(df_reset)

The index now starts from 0 and increases sequentially:

   col_a  col_b  col_c
0    1.0    5.0      9
1    4.0    8.0     12
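
The two steps can also be chained into a single expression. This is a sketch that should be equivalent to the script above for the same data; the name df_clean is only illustrative:

```python
import pandas as pd
import numpy as np

data = {"col_a": [1, 2, np.nan, 4],
        "col_b": [5, np.nan, np.nan, 8],
        "col_c": [9, 10, 11, 12]
        }

df = pd.DataFrame(data)

# Drop the rows with NaN values, then renumber the index from 0
df_clean = df.dropna().reset_index(drop=True)

print(df_clean)
```

Chaining avoids the intermediate df_dropped variable, at the cost of not being able to inspect the result before the index is reset.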

Drop Rows Where All the Values Are NaN

Here is an example of a DataFrame where all the values are NaN for the third row:

import pandas as pd
import numpy as np

data = {"col_a": [1, 2, np.nan, 4],
        "col_b": [5, np.nan, np.nan, 8],
        "col_c": [9, 10, np.nan, 12]
        }

df = pd.DataFrame(data)

print(df)

As can be seen, all the values are NaN for the third row:

   col_a  col_b  col_c
0    1.0    5.0    9.0
1    2.0    NaN   10.0
2    NaN    NaN    NaN
3    4.0    8.0   12.0

Use df.dropna(how="all") to drop only the rows where all the values are NaN:

import pandas as pd
import numpy as np

data = {"col_a": [1, 2, np.nan, 4],
        "col_b": [5, np.nan, np.nan, 8],
        "col_c": [9, 10, np.nan, 12]
        }

df = pd.DataFrame(data)

df_dropped = df.dropna(how="all")

print(df_dropped)

The result:

   col_a  col_b  col_c
0    1.0    5.0    9.0
1    2.0    NaN   10.0
3    4.0    8.0   12.0
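
For comparison, the default behavior (how="any") drops a row as soon as it contains a single NaN, while how="all" drops a row only when every value in it is NaN. A minimal sketch on the same data, with dropped_any and dropped_all as illustrative names:

```python
import pandas as pd
import numpy as np

data = {"col_a": [1, 2, np.nan, 4],
        "col_b": [5, np.nan, np.nan, 8],
        "col_c": [9, 10, np.nan, 12]
        }

df = pd.DataFrame(data)

# Default: drop rows with at least one NaN (same as how="any")
dropped_any = df.dropna()

# Drop only the rows where every value is NaN
dropped_all = df.dropna(how="all")

print(dropped_any.index.tolist())
print(dropped_all.index.tolist())
```

The row at index 1 has one NaN, so it is removed by the default call but kept by how="all".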
