To drop rows with NaN (null) values in a Pandas DataFrame:
df.dropna()
To drop rows where all the values are NaN:
df.dropna(how="all")
Steps to Drop Rows with NaN Values in Pandas DataFrame
Step 1: Create a DataFrame with NaN Values
Create a DataFrame with NaN values:
import pandas as pd
import numpy as np

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

print(df)
As can be observed, the second and third rows contain NaN values:
   col_a  col_b  col_c
0    1.0    5.0      9
1    2.0    NaN     10
2    NaN    NaN     11
3    4.0    8.0     12
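Before dropping anything, it can help to confirm which rows contain NaN values. The following is a minimal sketch, assuming the same example data as above; the name rows_with_nan is only illustrative:

```python
import numpy as np
import pandas as pd

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

# True for every row that has at least one NaN value
rows_with_nan = df.isna().any(axis=1)

# Index labels of the rows that contain a NaN value
print(rows_with_nan[rows_with_nan].index.tolist())

# Number of NaN values in each column
print(df.isna().sum())
```

For this data, the first print shows the labels 1 and 2, which are the second and third rows.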
Step 2: Drop the Rows with the NaN Values in Pandas DataFrame
Use df.dropna() to drop all the rows with the NaN values in the DataFrame:
import pandas as pd
import numpy as np

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

df_dropped = df.dropna()

print(df_dropped)
The result is two rows without any NaN values:
   col_a  col_b  col_c
0    1.0    5.0      9
3    4.0    8.0     12
Notice that those two rows no longer have a sequential index: the remaining labels are 0 and 3. You can reset the index so that it starts from 0 and increases sequentially.
Step 3 (Optional): Reset the Index
The general syntax to reset the index of a Pandas DataFrame:
df.reset_index(drop=True)
The complete script to drop the rows with the NaN values, and then reset the index:
import pandas as pd
import numpy as np

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

df_dropped = df.dropna()
df_reset = df_dropped.reset_index(drop=True)

print(df_reset)
The index now starts from 0 and increases sequentially:
   col_a  col_b  col_c
0    1.0    5.0      9
1    4.0    8.0     12
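Two details of these steps are worth checking for yourself: dropna() returns a new DataFrame and leaves the original df unchanged, and drop=True is what keeps the old index labels from being stored as a column. The following is a minimal sketch using the same example data; the name df_keep_old is only illustrative:

```python
import numpy as np
import pandas as pd

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

df_dropped = df.dropna()

# The original DataFrame still has all four rows
print(len(df), len(df_dropped))

# drop=True discards the old index labels (0 and 3)
df_reset = df_dropped.reset_index(drop=True)
print(df_reset.index.tolist())

# Without drop=True, the old labels are kept in a new "index" column
df_keep_old = df_dropped.reset_index()
print(df_keep_old.columns.tolist())
```

If you do not need the original row labels afterwards, drop=True is usually the simpler choice; leaving it out can be useful when you want to trace the surviving rows back to their original positions.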
Drop Rows Where all the Values are NaN
Here is an example of a DataFrame where all the values are NaN for the third row:
import pandas as pd
import numpy as np

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, np.nan, 12],
}
df = pd.DataFrame(data)

print(df)
As can be seen, all the values are NaN for the third row:
   col_a  col_b  col_c
0    1.0    5.0    9.0
1    2.0    NaN   10.0
2    NaN    NaN    NaN
3    4.0    8.0   12.0
Use df.dropna(how="all") to drop only the rows where all the values are NaN:
import pandas as pd
import numpy as np

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, np.nan, 12],
}
df = pd.DataFrame(data)

df_dropped = df.dropna(how="all")

print(df_dropped)
The result:
   col_a  col_b  col_c
0    1.0    5.0    9.0
1    2.0    NaN   10.0
3    4.0    8.0   12.0
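For comparison, calling dropna() with no arguments behaves like how="any", which removes a row as soon as it contains a single NaN. The following is a minimal sketch contrasting the two settings on the same data; it prints the surviving index labels rather than the full DataFrames:

```python
import numpy as np
import pandas as pd

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, np.nan, 12],
}
df = pd.DataFrame(data)

# how="any": drop rows with at least one NaN value
any_labels = df.dropna(how="any").index.tolist()
print(any_labels)

# how="all": drop only rows where every value is NaN
all_labels = df.dropna(how="all").index.tolist()
print(all_labels)
```

With how="all", the second row (label 1) is kept even though col_b is NaN there, because the row still holds non-NaN values in the other columns.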