In this article, you’ll see 3 ways to create NaN values in Pandas DataFrame:
- Using Numpy
- Importing a file with blank values
- Applying to_numeric
3 Ways to Create NaN Values in Pandas DataFrame
(1) Using Numpy
You can create NaN values in a Pandas DataFrame using NumPy.
More specifically, you can place np.nan wherever you want a NaN value to appear in the DataFrame.
For example, in the code below, there are 4 instances of np.nan under a single DataFrame column:
import pandas as pd
import numpy as np
data = {
"set_of_numbers": [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan]
}
df = pd.DataFrame(data)
print(df)
This would result in 4 NaN values in the DataFrame:
set_of_numbers
0 1.0
1 2.0
2 3.0
3 4.0
4 5.0
5 NaN
6 6.0
7 7.0
8 NaN
9 NaN
10 8.0
11 9.0
12 10.0
13 NaN
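If you want to confirm the count rather than read it off the printout, the following sketch may help. It rebuilds the same DataFrame, counts the missing values with isna(), and checks the column dtype. The variable name nan_count is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"set_of_numbers": [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan]}
)

# isna() marks each NaN as True; summing the booleans counts them
nan_count = int(df["set_of_numbers"].isna().sum())
print(nan_count)  # 4

# np.nan is a float, so the otherwise-integer column is stored as float64
print(df["set_of_numbers"].dtype)  # float64
```

This also explains why the integers in the list are printed as 1.0, 2.0, and so on.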
Similarly, you can place np.nan across multiple columns in the DataFrame:
import pandas as pd
import numpy as np
data = {"first_set_of_numbers": [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
"second_set_of_numbers": [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19, np.nan],
"third_set_of_numbers": [20, 21, 22, 23, np.nan, 24, np.nan, 26, 27, np.nan, np.nan, 28, 29, 30]
}
df = pd.DataFrame(data)
print(df)
Now you’ll see 14 instances of NaN values across multiple columns in the DataFrame:
    first_set_of_numbers  second_set_of_numbers  third_set_of_numbers
0                    1.0                   11.0                  20.0
1                    2.0                   12.0                  21.0
2                    3.0                    NaN                  22.0
3                    4.0                   13.0                  23.0
4                    5.0                   14.0                   NaN
5                    NaN                    NaN                  24.0
6                    6.0                   15.0                   NaN
7                    7.0                   16.0                  26.0
8                    NaN                    NaN                  27.0
9                    NaN                    NaN                   NaN
10                   8.0                   17.0                   NaN
11                   9.0                    NaN                  28.0
12                  10.0                   19.0                  29.0
13                   NaN                    NaN                  30.0
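To tally the NaN values per column and in total, one option is to chain isna() with sum(). This is a minimal sketch on the same data. The names per_column and total are illustrative, not part of the original example.

```python
import numpy as np
import pandas as pd

data = {
    "first_set_of_numbers": [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    "second_set_of_numbers": [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19, np.nan],
    "third_set_of_numbers": [20, 21, 22, 23, np.nan, 24, np.nan, 26, 27, np.nan, np.nan, 28, 29, 30],
}
df = pd.DataFrame(data)

# The first sum() counts the NaN values in each column
per_column = df.isna().sum()
print(per_column.to_dict())
# {'first_set_of_numbers': 4, 'second_set_of_numbers': 6, 'third_set_of_numbers': 4}

# A second sum() adds the per-column counts together
total = int(per_column.sum())
print(total)  # 14
```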
(2) Importing a file with blank values
If you import a file using Pandas, and that file contains blank values, then you’ll get NaN values for those blank instances.
Here, let’s import a CSV file using Pandas, where some values are blank in the file itself:
Product | Price |
Desktop Computer | 700 |
Tablet | |
 | 500 |
Laptop | 1200 |
For demonstration purposes, let’s suppose that the CSV file is stored under the following path:
C:\Users\Ron\Desktop\Products.csv
In that case, the syntax to import the CSV file is as follows (note that you’ll need to modify the path to reflect the location where the file is stored on your computer):
import pandas as pd
df = pd.read_csv(r"C:\Users\Ron\Desktop\Products.csv")
print(df)
Here, you’ll see two NaN values for those two blank instances:
            Product   Price
0  Desktop Computer   700.0
1            Tablet     NaN
2               NaN   500.0
3            Laptop  1200.0
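If you don't have the file on disk, you can reproduce the same behavior from an in-memory string. The sketch below assumes the CSV content matches the table shown earlier. The csv_text variable stands in for the file and is not part of the original example.

```python
import io

import pandas as pd

# Stand-in for Products.csv: the Tablet row has no price,
# and the third data row has no product name
csv_text = (
    "Product,Price\n"
    "Desktop Computer,700\n"
    "Tablet,\n"
    ",500\n"
    "Laptop,1200\n"
)

# read_csv accepts any file-like object, so StringIO can replace a path
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# Each blank field was read in as NaN
print(df.isna().sum().to_dict())  # {'Product': 1, 'Price': 1}
```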
(3) Applying to_numeric
Let’s now create a new DataFrame with a single column. Only this time, the values under the column would contain a combination of both numeric and non-numeric data:
set_of_numbers |
1 |
2 |
AAA |
3 |
BBB |
4 |
Here is the code to create the DataFrame:
import pandas as pd
data = {"set_of_numbers": [1, 2, "AAA", 3, "BBB", 4]}
df = pd.DataFrame(data)
print(df)
You’ll now see 6 values (4 numeric and 2 non-numeric):
set_of_numbers
0 1
1 2
2 AAA
3 3
4 BBB
5 4
You can then use to_numeric to convert the values under the "set_of_numbers" column to floats. Because 2 of those values are non-numeric, passing errors="coerce" turns them into NaN:
df["set_of_numbers"] = pd.to_numeric(df["set_of_numbers"], errors="coerce")
Here is the complete code:
import pandas as pd
data = {"set_of_numbers": [1, 2, "AAA", 3, "BBB", 4]}
df = pd.DataFrame(data)
df["set_of_numbers"] = pd.to_numeric(df["set_of_numbers"], errors="coerce")
print(df)
Notice that the two non-numeric values became NaN:
set_of_numbers
0 1.0
1 2.0
2 NaN
3 3.0
4 NaN
5 4.0
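Because the conversion overwrites the original column, it can be useful to check which values were coerced before replacing them. One possible approach, sketched below, keeps the converted result in a separate Series and compares it with the original. The names converted and rejected_values are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"set_of_numbers": [1, 2, "AAA", 3, "BBB", 4]})

# Convert into a separate Series so the original values stay available
converted = pd.to_numeric(df["set_of_numbers"], errors="coerce")

# A value was coerced if it is NaN after conversion but was not missing before
was_coerced = converted.isna() & df["set_of_numbers"].notna()
rejected_values = df.loc[was_coerced, "set_of_numbers"].tolist()
print(rejected_values)  # ['AAA', 'BBB']

# Once you are happy with the result, assign it back to the column
df["set_of_numbers"] = converted
```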
You may also want to review the following guides that explain how to: