3 Ways to Create NaN Values in Pandas DataFrame

In this article, you’ll see 3 ways to create NaN values in a Pandas DataFrame:

  1. Using NumPy
  2. Importing a file with blank values
  3. Applying to_numeric

(1) Using NumPy

You can create NaN values in a Pandas DataFrame using NumPy.

More specifically, you can place np.nan wherever you want a NaN value to appear in the DataFrame.

For example, in the code below, there are 4 instances of np.nan under a single DataFrame column:

import pandas as pd
import numpy as np

data = {
"set_of_numbers": [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan]
}

df = pd.DataFrame(data)

print(df)

This results in 4 NaN values in the DataFrame. Because NaN is a floating-point value, the integers in the column are displayed as floats:

    set_of_numbers
0              1.0
1              2.0
2              3.0
3              4.0
4              5.0
5              NaN
6              6.0
7              7.0
8              NaN
9              NaN
10             8.0
11             9.0
12            10.0
13             NaN
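As a small sketch of why the integers above print as 1.0, 2.0 and so on, the snippet below compares the column's dtype with that of a column that has no missing values. The column name "no_missing_values" is made up for this illustration and is not part of the original example.

```python
import numpy as np
import pandas as pd

with_nan = pd.DataFrame(
    {"set_of_numbers": [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan]}
)
# Hypothetical comparison column: same kind of data, but no np.nan
without_nan = pd.DataFrame({"no_missing_values": [1, 2, 3, 4, 5]})

# np.nan is a float, so a column that contains it is stored as float64
dtype_with_nan = with_nan["set_of_numbers"].dtype
# Without any np.nan, the integers stay in an integer dtype
dtype_without_nan = without_nan["no_missing_values"].dtype

print(dtype_with_nan)
print(dtype_without_nan)
```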

Similarly, you can place np.nan across multiple columns in the DataFrame:

import pandas as pd
import numpy as np

data = {"first_set_of_numbers": [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
"second_set_of_numbers": [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19, np.nan],
"third_set_of_numbers": [20, 21, 22, 23, np.nan, 24, np.nan, 26, 27, np.nan, np.nan, 28, 29, 30]
}

df = pd.DataFrame(data)

print(df)

Now you’ll see 14 instances of NaN values across multiple columns in the DataFrame:

    first_set_of_numbers  second_set_of_numbers  third_set_of_numbers
0                    1.0                   11.0                  20.0
1                    2.0                   12.0                  21.0
2                    3.0                    NaN                  22.0
3                    4.0                   13.0                  23.0
4                    5.0                   14.0                   NaN
5                    NaN                    NaN                  24.0
6                    6.0                   15.0                   NaN
7                    7.0                   16.0                  26.0
8                    NaN                    NaN                  27.0
9                    NaN                    NaN                   NaN
10                   8.0                   17.0                   NaN
11                   9.0                    NaN                  28.0
12                  10.0                   19.0                  29.0
13                   NaN                    NaN                  30.0
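If you'd rather not count the NaN values by eye, one possible check is to combine isna() with sum(). This is only a sketch that reuses the data from the example above:

```python
import numpy as np
import pandas as pd

data = {
    "first_set_of_numbers": [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    "second_set_of_numbers": [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19, np.nan],
    "third_set_of_numbers": [20, 21, 22, 23, np.nan, 24, np.nan, 26, 27, np.nan, np.nan, 28, 29, 30],
}
df = pd.DataFrame(data)

# isna() flags each missing value as True; sum() counts the True values per column
nan_per_column = df.isna().sum()
# Adding up the per-column counts gives the total for the whole DataFrame
total_nan = int(nan_per_column.sum())

print(nan_per_column)
print(total_nan)
```

The per-column counts should add up to the 14 NaN values mentioned above.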

(2) Importing a file with blank values

If you import a file using Pandas, and that file contains blank values, then you’ll get NaN values for those blank instances.

Here, let’s import a CSV file using Pandas, where some values are blank in the file itself:

Product,Price
Desktop Computer,700
Tablet,
,500
Laptop,1200

For demonstration purposes, let’s suppose that the CSV file is stored under the following path:

C:\Users\Ron\Desktop\Products.csv

In that case, the syntax to import the CSV file is as follows (note that you’ll need to modify the path to reflect the location where the file is stored on your computer):

import pandas as pd

df = pd.read_csv(r"C:\Users\Ron\Desktop\Products.csv")

print(df)

Here, you’ll see two NaN values for those two blank instances:

            Product   Price
0  Desktop Computer   700.0
1            Tablet     NaN
2               NaN   500.0
3            Laptop  1200.0
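If you don't have such a file at hand, you can try the same behavior without touching the disk. The sketch below assumes the CSV content shown earlier and feeds it to read_csv through io.StringIO (an in-memory text buffer) instead of a file path:

```python
import io

import pandas as pd

# Same content as the CSV file above; the empty fields are the blank values
csv_text = (
    "Product,Price\n"
    "Desktop Computer,700\n"
    "Tablet,\n"
    ",500\n"
    "Laptop,1200\n"
)

# read_csv accepts a file-like object, so StringIO can stand in for the file
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.isna())
```

Each blank field is read as NaN, just as it would be when reading the file from disk.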

(3) Applying to_numeric

Let’s now create a new DataFrame with a single column. Only this time, the values under the column would contain a combination of both numeric and non-numeric data:

set_of_numbers
1
2
AAA
3
BBB
4

Here is the code to create this DataFrame:

import pandas as pd

data = {"set_of_numbers": [1, 2, "AAA", 3, "BBB", 4]}
df = pd.DataFrame(data)

print(df)

You’ll now see 6 values (4 numeric and 2 non-numeric):

  set_of_numbers
0              1
1              2
2            AAA
3              3
4            BBB
5              4

You can then use pd.to_numeric with errors="coerce" to convert the values under the ‘set_of_numbers’ column to floats. Because 2 of those values are non-numeric and cannot be parsed, they become NaN:

df["set_of_numbers"] = pd.to_numeric(df["set_of_numbers"], errors="coerce")

Here is the complete code:

import pandas as pd

data = {"set_of_numbers": [1, 2, "AAA", 3, "BBB", 4]}
df = pd.DataFrame(data)

df["set_of_numbers"] = pd.to_numeric(df["set_of_numbers"], errors="coerce")

print(df)

Notice that the two non-numeric values became NaN:

   set_of_numbers
0             1.0
1             2.0
2             NaN
3             3.0
4             NaN
5             4.0
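The errors="coerce" argument is what produces the NaN values. As a rough illustration, the sketch below runs to_numeric on the same values twice: once with the default setting (errors="raise"), which stops with a ValueError at the first non-numeric value, and once with errors="coerce". The variable names are chosen for this example only.

```python
import pandas as pd

values = pd.Series([1, 2, "AAA", 3, "BBB", 4], name="set_of_numbers")

# Default behavior (errors="raise"): a non-numeric string raises ValueError
try:
    pd.to_numeric(values)
    raised = False
except ValueError as exc:
    raised = True
    print(f"to_numeric failed: {exc}")

# errors="coerce": unparseable values are replaced with NaN instead
coerced = pd.to_numeric(values, errors="coerce")
nan_count = int(coerced.isna().sum())

print(coerced)
print(nan_count)
```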
