In this article, you’ll see 3 ways to create NaN values in a Pandas DataFrame:
- Using NumPy
- Importing a file with blank values
- Applying to_numeric
3 Ways to Create NaN Values in Pandas DataFrame
(1) Using NumPy
You can create NaN values in a Pandas DataFrame by using NumPy.
More specifically, you can insert np.nan wherever you want a NaN value to appear in the DataFrame.
For example, in the code below, there are 4 instances of np.nan in a single DataFrame column:
import pandas as pd
import numpy as np

data = {'set_of_numbers': [1,2,3,4,5,np.nan,6,7,np.nan,np.nan,8,9,10,np.nan]}
df = pd.DataFrame(data,columns=['set_of_numbers'])

print(df)
This would result in 4 NaN values in the DataFrame:
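As a quick check of that count, the sketch below rebuilds the same column and counts its missing values with isna(). The variable name nan_count is only illustrative and does not come from the original example.

```python
import numpy as np
import pandas as pd

data = {'set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan]}
df = pd.DataFrame(data)

# isna() flags each missing value as True; sum() counts the True entries
nan_count = df['set_of_numbers'].isna().sum()
print(nan_count)  # 4

# np.nan is a float, so the whole column is stored as float64
print(df['set_of_numbers'].dtype)  # float64
```

Because np.nan is a floating-point value, the integers in the column are displayed as 1.0, 2.0, and so on.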
Similarly, you can insert np.nan across multiple columns in the DataFrame:
import pandas as pd
import numpy as np

data = {'first_set_of_numbers': [1,2,3,4,5,np.nan,6,7,np.nan,np.nan,8,9,10,np.nan],
        'second_set_of_numbers': [11,12,np.nan,13,14,np.nan,15,16,np.nan,np.nan,17,np.nan,19,np.nan],
        'third_set_of_numbers': [20,21,22,23,np.nan,24,np.nan,26,27,np.nan,np.nan,28,29,30]
        }
df = pd.DataFrame(data,columns=['first_set_of_numbers','second_set_of_numbers','third_set_of_numbers'])

print(df)
Now you’ll see 14 instances of NaN across multiple columns in the DataFrame:
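One way to confirm the total is to count the NaN values per column and then add the counts together. This is a minimal sketch using the same data; per_column and total are illustrative names.

```python
import numpy as np
import pandas as pd

data = {
    'first_set_of_numbers': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, np.nan, 8, 9, 10, np.nan],
    'second_set_of_numbers': [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19, np.nan],
    'third_set_of_numbers': [20, 21, 22, 23, np.nan, 24, np.nan, 26, 27, np.nan, np.nan, 28, 29, 30],
}
df = pd.DataFrame(data)

# Count the NaN values in each column (4, 6 and 4 respectively)
per_column = df.isna().sum()
print(per_column)

# Add the per-column counts to get the total for the whole DataFrame
total = int(per_column.sum())
print(total)  # 14
```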
(2) Importing a file with blank values
If you import a file using Pandas, and that file contains blank values, then you’ll get NaN values for those blank instances.
Here, I imported a CSV file using Pandas, where some values were blank in the file itself:
This is the syntax that I used to import the file:
import pandas as pd

df = pd.read_csv(r'C:\Users\Ron\Desktop\Products.csv')

print(df)
I then got two NaN values for those two blank instances:
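If you don't have such a file at hand, you can reproduce the behavior with an in-memory CSV. The sketch below is an assumption-laden stand-in: the column names and values are hypothetical and are not the contents of the original Products.csv file. It only illustrates that read_csv turns blank fields into NaN by default.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the file; two fields are left blank
csv_text = """product,price
Desktop Computer,700
Tablet,
,350
Printer,150
"""

# read_csv accepts a file-like object, so StringIO can replace a path here
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# Each blank field was read as NaN
nan_total = int(df.isna().sum().sum())
print(nan_total)  # 2
```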
(3) Applying to_numeric
Let’s now create a new DataFrame with a single column. This time, the column contains a combination of numeric and non-numeric values:
set_of_numbers |
1 |
2 |
AAA |
3 |
BBB |
4 |
This is the code to create that DataFrame:
import pandas as pd

data = {'set_of_numbers': [1,2,"AAA",3,"BBB",4]}
df = pd.DataFrame(data,columns=['set_of_numbers'])

print(df)
You’ll now see 6 values (4 numeric and 2 non-numeric):
You can then use to_numeric with errors='coerce' to convert the values under the ‘set_of_numbers’ column into floats. Since 2 of those values are non-numeric and cannot be converted, you’ll get NaN for those instances:
df['set_of_numbers'] = pd.to_numeric(df['set_of_numbers'], errors='coerce')
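To see why errors='coerce' matters, here is a small sketch that contrasts it with the default behavior of to_numeric, which raises a ValueError when it meets a value it cannot parse. The Series s and the names raised and coerced are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 2, "AAA", 3, "BBB", 4])

# With the default errors='raise', a non-numeric string stops the conversion
try:
    pd.to_numeric(s)
    raised = False
except ValueError as exc:
    raised = True
    print(f"Conversion failed: {exc}")

# With errors='coerce', unparseable values become NaN instead
coerced = pd.to_numeric(s, errors='coerce')
print(coerced.isna().sum())  # 2
```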
Here is the complete code:
import pandas as pd

data = {'set_of_numbers': [1,2,"AAA",3,"BBB",4]}
df = pd.DataFrame(data,columns=['set_of_numbers'])

df['set_of_numbers'] = pd.to_numeric(df['set_of_numbers'], errors='coerce')

print(df)
Notice that the two non-numeric values became NaN:
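The conversion also changes the column's data type. The sketch below, which keeps the converted values in a separate variable (converted, an illustrative name), compares the dtype before and after and lists the resulting values.

```python
import pandas as pd

data = {'set_of_numbers': [1, 2, "AAA", 3, "BBB", 4]}
df = pd.DataFrame(data)

# A column mixing integers and strings is stored with the generic object dtype
dtype_before = df['set_of_numbers'].dtype
print(dtype_before)  # object

converted = pd.to_numeric(df['set_of_numbers'], errors='coerce')

# After coercion the column is float64, with NaN in place of the strings
print(converted.dtype)  # float64
print(converted.tolist())  # [1.0, 2.0, nan, 3.0, nan, 4.0]
```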