How to Convert Strings to Floats in Pandas DataFrame

Need to convert strings to floats in a pandas DataFrame?

Depending on the scenario, you can use either of the following two methods to convert strings to floats in a pandas DataFrame:

(1) astype(float) method

df['DataFrame Column'] = df['DataFrame Column'].astype(float)

(2) to_numeric method

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'], errors='coerce')

Want to see how to apply those two methods in practice?

If so, in this tutorial, I’ll review two scenarios to demonstrate how to convert strings to floats:

(1) For a column that contains numeric values stored as strings; and
(2) For a column that contains both numeric and non-numeric values

Scenarios to Convert Strings to Floats in Pandas DataFrame

Scenario 1: Numeric values stored as strings

To keep things simple, let’s create a DataFrame with only two columns:

Product Price
ABC 250
XYZ 270

Below is the code to create the DataFrame in Python. The values under the ‘Price’ column are stored as strings because they are wrapped in single quotes (double quotes would work the same way):

import pandas as pd

# The prices are quoted, so they are stored as strings
data = {'Product': ['ABC', 'XYZ'],
        'Price': ['250', '270']}

df = pd.DataFrame(data)
print(df)
print(df.dtypes)

Run the code in Python and you’ll see that the data type for the ‘Price’ column is object.

The goal is to convert the values under the ‘Price’ column into floats.

You can use the astype(float) method to perform the conversion:

df['DataFrame Column'] = df['DataFrame Column'].astype(float)

In the context of our example, the ‘DataFrame Column’ is the ‘Price’ column, so the full code to convert the values into floats would be:

import pandas as pd

data = {'Product': ['ABC', 'XYZ'],
        'Price': ['250', '270']}

df = pd.DataFrame(data)
# Convert the string prices into floats
df['Price'] = df['Price'].astype(float)

print(df)
print(df.dtypes)

You’ll now see that the Price column has been converted into a float (its data type is now float64).
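Note that astype(float) only works when every value in the column can be parsed as a number. The short sketch below uses a hypothetical Series (not part of the example above) with one non-numeric string to show that the conversion raises a ValueError in that case, which is why the second scenario uses a different method:

```python
import pandas as pd

# Hypothetical mixed column: two numeric strings and one that is not numeric
prices = pd.Series(['250', 'ABC260', '270'])

try:
    prices.astype(float)
    conversion_failed = False
except ValueError as exc:
    # astype(float) raises as soon as it meets a value it cannot parse
    conversion_failed = True
    print(f"astype(float) failed: {exc}")
```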

Scenario 2: Numeric and non-numeric values

Let’s create a new DataFrame with two columns (the Product and Price columns). Only this time, the values under the Price column contain a combination of both numeric and non-numeric data:

Product Price
AAA 250
BBB ABC260
CCC 270
DDD 280XYZ

This is what the DataFrame looks like in Python:

import pandas as pd

# 'ABC260' and '280XYZ' cannot be parsed as numbers
data = {'Product': ['AAA', 'BBB', 'CCC', 'DDD'],
        'Price': ['250', 'ABC260', '270', '280XYZ']}

df = pd.DataFrame(data)

print(df)
print(df.dtypes)

As before, the data type for the Price column is object.

You can then use the to_numeric method to convert the values under the Price column into floats:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'], errors='coerce')

By setting errors='coerce', you’ll transform the non-numeric values into NaN.

Here is the complete code that you can use:

import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC', 'DDD'],
        'Price': ['250', 'ABC260', '270', '280XYZ']}

df = pd.DataFrame(data)
# Values that cannot be parsed as numbers become NaN
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')

print(df)
print(df.dtypes)

Run the code and you’ll see that the Price column is now a float, with NaN in place of the two non-numeric values.
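Since errors='coerce' silently turns unparseable values into NaN, it can be useful to check which original strings were affected. Here is a minimal sketch of one way to do that, assuming you keep the original column around; the names converted, coerced_mask and rejected are just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Product': ['AAA', 'BBB', 'CCC', 'DDD'],
                   'Price': ['250', 'ABC260', '270', '280XYZ']})

# Convert into a separate Series so the original strings stay available
converted = pd.to_numeric(df['Price'], errors='coerce')

# Rows where the original held a value but the conversion produced NaN
coerced_mask = converted.isna() & df['Price'].notna()
rejected = df.loc[coerced_mask, 'Price'].tolist()

print(rejected)
```

Once you have reviewed the rejected values, you can assign converted back to df['Price'] as in the complete code above.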

To take things further, you can even replace the NaN values with 0 by using df.replace:

import pandas as pd
import numpy as np

data = {'Product': ['AAA', 'BBB', 'CCC', 'DDD'],
        'Price': ['250', 'ABC260', '270', '280XYZ']}

df = pd.DataFrame(data)
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')
# Replace every NaN in the DataFrame with 0 (no regex matching is needed)
df = df.replace(np.nan, 0)

print(df)
print(df.dtypes)

Run the code and you’ll see that the NaN values under the Price column are now 0.0, and the column is still a float.
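Keep in mind that df.replace is applied to the entire DataFrame. If you only want to fill the missing values in the Price column, one alternative (a sketch, not the approach used above) is to chain fillna onto the converted column:

```python
import pandas as pd

df = pd.DataFrame({'Product': ['AAA', 'BBB', 'CCC', 'DDD'],
                   'Price': ['250', 'ABC260', '270', '280XYZ']})

# Coerce non-numeric strings to NaN, then fill only this column with 0
df['Price'] = pd.to_numeric(df['Price'], errors='coerce').fillna(0)

print(df)
print(df.dtypes)
```

This way, any NaN values in other columns are left untouched.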