How to Convert Strings to Floats in Pandas DataFrame

Need to convert strings to floats in a Pandas DataFrame?

Depending on the scenario, you can use either of the following two approaches:

(1) astype(float)

df['DataFrame Column'] = df['DataFrame Column'].astype(float)

(2) to_numeric

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'], errors='coerce')
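To make the difference between the two approaches concrete, here is a minimal sketch. The Series name and its values are made up for illustration. It shows that astype(float) raises a ValueError as soon as it meets a value it cannot parse, whereas to_numeric with errors='coerce' turns that value into NaN:

```python
import pandas as pd

# Hypothetical sample data: one clean value and one value with stray letters
prices = pd.Series(['250', 'ABC260'])

# astype(float) is strict: a value that cannot be parsed raises ValueError
try:
    prices.astype(float)
    astype_failed = False
except ValueError:
    astype_failed = True

# to_numeric with errors='coerce' is lenient: unparseable values become NaN
coerced = pd.to_numeric(prices, errors='coerce')

print(astype_failed)
print(coerced)
```

In short, astype(float) suits columns you expect to be clean, while to_numeric suits columns that may contain stray text.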

In this short guide, you’ll see 3 scenarios with the steps to convert strings to floats:

  1. For a column that contains numeric values stored as strings
  2. For a column that contains both numeric and non-numeric values
  3. For an entire DataFrame

Scenarios to Convert Strings to Floats in Pandas DataFrame

Scenario 1: Numeric values stored as strings

To keep things simple, let’s create a DataFrame with only two columns:

Product Price
ABC 250
XYZ 270

Below is the code to create the DataFrame in Python. The values under the ‘Price’ column are stored as strings because they are wrapped in single quotes (double quotes would work the same way):

import pandas as pd

data = {'Product': ['ABC', 'XYZ'],
        'Price': ['250', '270']}

df = pd.DataFrame(data)
print(df)
print(df.dtypes)

Run the code in Python, and you’ll see that the data type for the ‘Price’ column is object:

  Product Price
0     ABC   250
1     XYZ   270
Product    object
Price      object
dtype: object

The goal is to convert the values under the ‘Price’ column into floats.

You can then use the astype(float) approach to perform the conversion into floats:

df['DataFrame Column'] = df['DataFrame Column'].astype(float)

In the context of our example, the ‘DataFrame Column’ is the ‘Price’ column. And so, the full code to convert the values to floats would be:

import pandas as pd

data = {'Product': ['ABC', 'XYZ'],
        'Price': ['250', '270']}

df = pd.DataFrame(data)
# Convert the string values in the 'Price' column to floats
df['Price'] = df['Price'].astype(float)

print(df)
print(df.dtypes)

You’ll now see that the values under the ‘Price’ column have been converted to floats (float64):

  Product  Price
0     ABC  250.0
1     XYZ  270.0
Product     object
Price      float64
dtype: object
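As a variation, if you would rather name the column and its target type in one call, DataFrame.astype also accepts a dictionary that maps column names to types. A minimal sketch using the same sample data:

```python
import pandas as pd

data = {'Product': ['ABC', 'XYZ'],
        'Price': ['250', '270']}

df = pd.DataFrame(data)

# Map each column to convert onto its target type; columns left out
# of the dictionary (here 'Product') keep their current type
df = df.astype({'Price': float})

print(df)
print(df.dtypes)
```

This form can be convenient when several columns need different target types, since they can all be listed in the same dictionary.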

Scenario 2: Numeric and non-numeric values

Let’s create a new DataFrame with two columns (the ‘Product’ and the ‘Price’ columns). Only this time, the values under the ‘Price’ column contain a mix of numeric and non-numeric data:

Product Price
AAA 250
BBB ABC260
CCC 270
DDD 280XYZ

This is what the DataFrame looks like in Python:

import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC', 'DDD'],
        'Price': ['250', 'ABC260', '270', '280XYZ']}

df = pd.DataFrame(data)

print(df)
print(df.dtypes)

As before, the data type for the ‘Price’ column is object:

  Product   Price
0     AAA     250
1     BBB  ABC260
2     CCC     270
3     DDD  280XYZ
Product    object
Price      object
dtype: object

You can then use the to_numeric approach in order to convert the values under the ‘Price’ column into floats:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'], errors='coerce')

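By setting errors='coerce', you’ll transform the non-numeric values into NaN. Without it, to_numeric uses the default errors='raise' and stops with an error at the first value it cannot parse.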

Here is the complete code that you can use:

import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC', 'DDD'],
        'Price': ['250', 'ABC260', '270', '280XYZ']}

df = pd.DataFrame(data)
# Values that cannot be parsed as numbers become NaN
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')

print(df)
print(df.dtypes)

Run the code, and you’ll see that the ‘Price’ column is now a float:

  Product  Price
0     AAA  250.0
1     BBB    NaN
2     CCC  270.0
3     DDD    NaN
Product     object
Price      float64
dtype: object
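Before you overwrite the column, it can be useful to check which of the original strings could not be converted. One possible way to do that, sketched here with the same sample data (the variable names converted and failed are just illustrative), is to compare the coerced result against the original column:

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC', 'DDD'],
        'Price': ['250', 'ABC260', '270', '280XYZ']}

df = pd.DataFrame(data)

# Convert into a separate Series so the original strings stay available
converted = pd.to_numeric(df['Price'], errors='coerce')

# A row failed to convert if it is NaN after coercion
# but was not missing in the original column
failed = df.loc[converted.isna() & df['Price'].notna(), 'Price']

print(failed)
```

For this sample data, the failed Series holds the two values 'ABC260' and '280XYZ', which are the rows that appear as NaN in the output above.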

To take things further, you can even replace the ‘NaN’ values with ‘0’ values by using df.replace:

import pandas as pd
import numpy as np

data = {'Product': ['AAA', 'BBB', 'CCC', 'DDD'],
        'Price': ['250', 'ABC260', '270', '280XYZ']}

df = pd.DataFrame(data)
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')
# Replace every NaN in the DataFrame with 0 (no regex is needed for this)
df = df.replace(np.nan, 0)

print(df)
print(df.dtypes)

And here is what you’ll get:

  Product  Price
0     AAA  250.0
1     BBB    0.0
2     CCC  270.0
3     DDD    0.0
Product     object
Price      float64
dtype: object
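Note that df.replace acts on every column of the DataFrame. If only the ‘Price’ column should be filled, a narrower alternative is Series.fillna, chained directly onto the conversion. A minimal sketch with the same data:

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC', 'DDD'],
        'Price': ['250', 'ABC260', '270', '280XYZ']}

df = pd.DataFrame(data)

# Coerce non-numeric strings to NaN, then fill those NaN values with 0,
# touching only the 'Price' column
df['Price'] = pd.to_numeric(df['Price'], errors='coerce').fillna(0)

print(df)
print(df.dtypes)
```

Whether 0 is a sensible stand-in for an unparseable price depends on your data. In some cases it is safer to keep the NaN values so that they are not mistaken for real prices.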

Scenario 3: Convert Strings to Floats across the Entire DataFrame

For the final scenario, let’s create a DataFrame with 3 columns, where all the values will be stored as strings (using single quotes):

import pandas as pd

data = {'Price_1': ['300', '750', '600', '770', '920'],
        'Price_2': ['250', '270', '950', '580', '410'],
        'Price_3': ['530', '480', '420', '290', '830']}

df = pd.DataFrame(data)

print(df)
print(df.dtypes)

As you can see, the data type of all the columns across the DataFrame is object:

  Price_1 Price_2 Price_3
0     300     250     530
1     750     270     480
2     600     950     420
3     770     580     290
4     920     410     830
Price_1    object
Price_2    object
Price_3    object
dtype: object

You can then use the following syntax to convert every column in the DataFrame to floats:

df = df.astype(float)

So the complete Python code to perform the conversion would be:

import pandas as pd

data = {'Price_1': ['300', '750', '600', '770', '920'],
        'Price_2': ['250', '270', '950', '580', '410'],
        'Price_3': ['530', '480', '420', '290', '830']}

df = pd.DataFrame(data)
# Convert every column in the DataFrame to floats
df = df.astype(float)

print(df)
print(df.dtypes)

All the columns in the DataFrame are now floats:

   Price_1  Price_2  Price_3
0    300.0    250.0    530.0
1    750.0    270.0    480.0
2    600.0    950.0    420.0
3    770.0    580.0    290.0
4    920.0    410.0    830.0
Price_1    float64
Price_2    float64
Price_3    float64
dtype: object
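df.astype(float) only works when every value in every column can be parsed as a number. For a DataFrame that also has a text column, or price columns with stray non-numeric values, one possible approach is to apply to_numeric to a chosen list of columns. The sketch below uses made-up data, and the price_cols list is an assumption for illustration:

```python
import pandas as pd

# Hypothetical data: a text column plus two price columns with bad values
data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price_1': ['300', '750', 'N/A'],
        'Price_2': ['250', '27O', '950']}  # '27O' ends with the letter O

df = pd.DataFrame(data)

# Convert only the listed columns; apply passes errors='coerce'
# through to pd.to_numeric for each column
price_cols = ['Price_1', 'Price_2']
df[price_cols] = df[price_cols].apply(pd.to_numeric, errors='coerce')

print(df)
print(df.dtypes)
```

The ‘Product’ column is left untouched, while the values that could not be parsed in the two price columns become NaN.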
