How to Convert Strings to Integers in Pandas DataFrame

In this guide, you’ll see two approaches for converting strings into integers in a Pandas DataFrame:

(1) The astype(int) approach:

df['DataFrame Column'] = df['DataFrame Column'].astype(int)

(2) The to_numeric approach:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'])

Let’s now review a few examples with the steps to convert strings into integers.

Steps to Convert Strings to Integers in Pandas DataFrame

Step 1: Create a DataFrame

To start, let’s say that you want to create a DataFrame for the following data:

Product Price
AAA 210
BBB 250

You can capture the values under the Price column as strings by placing those values within quotes.

This is what the DataFrame would look like in Python:

import pandas as pd

# The prices are quoted, so they are stored as strings
data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}

df = pd.DataFrame(data)
print(df)
print(df.dtypes)

When you run the code, you’ll notice that indeed the values under the Price column are strings (where the data type is object):

  Product Price
0     AAA   210
1     BBB   250
Product    object
Price      object

Step 2: Convert the Strings to Integers in Pandas DataFrame

Now, how do you convert those string values into integers?

You may use the first approach of astype(int) to perform the conversion:

df['DataFrame Column'] = df['DataFrame Column'].astype(int)

Since in our example the ‘DataFrame Column’ is the Price column (which contains the string values), you’ll then need to use the following syntax:

df['Price'] = df['Price'].astype(int)

So this is the complete Python code that you may apply to convert the strings into integers in Pandas DataFrame:

import pandas as pd

data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}

df = pd.DataFrame(data)
# Cast the string column to integers
df['Price'] = df['Price'].astype(int)

print(df)
print(df.dtypes)

As you can see, the values under the Price column are now integers (the exact type, int32 or int64, depends on your platform and NumPy version):

  Product  Price
0     AAA    210
1     BBB    250
Product    object
Price       int32
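
The integer width that astype(int) produces follows the platform default, so the same script may report int32 on one machine and int64 on another. If you need a consistent result, one option is to name the width explicitly. The snippet below is a minimal sketch of that idea, reusing the same example data:

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}

df = pd.DataFrame(data)
# Ask for a fixed 64-bit integer rather than the platform default
df['Price'] = df['Price'].astype('int64')

print(df)
print(df.dtypes)
```

With the width spelled out, the Price column should be reported as int64 regardless of the operating system.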

Step 3 (optional): Convert the Strings to Integers using to_numeric

For this optional step, you may use the second approach of to_numeric to convert the strings to integers:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'])

And this is the complete Python code to perform the conversion:

import pandas as pd

data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}

df = pd.DataFrame(data)
# Let pandas parse the strings and pick a numeric type
df['Price'] = pd.to_numeric(df['Price'])

print(df)
print(df.dtypes)

You’ll now see that the values under the Price column are indeed integers:

  Product  Price
0     AAA    210
1     BBB    250
Product    object
Price       int64
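
If several columns hold numbers stored as strings, the same to_numeric call can be applied to each of them in one statement. The following is a sketch only: the Quantity column is a hypothetical addition to the example data, used purely for illustration.

```python
import pandas as pd

# 'Quantity' is a hypothetical extra column, not part of the original example
data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250'],
        'Quantity': ['5', '12']}

df = pd.DataFrame(data)

# Apply pd.to_numeric to each listed column in turn
cols = ['Price', 'Quantity']
df[cols] = df[cols].apply(pd.to_numeric)

print(df)
print(df.dtypes)
```

Here DataFrame.apply passes each selected column to pd.to_numeric as a Series, so every column is converted independently.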

What if your column contains a combination of numeric and non-numeric values?

For example, in the DataFrame below, there are both numeric and non-numeric values under the Price column:

Product Price
AAA 210
BBB 250
CCC 22XYZ

In that case, you can still use to_numeric in order to convert the strings:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'], errors='coerce')

By setting errors='coerce', you’ll transform the non-numeric values into NaN.

Here is the Python code:

import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}

df = pd.DataFrame(data)
# Values that cannot be parsed as numbers become NaN
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')

print(df)
print(df.dtypes)

You’ll now notice the NaN value. Because NaN is a floating-point value, the data type of the column is float:

  Product  Price
0     AAA  210.0
1     BBB  250.0
2     CCC    NaN
Product     object
Price      float64
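
To see why errors='coerce' is useful here, it may help to compare it with astype(int) on the same values. The sketch below uses a standalone Series (the variable names are illustrative) and catches the exception that astype(int) raises when it reaches the non-numeric string:

```python
import pandas as pd

prices = pd.Series(['210', '250', '22XYZ'])

# astype(int) cannot parse '22XYZ' and raises a ValueError
astype_failed = False
try:
    prices.astype(int)
except ValueError as exc:
    astype_failed = True
    print('astype(int) failed:', exc)

# to_numeric with errors='coerce' converts what it can and marks the rest as NaN
converted = pd.to_numeric(prices, errors='coerce')
print(converted)
```

In other words, astype(int) is all-or-nothing, while to_numeric with errors='coerce' keeps the valid values and flags the invalid ones.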

You can take things further by replacing the ‘NaN’ values with ‘0’ values using df.replace:

import pandas as pd
import numpy as np

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}

df = pd.DataFrame(data)
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')
# Replace NaN with 0 throughout the DataFrame
df = df.replace(np.nan, 0)
# With no NaN left, the column can be cast to integers
df['Price'] = df['Price'].astype(int)

print(df)
print(df.dtypes)

When you run the code, you’ll get a ‘0’ value instead of the NaN value, and the Price column will have an integer data type:

  Product  Price
0     AAA    210
1     BBB    250
2     CCC      0
Product    object
Price       int32
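
Replacing NaN with 0 changes the data, which may not always be what you want. As a rough sketch of two alternatives: fillna can limit the replacement to the Price column only, and the nullable 'Int64' dtype (capital I) can store integers while keeping the missing value. The column names Price_filled and Price_nullable are hypothetical, chosen only so both results can be shown side by side:

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}

df = pd.DataFrame(data)
numeric = pd.to_numeric(df['Price'], errors='coerce')

# Option A: fill the missing value in this column only, then cast to integer
df['Price_filled'] = numeric.fillna(0).astype('int64')

# Option B: keep the missing value by using the nullable integer dtype
df['Price_nullable'] = numeric.astype('Int64')

print(df)
print(df.dtypes)
```

Option A is equivalent to the replace approach for this column, while Option B leaves the unparseable entry marked as missing rather than treating it as a price of zero.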