How to Convert String to Integer in Pandas DataFrame

In this guide, I’ll show you two methods to convert strings into integers in a pandas DataFrame:

(1) The astype(int) method:

df['DataFrame Column'] = df['DataFrame Column'].astype(int)

(2) The to_numeric method:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'])

Let’s now review a few examples with the steps to convert strings into integers.

Steps to Convert String to Integer in Pandas DataFrame

Step 1: Create a DataFrame

To start, let’s say that you want to create a DataFrame for the following data:

Product    Price
AAA        210
BBB        250

You can capture the values under the Price column as strings by placing those values within quotes.

This is what the DataFrame would look like in Python:

import pandas as pd

# Quoting the prices stores them as strings
data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}

df = pd.DataFrame(data)
print(df)
print(df.dtypes)

When you run the code, you’ll notice that the values under the Price column are indeed strings (where the data type is object).
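If you would rather confirm that the individual values are Python strings than rely on the column data type alone, a quick check along these lines may help. This is only a minimal sketch that reuses the DataFrame from Step 1; the variable name all_strings is chosen here for illustration.

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}
df = pd.DataFrame(data)

# True only when every value in the Price column is a Python str
all_strings = all(isinstance(value, str) for value in df['Price'])
print(all_strings)
```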

Step 2: Convert the Strings to Integers in Pandas DataFrame

Now how do you convert those string values into integers?

You may use the first method of astype(int) to perform the conversion:

df['DataFrame Column'] = df['DataFrame Column'].astype(int)

Since in our example the 'DataFrame Column' is the Price column (which contains the string values), you’ll need to use the following syntax:

df['Price'] = df['Price'].astype(int)

So this is the complete Python code that you may apply to convert the strings into integers in the pandas DataFrame:

import pandas as pd

data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}

df = pd.DataFrame(data)
# Convert the Price column from strings to integers
df['Price'] = df['Price'].astype(int)

print(df)
print(df.dtypes)

After running the code, the values under the Price column are now integers, and df.dtypes reports an integer data type (such as int64) for that column.
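When more than one column holds numeric strings, DataFrame.astype also accepts a dictionary that maps column names to target types, so several columns can be converted in one call. The sketch below assumes a hypothetical Quantity column that is not part of the original example.

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250'],
        'Quantity': ['3', '7']}  # hypothetical extra column, for illustration only
df = pd.DataFrame(data)

# Map each column name to the type it should be converted to
df = df.astype({'Price': int, 'Quantity': int})

print(df)
print(df.dtypes)
```

Columns that are not listed in the dictionary (Product here) keep their original data type.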

Step 3 (optional): Convert the Strings to Integers using to_numeric

For this optional step, you may use the second method of to_numeric to convert the strings to integers:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'])

And this is the complete Python code to perform the conversion:

import pandas as pd

data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}

df = pd.DataFrame(data)
# Let pandas infer the numeric type of the Price column
df['Price'] = pd.to_numeric(df['Price'])

print(df)
print(df.dtypes)

After running the code, you’ll see that the values under the Price column are indeed integers (the data type is int64).
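Note that to_numeric picks the resulting data type from the values themselves, so it only yields integers when every string is a whole number. The short sketch below illustrates this with two standalone Series; the names whole and decimal are made up for the example.

```python
import pandas as pd

# Every string is a whole number, so the result is an integer Series
whole = pd.to_numeric(pd.Series(['210', '250']))

# One string has a decimal part, so the result is a float Series
decimal = pd.to_numeric(pd.Series(['210.5', '250']))

print(whole.dtype)
print(decimal.dtype)
```

If a float result is not what you want, you would need to decide how to round or truncate the values before applying astype(int).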

What if your column contains a combination of numeric and non-numeric values?

For example, in the DataFrame below, there are both numeric and non-numeric values under the Price column:

Product    Price
AAA        210
BBB        250
CCC        22XYZ
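To see why this case needs special handling, it may help to try astype(int) on such a column first. The sketch below assumes the same three-row data and simply catches the error that the conversion raises; the flag conversion_failed is a hypothetical name used for illustration.

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}
df = pd.DataFrame(data)

try:
    # '22XYZ' cannot be parsed as an integer, so this raises an error
    df['Price'].astype(int)
    conversion_failed = False
except ValueError as error:
    conversion_failed = True
    print(f"Conversion failed: {error}")
```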

In that case, you can still use to_numeric in order to convert the strings:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'], errors='coerce')

By setting errors='coerce', you’ll transform the non-numeric values into NaN.

Here is the Python code:

import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}

df = pd.DataFrame(data)
# Values that cannot be parsed as numbers become NaN
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')

print(df)
print(df.dtypes)

You’ll now notice that the non-numeric value (22XYZ) became NaN. Since NaN is a floating-point value, the data type of the Price column is float64 rather than integer.
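If you would like to keep integers while still marking the unparseable value as missing, one option is the pandas nullable integer type 'Int64' (with a capital I). The following is a minimal sketch under the assumption that all the valid prices are whole numbers.

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}
df = pd.DataFrame(data)

# Coerce invalid strings to NaN, then cast to the nullable integer type,
# which stores the missing entry as <NA> instead of forcing floats
df['Price'] = pd.to_numeric(df['Price'], errors='coerce').astype('Int64')

print(df)
print(df.dtypes)
```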

You can take things further by replacing the ‘NaN’ values with ‘0’ values using df.replace:

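import pandas as pd
import numpy as np

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}

df = pd.DataFrame(data)
# Values that cannot be parsed as numbers become NaN
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')
# Replace NaN with 0 (no regex matching is needed for this)
df = df.replace(np.nan, 0)
# With no NaN values left, the column can be cast to integers
df['Price'] = df['Price'].astype(int)

print(df)
print(df.dtypes)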

When you run the code, you’ll get a 0 value instead of the NaN value, and the Price column will have an integer data type.
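As an alternative to df.replace, which scans the whole DataFrame, you could fill the missing values on the Price column only with fillna. This is a sketch of that variation on the same data, not a change to the method shown above.

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}
df = pd.DataFrame(data)

# Coerce invalid strings to NaN, fill those with 0, then cast to integers
df['Price'] = pd.to_numeric(df['Price'], errors='coerce').fillna(0).astype(int)

print(df)
print(df.dtypes)
```

Chaining the three calls keeps the change limited to the Price column and leaves the other columns untouched.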