How to Convert String to Integer in Pandas DataFrame

In this guide, I’ll show you two methods to convert strings into integers in a pandas DataFrame:

(1) The astype(int) method:

df['DataFrame Column'] = df['DataFrame Column'].astype(int)

(2) The to_numeric method:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'])

Let’s now review a few examples with the steps to convert a string into an integer.

Steps to Convert String to Integer in Pandas DataFrame

Step 1: Create a DataFrame

To start, let’s say that you want to create a DataFrame for the following data:

Product    Price
AAA        210
BBB        250

You can capture the values under the Price column as strings by surrounding those values with quotation marks.

This is what the DataFrame would look like in Python:

import pandas as pd

# The Price values are quoted, so they are stored as strings
data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}

df = pd.DataFrame(data)
print(df)
print(df.dtypes)

When you run the code, you’ll see that the values under the Price column are indeed strings (pandas reports their data type as object).
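
The object data type can hold any kind of Python object, so it may be worth confirming that the values really are strings before converting them. Here is a minimal sketch of one way to check, using the same sample data (the all_strings variable name is just illustrative):

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}
df = pd.DataFrame(data)

# Look at the Python type of each individual value in the column
print(df['Price'].map(type))

# True only if every value under Price is a Python str
all_strings = df['Price'].map(lambda value: isinstance(value, str)).all()
print(all_strings)
```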

Step 2: Convert the Strings to Integers in Pandas DataFrame

Now how do you convert those string values into integers?

You may use the first method of astype(int) to perform the conversion:

df['DataFrame Column'] = df['DataFrame Column'].astype(int)

Since in our example the ‘DataFrame Column’ is the Price column (which contains the string values), you’ll then need to use the following syntax:

df['Price'] = df['Price'].astype(int)

So this is the complete Python code that you may apply to convert the strings into integers in the pandas DataFrame:

import pandas as pd

data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}

df = pd.DataFrame(data)

# Convert the Price column from strings to integers
df['Price'] = df['Price'].astype(int)

print(df)
print(df.dtypes)

As you can see when you run the code, the values under the Price column are now integers (the data type is typically reported as int64).
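
If several columns hold numbers stored as strings, astype also accepts a dictionary that maps column names to target types. The sketch below assumes a hypothetical Quantity column that is not part of the original example:

```python
import pandas as pd

# 'Quantity' is a hypothetical extra column added for illustration
data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250'],
        'Quantity': ['3', '7']}
df = pd.DataFrame(data)

# Map each column name to its target type; other columns are left unchanged
df = df.astype({'Price': int, 'Quantity': int})

print(df.dtypes)
```

Columns that are not listed in the dictionary (Product here) are left as they were.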

Step 3 (optional): Convert the Strings to Integers using to_numeric

For this optional step, you may use the second method of to_numeric to convert the strings to integers:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'])

And this is the complete Python code to perform the conversion:

import pandas as pd

data = {'Product': ['AAA', 'BBB'],
        'Price': ['210', '250']}

df = pd.DataFrame(data)

# Let pandas infer the numeric type from the string values
df['Price'] = pd.to_numeric(df['Price'])

print(df)
print(df.dtypes)

You’ll now see that the values under the Price column are indeed integers.
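
Keep in mind that to_numeric picks the result type from the data, so you get integers only when every string represents a whole number. Here is a small sketch with a made-up decimal value to illustrate the difference:

```python
import pandas as pd

# Hypothetical prices where one value has a decimal part
prices = pd.Series(['210', '250.5'])

converted = pd.to_numeric(prices)

print(converted)
print(converted.dtype)  # a float dtype, because 250.5 is not a whole number
```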

What if your column contains a combination of numeric and non-numeric values?

For example, in the DataFrame below, there are both numeric and non-numeric values under the Price column:

Product    Price
AAA        210
BBB        250
CCC        22XYZ
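
With data like this, the astype(int) method stops with an error, because '22XYZ' cannot be parsed as an integer. The sketch below catches that error to show what happens (the conversion_failed flag is only there for illustration):

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}
df = pd.DataFrame(data)

try:
    df['Price'] = df['Price'].astype(int)
    conversion_failed = False
except ValueError as error:
    # '22XYZ' is not a valid integer, so the whole conversion is rejected
    conversion_failed = True
    print('Conversion failed:', error)
```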

In that case, you can still use to_numeric to convert the strings:

df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column'], errors='coerce')

By setting errors='coerce', you’ll transform any value that cannot be parsed as a number into NaN.

Here is the Python code:

import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}

df = pd.DataFrame(data)

# Values that cannot be parsed (such as '22XYZ') become NaN
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')

print(df)
print(df.dtypes)

You’ll now see a NaN value in place of '22XYZ'. The data type of the Price column is float (float64), since NaN cannot be stored in a regular integer column.
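
Before changing those NaN values, you may want to see which rows failed to convert. One possible approach, sketched below, compares the coerced result with the original column (the converted and bad_rows names are illustrative):

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}
df = pd.DataFrame(data)

# Keep the result in a separate variable so the original strings stay visible
converted = pd.to_numeric(df['Price'], errors='coerce')

# Rows that had a value originally but became NaN after the conversion
bad_rows = df[converted.isna() & df['Price'].notna()]
print(bad_rows)
```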

You can take things further by replacing the NaN values with 0 using df.replace, and then converting the column to integers:

import pandas as pd
import numpy as np

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}

df = pd.DataFrame(data)
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')

# Replace NaN with 0 (no regex is needed to match NaN)
df = df.replace(np.nan, 0)

# With no NaN values left, the column can be stored as integers
df['Price'] = df['Price'].astype(int)

print(df)
print(df.dtypes)

When you run the code, you’ll get a 0 value instead of the NaN value, and the Price column will have an integer data type.
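
Replacing NaN with 0 changes the data, since a missing price then looks like a real price of 0. If that is a concern, one alternative (not part of the steps above) is the pandas nullable integer type 'Int64' (with a capital I), which stores integers while keeping missing values as <NA>. A minimal sketch under that assumption:

```python
import pandas as pd

data = {'Product': ['AAA', 'BBB', 'CCC'],
        'Price': ['210', '250', '22XYZ']}
df = pd.DataFrame(data)

# Coerce bad values to NaN, then switch to the nullable integer type
df['Price'] = pd.to_numeric(df['Price'], errors='coerce').astype('Int64')

print(df)
print(df.dtypes)
```

The trade-off is that later calculations need to handle the missing values, whereas a 0 would silently take part in sums and averages.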