You can convert floats to integers in a Pandas DataFrame using either of the following:
(1) astype(int):
df['DataFrame Column'] = df['DataFrame Column'].astype(int)
(2) apply(int):
df['DataFrame Column'] = df['DataFrame Column'].apply(int)
In this guide, you’ll see 4 scenarios of converting floats to integers for:
- Specific DataFrame column using astype(int) or apply(int)
- Entire DataFrame where the data type of all columns is float
- Mixed DataFrame where the data type of some columns is float
- DataFrame that contains NaN values
4 Scenarios of Converting Floats to Integers in Pandas DataFrame
(1) Convert floats to integers for a specific DataFrame column
To start with a simple example, let’s create a DataFrame with two columns, where:
- The first column (called ‘numeric_values’) will contain only floats
- The second column (called ‘string_values’) will contain only strings
The goal is to convert all the floats to integers under the first DataFrame column.
Here is the code to create the DataFrame:
import pandas as pd

data = {'numeric_values': [3.0, 5.0, 7.0, 15.995, 225.12],
        'string_values': ['AA', 'BB', 'CCC', 'DD', 'EEEE']
        }

df = pd.DataFrame(data, columns=['numeric_values', 'string_values'])

print(df)
print(df.dtypes)
When you run the code, the printed data types show that the ‘numeric_values’ column is float (float64).
You can then use astype(int) in order to convert the floats to integers:
df['DataFrame Column'] = df['DataFrame Column'].astype(int)
So the complete code to perform the conversion is as follows:
import pandas as pd

data = {'numeric_values': [3.0, 5.0, 7.0, 15.995, 225.12],
        'string_values': ['AA', 'BB', 'CCC', 'DD', 'EEEE']
        }

df = pd.DataFrame(data, columns=['numeric_values', 'string_values'])

# Convert the float column to integers
df['numeric_values'] = df['numeric_values'].astype(int)

print(df)
print(df.dtypes)
You’ll now notice that the data type of the ‘numeric_values’ column is integer. The decimal part is dropped rather than rounded, so 15.995 becomes 15 and 225.12 becomes 225.
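Because astype(int) drops the fractional part instead of rounding, you may prefer to round first when the nearest whole number is what you need. The snippet below is a minimal sketch of that idea using the same values as the example; the variable names truncated and rounded are only illustrative, and round() here follows NumPy's round-half-to-even behavior for values that sit exactly on .5.

```python
import pandas as pd

df = pd.DataFrame({'numeric_values': [3.0, 5.0, 7.0, 15.995, 225.12]})

# astype(int) alone discards everything after the decimal point
truncated = df['numeric_values'].astype(int)

# rounding first moves each value to the nearest whole number before the cast
rounded = df['numeric_values'].round().astype(int)

print(truncated.tolist())  # [3, 5, 7, 15, 225]
print(rounded.tolist())    # [3, 5, 7, 16, 225]
```

Only 15.995 differs between the two results, since it is the only value whose fractional part is at least one half.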
Alternatively, you can use apply(int) to convert the floats to integers:
df['DataFrame Column'] = df['DataFrame Column'].apply(int)
For our example:
import pandas as pd

data = {'numeric_values': [3.0, 5.0, 7.0, 15.995, 225.12],
        'string_values': ['AA', 'BB', 'CCC', 'DD', 'EEEE']
        }

df = pd.DataFrame(data, columns=['numeric_values', 'string_values'])

# Apply Python's int() to each value in the column
df['numeric_values'] = df['numeric_values'].apply(int)

print(df)
print(df.dtypes)
As before, the data type of the ‘numeric_values’ column is now integer.
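To check that the two approaches give the same values, here is a small side-by-side sketch. The extra value -2.7 is not part of the original example; it is added only to show that both methods cut toward zero rather than rounding down. In general, astype(int) is a single vectorized cast, while apply(int) calls Python's int() once per element, so astype(int) is usually the more natural choice for large columns.

```python
import pandas as pd

# Values from the example, plus a hypothetical negative value (-2.7)
s = pd.Series([3.0, 5.0, 7.0, 15.995, 225.12, -2.7])

via_astype = s.astype(int)  # one vectorized cast over the whole column
via_apply = s.apply(int)    # Python's int() called on each element

print(via_astype.tolist())
print(via_apply.tolist())
print(via_astype.tolist() == via_apply.tolist())
```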
(2) Convert an entire DataFrame where the data type of all columns is float
What if you have a DataFrame where the data type of all the columns is float?
Rather than specifying the conversion to integers column-by-column, you can do it instead on the DataFrame level using:
df = df.astype(int)
For example, let’s create a new DataFrame with two columns that contain only floats:
import pandas as pd

data = {'numeric_values_1': [3.2, 5.9, 7.0, 15.995, 225.12],
        'numeric_values_2': [7.7, 23.0, 522.0, 4275.5, 22.3]
        }

df = pd.DataFrame(data, columns=['numeric_values_1', 'numeric_values_2'])

print(df)
print(df.dtypes)
Running this code prints the DataFrame and shows that both columns have the float (float64) data type.
To convert the floats to integers throughout the entire DataFrame, you’ll need to add df = df.astype(int) to the code:
import pandas as pd

data = {'numeric_values_1': [3.2, 5.9, 7.0, 15.995, 225.12],
        'numeric_values_2': [7.7, 23.0, 522.0, 4275.5, 22.3]
        }

df = pd.DataFrame(data, columns=['numeric_values_1', 'numeric_values_2'])

# Convert every column in the DataFrame to integers
df = df.astype(int)

print(df)
print(df.dtypes)
All the columns in the DataFrame are now converted to integers, with the decimal part of each value dropped.
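Note that the above approach works only if every column in the DataFrame can be converted to integers, as is the case when all the columns are floats. A column of non-numeric strings would cause astype(int) to raise an error.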
What if you have a mixed DataFrame where the data type of some (but not all) columns is float?
The section below deals with this scenario.
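To see why the DataFrame-level call is not enough for mixed data, the following minimal sketch applies astype(int) to a small DataFrame that has one float column and one string column. The shortened two-row data and the failed flag are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'numeric_values_1': [3.2, 5.9],
                   'string_values': ['AA', 'BB']})

try:
    # The string column cannot be interpreted as integers
    df.astype(int)
    failed = False
except ValueError as exc:
    failed = True
    print(f"Conversion failed: {exc}")
```

The call fails as a whole, so the float column is not converted either.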
(3) Convert a mixed DataFrame where the data type of some columns is float
Let’s now create a new DataFrame with 3 columns, where the first 2 columns will contain float values, while the third column will include only strings:
import pandas as pd

data = {'numeric_values_1': [3.2, 5.9, 7.0, 15.995, 225.12],
        'numeric_values_2': [7.7, 23.0, 522.0, 4275.5, 22.3],
        'string_values': ['AA', 'BB', 'CCC', 'DD', 'EEEE']
        }

df = pd.DataFrame(data, columns=['numeric_values_1', 'numeric_values_2', 'string_values'])

print(df)
print(df.dtypes)
The resulting DataFrame has 3 columns: the first 2 are float (float64), and the third holds the strings.
You can then specify multiple columns (in this example, the first two columns) that you’d like to convert to integers:
import pandas as pd

data = {'numeric_values_1': [3.2, 5.9, 7.0, 15.995, 225.12],
        'numeric_values_2': [7.7, 23.0, 522.0, 4275.5, 22.3],
        'string_values': ['AA', 'BB', 'CCC', 'DD', 'EEEE']
        }

df = pd.DataFrame(data, columns=['numeric_values_1', 'numeric_values_2', 'string_values'])

# Convert only the two float columns to integers
df[['numeric_values_1', 'numeric_values_2']] = df[['numeric_values_1', 'numeric_values_2']].astype(int)

print(df)
print(df.dtypes)
The first 2 columns are now converted to integers, while the ‘string_values’ column is left unchanged.
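If a DataFrame has many float columns, listing each one by name can become tedious. One possible alternative, sketched below, is to let select_dtypes find the float64 columns and then pass a column-to-type mapping to astype. This is not the method used above, just a variation on it; the name float_cols is illustrative.

```python
import pandas as pd

data = {'numeric_values_1': [3.2, 5.9, 7.0, 15.995, 225.12],
        'numeric_values_2': [7.7, 23.0, 522.0, 4275.5, 22.3],
        'string_values': ['AA', 'BB', 'CCC', 'DD', 'EEEE']}
df = pd.DataFrame(data)

# Collect the names of all float64 columns
float_cols = df.select_dtypes(include='float64').columns

# Convert only those columns; every other column keeps its type
df = df.astype({col: int for col in float_cols})

print(df)
print(df.dtypes)
```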
(4) Convert a DataFrame that contains NaN values
In the final scenario, you’ll see how to convert a column that includes a mixture of floats and NaN values.
The goal is to convert the float values to integers, as well as replace the NaN values with zeros.
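The NaN values have to be handled first because a regular integer column has no way to represent a missing value. As a quick illustration, the sketch below tries the conversion directly on a Series that contains NaN; the raised flag is only there to record the outcome.

```python
import numpy as np
import pandas as pd

s = pd.Series([3.0, 5.0, np.nan, 15.0, np.nan])

try:
    # NaN has no integer equivalent, so the cast is rejected
    s.astype(int)
    raised = False
except ValueError as exc:
    raised = True
    print(f"Conversion failed: {exc}")
```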
Here is the code to create the DataFrame:
import pandas as pd
import numpy as np

data = {'numeric_values': [3.0, 5.0, np.nan, 15.0, np.nan]
        }

df = pd.DataFrame(data, columns=['numeric_values'])

print(df)
print(df.dtypes)
You’ll get a DataFrame whose ‘numeric_values’ column is float (float64) and contains both floats and NaN values.
You can then replace the NaN values with zeros by adding fillna(0), and then perform the conversion to integers using astype(int):
import pandas as pd
import numpy as np

data = {'numeric_values': [3.0, 5.0, np.nan, 15.0, np.nan]
        }

df = pd.DataFrame(data, columns=['numeric_values'])

# Replace NaN with 0 first, then convert to integers
df['numeric_values'] = df['numeric_values'].fillna(0).astype(int)

print(df)
print(df.dtypes)
In the newly converted DataFrame, the ‘numeric_values’ column holds the integers 3, 5, 0, 15 and 0.
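Replacing NaN with zeros changes the data, which may not always be what you want. As an alternative that is not covered in the scenarios above, recent versions of Pandas offer a nullable integer type, 'Int64' (with a capital I), which can hold integers and missing values together. The sketch below assumes the non-missing floats are already whole numbers, as they are in this example; values with a fractional part would need to be rounded or truncated first.

```python
import numpy as np
import pandas as pd

s = pd.Series([3.0, 5.0, np.nan, 15.0, np.nan])

# 'Int64' is the nullable integer type; missing entries are kept as <NA>
nullable = s.astype('Int64')

print(nullable)
print(nullable.dtype)
```

The missing entries stay missing instead of becoming 0, so they can still be detected later with isna().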
Additional Resources
You can check the Pandas Documentation to read more about astype.