You can convert floats to integers in a Pandas DataFrame using either of the following:
(1) astype(int):
df["DataFrame Column"] = df["DataFrame Column"].astype(int)
(2) apply(int):
df["DataFrame Column"] = df["DataFrame Column"].apply(int)
In this guide, you’ll see 4 scenarios of converting floats to integers for:
- Specific DataFrame column using astype(int) or apply(int)
- Entire DataFrame where the data type of all columns is float
- Mixed DataFrame where the data type of some columns is float
- DataFrame that contains NaN values
4 Scenarios of Converting Floats to Integers in Pandas DataFrame
(1) Convert floats to integers for a specific DataFrame column
To start with a simple example, let’s create a DataFrame with two columns, where:
- The first column (called 'numeric_values') will contain only floats
- The second column (called 'string_values') will contain only strings
The goal is to convert all the floats to integers under the first DataFrame column.
Here is the code to create the DataFrame:
import pandas as pd

data = {
    "numeric_values": [3.0, 5.0, 7.0, 15.995, 225.12],
    "string_values": ["AA", "BB", "CCC", "DD", "EEEE"],
}
df = pd.DataFrame(data)

print(df)
print(df.dtypes)

As you can see, the data type for the 'numeric_values' column is float:
numeric_values string_values
0 3.000 AA
1 5.000 BB
2 7.000 CCC
3 15.995 DD
4 225.120 EEEE
numeric_values float64
string_values object
dtype: object
You can then use astype(int) in order to convert the floats to integers:
df["DataFrame Column"] = df["DataFrame Column"].astype(int)
So the complete code to perform the conversion is as follows:
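import pandas as pd

data = {
    "numeric_values": [3.0, 5.0, 7.0, 15.995, 225.12],
    "string_values": ["AA", "BB", "CCC", "DD", "EEEE"],
}
df = pd.DataFrame(data)

# Convert the float column to integers (the decimal part is dropped)
df["numeric_values"] = df["numeric_values"].astype(int)

print(df)
print(df.dtypes)

You’ll now notice that the data type for the 'numeric_values' column is integer. Note that astype(int) truncates the decimal part rather than rounding it, so 15.995 becomes 15. The integer width also depends on your environment: the int32 shown here is typical of Windows with NumPy versions before 2.0, while Linux, macOS, and NumPy 2.0 or later normally give int64: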
numeric_values string_values
0 3 AA
1 5 BB
2 7 CCC
3 15 DD
4 225 EEEE
numeric_values int32
string_values object
dtype: object
Alternatively, you can use apply(int) to convert the floats to integers:
df["DataFrame Column"] = df["DataFrame Column"].apply(int)
For our example:
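import pandas as pd

data = {
    "numeric_values": [3.0, 5.0, 7.0, 15.995, 225.12],
    "string_values": ["AA", "BB", "CCC", "DD", "EEEE"],
}
df = pd.DataFrame(data)

# Call Python's built-in int on each value in the column
df["numeric_values"] = df["numeric_values"].apply(int)

print(df)
print(df.dtypes)

You’ll again get an integer column. Since apply(int) calls Python’s int on each value, it also truncates the decimal part, and it is typically slower than astype(int) on large columns: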
numeric_values string_values
0 3 AA
1 5 BB
2 7 CCC
3 15 DD
4 225 EEEE
numeric_values int64
string_values object
dtype: object
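Both astype(int) and apply(int) drop the decimal part, which may not be what you want for a value such as 15.995. If you need the nearest integer instead, one option is to call round() before converting. The sketch below is an illustration rather than part of the original example: the names 'truncated' and 'rounded' are made up here, and "int64" is requested explicitly so that the result does not depend on the platform default.

```python
import pandas as pd

data = {
    "numeric_values": [3.0, 5.0, 7.0, 15.995, 225.12],
    "string_values": ["AA", "BB", "CCC", "DD", "EEEE"],
}
df = pd.DataFrame(data)

# Converting directly truncates the decimal part: 15.995 -> 15
truncated = df["numeric_values"].astype("int64")

# Rounding first gives the nearest integer: 15.995 -> 16
rounded = df["numeric_values"].round().astype("int64")

print(truncated.tolist())  # [3, 5, 7, 15, 225]
print(rounded.tolist())    # [3, 5, 7, 16, 225]
```

Keep in mind that round() sends exact halves to the nearest even integer (for example, 2.5 becomes 2), so it may differ slightly from the rounding you learned in school.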
(2) Convert an entire DataFrame where the data type of all columns is float
What if you have a DataFrame where the data type of all the columns is float?
Rather than converting column by column, you can convert the whole DataFrame at once using:
df = df.astype(int)
For example, let’s create a new DataFrame with two columns that contain only floats:
import pandas as pd

data = {
    "numeric_values_1": [3.2, 5.9, 7.0, 15.995, 225.12],
    "numeric_values_2": [7.7, 23.0, 522.0, 4275.5, 22.3],
}
df = pd.DataFrame(data)

print(df)
print(df.dtypes)
You’ll now get this DataFrame with the two float columns:
numeric_values_1 numeric_values_2
0 3.200 7.7
1 5.900 23.0
2 7.000 522.0
3 15.995 4275.5
4 225.120 22.3
numeric_values_1 float64
numeric_values_2 float64
dtype: object
To convert the floats to integers throughout the entire DataFrame, you’ll need to add df = df.astype(int) to the code:
import pandas as pd

data = {
    "numeric_values_1": [3.2, 5.9, 7.0, 15.995, 225.12],
    "numeric_values_2": [7.7, 23.0, 522.0, 4275.5, 22.3],
}
df = pd.DataFrame(data)

# Convert every column in the DataFrame to integers
df = df.astype(int)

print(df)
print(df.dtypes)
As you can see, all the columns in the DataFrame are now converted to integers:
numeric_values_1 numeric_values_2
0 3 7
1 5 23
2 7 522
3 15 4275
4 225 22
numeric_values_1 int32
numeric_values_2 int32
dtype: object
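Note that the above approach only works if every column in the DataFrame can be converted to integers, as is the case when all the columns are floats. A column of non-numeric strings such as "AA" would cause astype(int) to raise a ValueError.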
What if you have a mixed DataFrame where the data type of some (but not all) columns is float?
The section below deals with this scenario.
(3) Convert a mixed DataFrame where the data type of some columns is float
Let’s now create a new DataFrame with 3 columns, where the first 2 columns will contain float values, while the third column will include only strings:
import pandas as pd

data = {
    "numeric_values_1": [3.2, 5.9, 7.0, 15.995, 225.12],
    "numeric_values_2": [7.7, 23.0, 522.0, 4275.5, 22.3],
    "string_values": ["AA", "BB", "CCC", "DD", "EEEE"],
}
df = pd.DataFrame(data)

print(df)
print(df.dtypes)
Here is the DataFrame with the 3 columns that you’ll get:
numeric_values_1 numeric_values_2 string_values
0 3.200 7.7 AA
1 5.900 23.0 BB
2 7.000 522.0 CCC
3 15.995 4275.5 DD
4 225.120 22.3 EEEE
numeric_values_1 float64
numeric_values_2 float64
string_values object
dtype: object
You can then specify multiple columns (in this example, the first two columns) that you’d like to convert to integers:
import pandas as pd

data = {
    "numeric_values_1": [3.2, 5.9, 7.0, 15.995, 225.12],
    "numeric_values_2": [7.7, 23.0, 522.0, 4275.5, 22.3],
    "string_values": ["AA", "BB", "CCC", "DD", "EEEE"],
}
df = pd.DataFrame(data)

# Convert only the two float columns; the string column is left untouched
df[["numeric_values_1", "numeric_values_2"]] = df[
    ["numeric_values_1", "numeric_values_2"]
].astype(int)

print(df)
print(df.dtypes)
As you may observe, the first 2 columns are now converted to integers:
numeric_values_1 numeric_values_2 string_values
0 3 7 AA
1 5 23 BB
2 7 522 CCC
3 15 4275 DD
4 225 22 EEEE
numeric_values_1 int32
numeric_values_2 int32
string_values object
dtype: object
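When a DataFrame has many float columns, listing each one by name can get tedious. As a possible alternative, the sketch below picks up the float columns automatically with select_dtypes and passes a column-to-dtype mapping to astype. The variable name 'float_cols' is illustrative only, and "int64" is an assumed target type that you can swap for int.

```python
import pandas as pd

data = {
    "numeric_values_1": [3.2, 5.9, 7.0, 15.995, 225.12],
    "numeric_values_2": [7.7, 23.0, 522.0, 4275.5, 22.3],
    "string_values": ["AA", "BB", "CCC", "DD", "EEEE"],
}
df = pd.DataFrame(data)

# Find the float columns instead of naming them one by one
float_cols = df.select_dtypes(include="float").columns

# astype accepts a {column: dtype} mapping, so only these columns change
df = df.astype({col: "int64" for col in float_cols})

print(df)
print(df.dtypes)
```

The string column is not part of the mapping, so it keeps its original data type.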
(4) Convert a DataFrame that contains NaN values
In the final scenario, you’ll see how to convert a column that includes a mixture of floats and NaN values.
The goal is to convert the float values to integers, as well as replace the NaN values with zeros.
Here is the code to create the DataFrame:
import pandas as pd
import numpy as np

data = {"numeric_values": [3.0, 5.0, np.nan, 15.0, np.nan]}
df = pd.DataFrame(data)

print(df)
print(df.dtypes)
You’ll get this DataFrame that contains both floats and NaNs:
numeric_values
0 3.0
1 5.0
2 NaN
3 15.0
4 NaN
numeric_values float64
dtype: object
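Because NaN cannot be stored in a regular integer column, calling astype(int) directly on this column raises an error. You can instead replace the NaN values with zeros first using fillna(0), and then convert to integers using astype(int):
import pandas as pd
import numpy as np

data = {"numeric_values": [3.0, 5.0, np.nan, 15.0, np.nan]}
df = pd.DataFrame(data)

# Replace NaN with 0 first, then convert the column to integers
df["numeric_values"] = df["numeric_values"].fillna(0).astype(int)

print(df)
print(df.dtypes)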
Here is the newly converted DataFrame:
numeric_values
0 3
1 5
2 0
3 15
4 0
numeric_values int32
dtype: object
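Replacing NaN with zeros changes the data, which may not suit every case since a real 0 and a missing value become indistinguishable. If you would rather keep the missing values, one option to consider is the nullable "Int64" data type (with a capital I) that Pandas provides. The sketch below is an illustration under the assumption that all non-missing values are whole numbers, as they are in this example; the 'failed' flag is a made-up name used only to show that the plain conversion raises an error.

```python
import numpy as np
import pandas as pd

data = {"numeric_values": [3.0, 5.0, np.nan, 15.0, np.nan]}
df = pd.DataFrame(data)

# A regular integer dtype cannot hold NaN, so this conversion fails
try:
    df["numeric_values"].astype(int)
    failed = False
except ValueError as exc:
    failed = True
    print(f"astype(int) failed: {exc}")

# The nullable "Int64" dtype keeps the missing values as <NA>
df["numeric_values"] = df["numeric_values"].astype("Int64")

print(df)
print(df.dtypes)
```

If the column contains values with a decimal part (such as 15.995), astype("Int64") raises an error instead of truncating, so you would need to round the values first.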
Additional Resources
You can check the Pandas Documentation to read more about astype.