Generate Random Integers in Pandas DataFrame

In this short guide, you’ll see how to generate random integers in a Pandas DataFrame under:

  • Single DataFrame column
  • Multiple DataFrame columns

You’ll also see how to convert those integers to different data types, such as floats or strings.

Generate Random Integers under a Single DataFrame Column

Here is a template that you may use to generate random integers under a single DataFrame column:

import numpy as np
import pandas as pd

data = np.random.randint(lowest integer, highest integer, size=number of random integers)

df = pd.DataFrame(data, columns=["column name"])

print(df)

For example, let’s say that you want to generate random integers given the following information:

  • The lowest integer is 5 (inclusive)
  • The highest integer is 30 (exclusive)
  • The size is 10

You may then apply this code in Python:

import numpy as np
import pandas as pd

data = np.random.randint(5, 30, size=10)

df = pd.DataFrame(data, columns=["random_numbers"])

print(df)

When you run the code, you’ll get 10 random integers (as specified by the size of 10). Since the values are random, your numbers will differ from the ones shown here:

   random_numbers
0              15
1               5
2              24
3              19
4              23
5              24
6              29
7              27
8              25
9              19

Note that the lowest integer (5 in the code above) can appear among the generated values, while the highest integer (30 in the code above) is always excluded.
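To check the bounds empirically, the sketch below draws a large sample and inspects its minimum and maximum. The sample size of 100,000 and the variable names are illustrative assumptions; passing 31 as the upper limit is one simple way to make 30 itself a possible value.

```python
import numpy as np
import pandas as pd

# Draw a large sample so both ends of the range are very likely to appear
data = np.random.randint(5, 30, size=100_000)
df = pd.DataFrame(data, columns=["random_numbers"])

lowest = df["random_numbers"].min()
highest = df["random_numbers"].max()
print(lowest, highest)  # both values fall between 5 and 29

# To allow 30 itself, raise the upper limit by one
data_inclusive = np.random.randint(5, 31, size=100_000)
highest_inclusive = data_inclusive.max()
print(highest_inclusive)  # never above 30
```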

Generate Random Integers under Multiple DataFrame Columns

Here is a template to generate random integers under multiple DataFrame columns:

import numpy as np
import pandas as pd

data = np.random.randint(lowest integer, highest integer, size=(number of random integers per column, number of columns))

df = pd.DataFrame(
    data, columns=["column name 1", "column name 2", "column name 3", ...]
)

print(df)

For instance, you can apply the code below in order to create 3 columns with random integers:

import numpy as np
import pandas as pd

data = np.random.randint(5, 30, size=(10, 3))

df = pd.DataFrame(
    data, columns=["random_numbers_1", "random_numbers_2", "random_numbers_3"]
)

print(df)

And here is the result:

   random_numbers_1  random_numbers_2  random_numbers_3
0                15                 5                12
1                27                16                 7
2                10                19                17
3                19                13                11
4                 5                29                 8
5                10                26                14
6                24                11                10
7                20                 5                10
8                18                28                25
9                13                22                27
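If you need the same table on every run, one option is to seed the random number generator. The sketch below swaps in NumPy's newer Generator interface (np.random.default_rng with its integers method) in place of np.random.randint; the seed value 42 and the name df_again are arbitrary choices made for illustration.

```python
import numpy as np
import pandas as pd

# The seed 42 is arbitrary; any fixed integer gives repeatable draws
rng = np.random.default_rng(42)

# Generator.integers follows the same convention: low inclusive, high exclusive
data = rng.integers(5, 30, size=(10, 3))

df = pd.DataFrame(
    data, columns=["random_numbers_1", "random_numbers_2", "random_numbers_3"]
)
print(df)

# Re-creating the generator with the same seed reproduces the same table
df_again = pd.DataFrame(
    np.random.default_rng(42).integers(5, 30, size=(10, 3)), columns=df.columns
)
print(df.equals(df_again))
```

A fixed seed is handy for tests and tutorials; leave it out when you want fresh values on each run.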

Check the Data Type

You can check the data type of each column in a Pandas DataFrame by adding print(df.dtypes) at the bottom of the code:

import numpy as np
import pandas as pd

data = np.random.randint(5, 30, size=(10, 3))

df = pd.DataFrame(
    data, columns=["random_numbers_1", "random_numbers_2", "random_numbers_3"]
)

print(df)
print(df.dtypes)

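As you may observe, the data type for each of the 3 columns is integer. The run below shows int32; depending on your platform and NumPy version, you may see int64 instead: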

   random_numbers_1  random_numbers_2  random_numbers_3
0                23                10                21
1                27                18                 7
2                11                27                14
3                17                29                21
4                27                15                16
5                10                20                23
6                14                16                20
7                21                25                10
8                 9                27                 6
9                15                26                10
random_numbers_1    int32
random_numbers_2    int32
random_numbers_3    int32
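If you would rather not depend on the default integer width, np.random.randint accepts a dtype argument. The sketch below requests 64-bit integers explicitly; choosing np.int64 here is an assumption made for illustration, and other integer types can be requested the same way.

```python
import numpy as np
import pandas as pd

# dtype=np.int64 asks NumPy for 64-bit integers regardless of the default width
data = np.random.randint(5, 30, size=(10, 3), dtype=np.int64)

df = pd.DataFrame(
    data, columns=["random_numbers_1", "random_numbers_2", "random_numbers_3"]
)

print(df.dtypes)
```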

Convert the Data Type to Float

You can convert the integers to floats by applying astype(float) as follows:

import numpy as np
import pandas as pd

data = np.random.randint(5, 30, size=(10, 3))

df = pd.DataFrame(
    data, columns=["random_numbers_1", "random_numbers_2", "random_numbers_3"]
).astype(float)

print(df)
print(df.dtypes)

You’ll see that the data type for each of the 3 columns is now float:

   random_numbers_1  random_numbers_2  random_numbers_3
0              14.0              19.0              26.0
1               8.0              14.0               8.0
2              19.0              22.0              28.0
3              24.0              10.0              29.0
4              11.0              26.0              12.0
5              13.0              12.0               8.0
6              26.0               7.0              11.0
7              22.0              24.0              23.0
8              28.0               8.0              18.0
9              13.0              27.0              26.0
random_numbers_1    float64
random_numbers_2    float64
random_numbers_3    float64
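The conversion does not have to cover the whole DataFrame. As a minimal sketch, astype also accepts a dictionary that maps column names to data types, so only the listed columns change; converting just random_numbers_1 is an arbitrary choice for this example.

```python
import numpy as np
import pandas as pd

data = np.random.randint(5, 30, size=(10, 3))

df = pd.DataFrame(
    data, columns=["random_numbers_1", "random_numbers_2", "random_numbers_3"]
)

# Convert only the first column to float; the other two remain integers
df = df.astype({"random_numbers_1": float})

print(df.dtypes)
```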

Convert the Data Type to String

Alternatively, you may convert the integers to strings using astype(str):

import numpy as np
import pandas as pd

data = np.random.randint(5, 30, size=(10, 3))

df = pd.DataFrame(
    data, columns=["random_numbers_1", "random_numbers_2", "random_numbers_3"]
).astype(str)

print(df)
print(df.dtypes)

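You’ll now get ‘object’, the data type that Pandas has traditionally used to store strings (newer Pandas versions may report a dedicated string data type instead):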

  random_numbers_1 random_numbers_2 random_numbers_3
0               22                9               26
1                6               28               19
2               21               10               15
3               16               11               21
4               13               16               21
5                9               12               23
6               10                8               27
7                9               14                7
8               29               13                8
9               20                5               25
random_numbers_1    object
random_numbers_2    object
random_numbers_3    object
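Once the values are strings, they behave as text and not as numbers. The short sketch below illustrates the difference: adding two string columns joins the digits, while converting back with astype(int) restores arithmetic. The two-column layout and the names joined and total are assumptions made for this example.

```python
import numpy as np
import pandas as pd

data = np.random.randint(5, 30, size=(10, 2))

df = pd.DataFrame(
    data, columns=["random_numbers_1", "random_numbers_2"]
).astype(str)

# With strings, + joins the text instead of adding the numbers
joined = df["random_numbers_1"] + df["random_numbers_2"]

# Converting back to integers makes + perform addition again
total = df["random_numbers_1"].astype(int) + df["random_numbers_2"].astype(int)

print(joined.head())
print(total.head())
```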

Check the NumPy manual for further information about numpy.random.randint.