How to Convert Pandas DataFrame to NumPy Array

Here are two approaches to convert a Pandas DataFrame to a NumPy array:

(1) First approach:

df.to_numpy()

(2) Second approach:

df.values

Note that df.to_numpy() is the recommended approach: the Pandas documentation advises using it instead of df.values.
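One practical difference is that df.to_numpy() accepts optional arguments such as dtype and copy, while df.values is a plain attribute. The following is a minimal sketch, assuming pandas 0.24 or later; the small two-column DataFrame and the variable names are made up for illustration:

```python
import numpy as np
import pandas as pd

# Illustrative data only (a shortened version of the example used in the steps)
df = pd.DataFrame({'Age': [25, 47, 38], 'Birth Year': [1995, 1973, 1982]})

default_array = df.to_numpy()                # dtype is inferred from the columns
float_array = df.to_numpy(dtype='float64')   # explicitly request a float dtype
copied_array = df.to_numpy(copy=True)        # force an independent copy

# Changing the copy leaves the original DataFrame untouched
copied_array[0, 0] = -1

print(default_array.dtype)
print(float_array.dtype)
print(df.loc[0, 'Age'])
```

Passing copy=True is useful when you plan to modify the array and want to be sure the DataFrame is not affected.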

Steps to Convert Pandas DataFrame to NumPy Array

Step 1: Create a DataFrame

To start with a simple example, let’s create a DataFrame with 3 columns. The 3 columns will contain only numeric data (i.e., integers):

import pandas as pd

data = {'Age': [25,47,38],
        'Birth Year': [1995,1973,1982],
        'Graduation Year': [2016,2000,2005]
        }

df = pd.DataFrame(data, columns = ['Age','Birth Year','Graduation Year'])

print(df)
print(type(df))

Run the code, and you’ll get the following Pandas DataFrame:

   Age  Birth Year  Graduation Year
0   25        1995             2016
1   47        1973             2000
2   38        1982             2005
<class 'pandas.core.frame.DataFrame'>

Step 2: Convert the DataFrame to NumPy Array

You can use the first approach of df.to_numpy() to convert the DataFrame to a NumPy array:

df.to_numpy()

Here is the complete code to perform the conversion:

import pandas as pd

data = {'Age': [25,47,38],
        'Birth Year': [1995,1973,1982],
        'Graduation Year': [2016,2000,2005]
        }

df = pd.DataFrame(data, columns = ['Age','Birth Year','Graduation Year'])

my_array = df.to_numpy()

print(my_array)
print(type(my_array))

As you can see, the DataFrame is now converted to a NumPy array:

[[  25 1995 2016]
 [  47 1973 2000]
 [  38 1982 2005]]
<class 'numpy.ndarray'>

Alternatively, you could use the second approach of df.values to convert the DataFrame to a NumPy array:

import pandas as pd

data = {'Age': [25,47,38],
        'Birth Year': [1995,1973,1982],
        'Graduation Year': [2016,2000,2005]
        }

df = pd.DataFrame(data, columns = ['Age','Birth Year','Graduation Year'])

my_array = df.values

print(my_array)
print(type(my_array))

You’ll get the same NumPy array:

[[  25 1995 2016]
 [  47 1973 2000]
 [  38 1982 2005]]
<class 'numpy.ndarray'>
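If you would like to confirm that both approaches give identical results for this DataFrame, one option is to compare the two arrays with np.array_equal. This is only a quick sanity check under the assumption that the data is the same as in the example above; the variable name same is illustrative:

```python
import numpy as np
import pandas as pd

data = {'Age': [25, 47, 38],
        'Birth Year': [1995, 1973, 1982],
        'Graduation Year': [2016, 2000, 2005]
        }

df = pd.DataFrame(data)

# True when both arrays have the same shape and the same elements
same = np.array_equal(df.to_numpy(), df.values)

print(same)
```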

Step 3 (optional step): Check the Data Type

Once you have converted the DataFrame to an array, you can check its dtype by adding print(my_array.dtype) at the bottom of the code:

import pandas as pd

data = {'Age': [25,47,38],
        'Birth Year': [1995,1973,1982],
        'Graduation Year': [2016,2000,2005]
        }

df = pd.DataFrame(data, columns = ['Age','Birth Year','Graduation Year'])

my_array = df.to_numpy()

print(my_array)
print(type(my_array))
print(my_array.dtype)

For the above example, the dtype is integer (int64):

int64
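Keep in mind that a DataFrame can hold a different dtype in each column, while a NumPy array has a single dtype for all of its elements. As a rough sketch of what that means, the example below uses a hypothetical 'Height' column (not part of the original data) to show that mixing integers and floats yields a float array:

```python
import pandas as pd

# 'Height' is a made-up float column added for illustration
df = pd.DataFrame({'Age': [25, 47, 38],
                   'Height': [1.75, 1.62, 1.80]})

print(df.dtypes)        # one dtype per column

my_array = df.to_numpy()

print(my_array.dtype)   # a single dtype shared by the whole array
```

Here the integer column is upcast so that every value fits the common dtype, which is float64.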

Convert a DataFrame with Mixed Data Types

What if you have a DataFrame with mixed data types (e.g., string/object and integer)?

For example, let’s create another DataFrame with a mixture of strings and numeric data:

import pandas as pd

data = {'Name': ['Jon','Maria','Bill'],
        'Age': [25,47,38],
        'Birth Year': [1995,1973,1982],
        'Graduation Year': [2016,2000,2005]
        }

df = pd.DataFrame(data, columns = ['Name','Age','Birth Year','Graduation Year'])

print(df)
print(type(df))

This is what the DataFrame looks like:

    Name  Age  Birth Year  Graduation Year
0    Jon   25        1995             2016
1  Maria   47        1973             2000
2   Bill   38        1982             2005
<class 'pandas.core.frame.DataFrame'>

Let’s now convert the above DataFrame to a NumPy array, and then check the dtype:

import pandas as pd

data = {'Name': ['Jon','Maria','Bill'],
        'Age': [25,47,38],
        'Birth Year': [1995,1973,1982],
        'Graduation Year': [2016,2000,2005]
        }

df = pd.DataFrame(data, columns = ['Name','Age','Birth Year','Graduation Year'])

my_array = df.to_numpy()

print(my_array)
print(type(my_array))
print(my_array.dtype)

As you can see, the dtype in this case is object:

[['Jon' 25 1995 2016]
 ['Maria' 47 1973 2000]
 ['Bill' 38 1982 2005]]
<class 'numpy.ndarray'>
object
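If you need a numeric array rather than an object array, one possible approach (a sketch, not the only way) is to keep only the numeric columns before converting. The example below assumes the same mixed DataFrame and uses select_dtypes to drop the 'Name' column; the variable name numeric_array is illustrative:

```python
import pandas as pd

data = {'Name': ['Jon', 'Maria', 'Bill'],
        'Age': [25, 47, 38],
        'Birth Year': [1995, 1973, 1982],
        'Graduation Year': [2016, 2000, 2005]
        }

df = pd.DataFrame(data)

# Keep only the numeric columns, then convert them to a NumPy array
numeric_array = df.select_dtypes(include='number').to_numpy()

print(numeric_array)
print(numeric_array.dtype)
```

Since only integer columns remain, the resulting array has an integer dtype instead of object.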

You can read more about df.to_numpy() by visiting the Pandas Documentation.