Check the Data Type of each DataFrame Column in R

Here are 2 ways to check the data type of each DataFrame column in R:

(1) Using str():

str(df)

(2) Using sapply() and class():

sapply(df, class)

Next, you’ll see a simple example with the steps to:

  • Create a DataFrame in R; and
  • Check the data type of each column in the DataFrame

Steps to Check the Data Type of each DataFrame Column in R

Step 1: Create a DataFrame

To begin, create your DataFrame in R.

For example, let’s create the following DataFrame about 5 individuals:

df <- data.frame(name = c("Jon", "Bill", "Maria", "Ben", "Emma"),
                 age = c(23, 41, 32, 57, 38),
                 date_of_birth = as.Date(c("1997-05-21", "1979-03-15", "1988-11-08", "1963-02-23", "1982-09-12")),
                 employed = c(TRUE, FALSE, TRUE, TRUE, FALSE)
                 )

print(df)

This is how the DataFrame looks once you run the code in R:

   name  age  date_of_birth  employed
1   Jon   23     1997-05-21      TRUE
2  Bill   41     1979-03-15     FALSE
3 Maria   32     1988-11-08      TRUE
4   Ben   57     1963-02-23      TRUE
5  Emma   38     1982-09-12     FALSE

Step 2: Check the Data Type of each Column

You may use str() in order to check the data type of each column in the DataFrame in R:

df <- data.frame(name = c("Jon", "Bill", "Maria", "Ben", "Emma"),
                 age = c(23, 41, 32, 57, 38),
                 date_of_birth = as.Date(c("1997-05-21", "1979-03-15", "1988-11-08", "1963-02-23", "1982-09-12")),
                 employed = c(TRUE, FALSE, TRUE, TRUE, FALSE)
                 )

str(df)

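You’ll now see the data type that corresponds to each column in the DataFrame (in R versions before 4.0.0, the name column would appear as a Factor instead, because data.frame() converted strings to factors by default):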

'data.frame':   5 obs. of  4 variables:
 $ name         : chr  "Jon" "Bill" "Maria" "Ben" ...
 $ age          : num  23 41 32 57 38
 $ date_of_birth: Date, format: "1997-05-21" "1979-03-15" ...
 $ employed     : logi  TRUE FALSE TRUE TRUE FALSE
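
Because str() prints its summary rather than returning it, a script that needs to act on a column's type may find it easier to query that column directly. The following is a minimal sketch using the same df; the variable names age_class, age_is_numeric, and dob_is_date are illustrative and not part of the original example:

```r
df <- data.frame(name = c("Jon", "Bill", "Maria", "Ben", "Emma"),
                 age = c(23, 41, 32, 57, 38),
                 date_of_birth = as.Date(c("1997-05-21", "1979-03-15", "1988-11-08", "1963-02-23", "1982-09-12")),
                 employed = c(TRUE, FALSE, TRUE, TRUE, FALSE)
                 )

# class() on a single column returns that column's class as a character vector
age_class <- class(df$age)
print(age_class)

# Predicate helpers return TRUE or FALSE, which is convenient inside if() conditions
age_is_numeric <- is.numeric(df$age)
dob_is_date <- inherits(df$date_of_birth, "Date")
print(age_is_numeric)
print(dob_is_date)
```

Here class(df$age) gives "numeric", matching the num label that str() shows for the age column.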

Alternatively, you may use sapply() and class() to get the data type of each column in the DataFrame:

df <- data.frame(name = c("Jon", "Bill", "Maria", "Ben", "Emma"),
                 age = c(23, 41, 32, 57, 38),
                 date_of_birth = as.Date(c("1997-05-21", "1979-03-15", "1988-11-08", "1963-02-23", "1982-09-12")),
                 employed = c(TRUE, FALSE, TRUE, TRUE, FALSE)
                 )

sapply(df, class)

The result:

         name           age date_of_birth      employed 
  "character"     "numeric"        "Date"     "logical" 
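
One caveat: sapply(df, class) returns a plain character vector only when every column has exactly one class. A date-time column created with as.POSIXct() carries two classes ("POSIXct" and "POSIXt"), and when columns have differing numbers of classes sapply() returns a list instead. A possible workaround, sketched below, is to collapse each column's classes into a single string with vapply(). The df2 DataFrame and its last_login column are hypothetical and only serve to illustrate the point:

```r
# Hypothetical DataFrame with a date-time column that has two classes
df2 <- data.frame(name = c("Jon", "Bill"),
                  last_login = as.POSIXct(c("2024-01-05 09:30:00", "2024-02-10 14:45:00"), tz = "UTC")
                  )

# Join each column's classes into one string, so the result is always
# a named character vector with one element per column
col_classes <- vapply(df2,
                      function(col) paste(class(col), collapse = ", "),
                      character(1))

print(col_classes)
```

The entry for last_login would read "POSIXct, POSIXt". Because vapply() is told to expect character(1) for each column, it raises an error rather than silently changing the shape of the result.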
