How to Create DataFrame in R (with Examples)

Looking to create a DataFrame in R?

If so, I’ll show you the steps to create a DataFrame in R using a simple example.

Generally speaking, you may apply the following structure in R to create a DataFrame:

 

first_field <- c('Value_1', 'Value_2', ...)
second_field <- c('Value_1', 'Value_2', ...)

df <- data.frame(first_field, second_field )

 

Alternatively, you may use this structure to get the same DataFrame:

 

df <- data.frame (first_field  = c('Value_1', 'Value_2', ...),
                  second_field = c('Value_1', 'Value_2', ...)
                  )

 

We’ll see how to apply those structures in the next section.

Create DataFrame in R

Let’s start with a simple example, where the data-set is:

name    age
Jon     23
Bill    41
Maria   32

The goal is to capture that data in R using a DataFrame.

Using the first structure that we saw at the beginning of this post, the DataFrame should look as follows:

 

name <- c('Jon', 'Bill', 'Maria')
age <- c(23, 41, 32)

df <- data.frame(name, age)

print (df)

 

Notice that text values must be wrapped in quotation marks (as we did for the values in the name field), while numerical values do not need quotation marks (as is the case for the values in the age field).
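To see how R interpreted the quoted and unquoted values, you can inspect the column types. Here is a minimal sketch using the same name and age vectors; the variable col_types is just an illustrative name, and the exact display may differ slightly between R versions.

```r
# Same data as in the example above
name <- c('Jon', 'Bill', 'Maria')
age <- c(23, 41, 32)

df <- data.frame(name, age)

# Class of each column: age should be numeric, name should be text
col_types <- sapply(df, class)
print(col_types)

# Compact overview of the structure of the DataFrame
str(df)
```

Keep in mind that in R versions before 4.0.0, text columns are converted to factors by default, unless you add stringsAsFactors = FALSE inside data.frame().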

Once you run the above code in R, you’ll get:

   name age
1   Jon  23
2  Bill  41
3 Maria  32

The values in R match with those in our data-set.

You can achieve the same outcome by using the second structure (don’t forget to place a closing parenthesis at the end of your DataFrame, as captured in the third line of the code below):

 

df <- data.frame(name = c('Jon', 'Bill', 'Maria'),
                 age = c(23, 41, 32)
                 )
print (df)

 

Run the above code in R, and you should get the same results:

   name age
1   Jon  23
2  Bill  41
3 Maria  32
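If you’d like to confirm that the two structures really produce the same DataFrame, you can compare them directly. The sketch below uses the illustrative names df_first and df_second (they are not part of the original example) together with the base R function identical():

```r
# First structure: build the vectors first, then combine them
name <- c('Jon', 'Bill', 'Maria')
age <- c(23, 41, 32)
df_first <- data.frame(name, age)

# Second structure: define the fields inside data.frame()
df_second <- data.frame(name = c('Jon', 'Bill', 'Maria'),
                        age = c(23, 41, 32)
                        )

# identical() returns TRUE when both objects match exactly
same_result <- identical(df_first, df_second)
print(same_result)
```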

Note that you can also create a DataFrame by importing the data into R.

For example, if you stored the original data in a CSV file, you can simply import that data into R, and then assign it to a DataFrame.

In my case, I stored the CSV file on my desktop, under the following path:

C:\\Users\\Doron E\\Desktop\\MyData.csv

  • The file name is: ‘MyData’
    You may pick a different file name based on your needs
  • The file extension is: ‘.csv’
    You have to include the ‘.csv’ extension when importing CSV files into R
  • Finally, use a double backslash (‘\\’) within the path name. A single backslash is an escape character in R strings, so it must be doubled (forward slashes also work in R paths)
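To illustrate the point about backslashes, here is a small sketch that compares the escaped path with a forward-slash version of the same path. The names path_escaped, path_forward and converted are only illustrative, and no file is actually read here:

```r
# Path written with doubled backslashes, as in the example above
path_escaped <- 'C:\\Users\\Doron E\\Desktop\\MyData.csv'

# The same path written with forward slashes
path_forward <- 'C:/Users/Doron E/Desktop/MyData.csv'

# cat() shows the string as R stores it, with single backslashes
cat(path_escaped, '\n')

# Replacing every backslash with a forward slash yields the second form
converted <- gsub('\\', '/', path_escaped, fixed = TRUE)
print(converted == path_forward)
```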

Putting everything together, this is how the code would look in R (you’ll need to change the path name to the location where the CSV file is stored on your computer):

 

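# read.csv returns a DataFrame; header = TRUE treats the first row as column names
mydata <- read.csv('C:\\Users\\Doron E\\Desktop\\MyData.csv', header = TRUE)

# Optional: mydata is already a DataFrame, so this step simply copies it
df <- data.frame(mydata)

print (df)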
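If you don’t have a CSV file at hand, you can try the import step with a temporary file. The following is a rough sketch, not the exact workflow from above: it assumes a throwaway file created with tempfile(), and the names csv_path and original are made up for illustration.

```r
# Temporary file, so the example does not depend on a real path
csv_path <- tempfile(fileext = '.csv')

original <- data.frame(name = c('Jon', 'Bill', 'Maria'),
                       age = c(23, 41, 32)
                       )

# row.names = FALSE avoids writing the row numbers as an extra column
write.csv(original, csv_path, row.names = FALSE)

# Import the CSV file; the result is already a DataFrame
df <- read.csv(csv_path, header = TRUE)

print(df)
print(is.data.frame(df))

# Clean up the temporary file
unlink(csv_path)
```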

 

After you have created the DataFrame in R, using either of the above methods, you can then apply some statistical analysis.

In the next and final section, I’ll show you how to apply basic stats in R.

Applying Basic Stats in R

Once you have created the DataFrame, you can apply different computations and statistical analysis to your data.

For example, to find the maximum age in our data-set, you can apply the following code in R:

 

name <- c('Jon', 'Bill', 'Maria')
age <- c(23, 41, 32)

df <- data.frame(name, age)

print (max(df$age))

 

If you run the code in R, you’ll get the maximum age of 41.

Similarly, you can easily compute the mean age by applying:

 

name <- c('Jon', 'Bill', 'Maria')
age <- c(23, 41, 32)

df <- data.frame(name, age)

print (mean(df$age))

 

And once you run the code, you’ll get the mean age of 32.

Those are just two examples, but once you create a DataFrame in R, you may apply an assortment of computations and statistical analysis to your data.
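As a further illustration, here is a short sketch of a few other base R functions applied to the same DataFrame. The variable names (min_age, median_age, sd_age, oldest) are only placeholders chosen for this example:

```r
name <- c('Jon', 'Bill', 'Maria')
age <- c(23, 41, 32)

df <- data.frame(name, age)

# Minimum, median and standard deviation of the age field
min_age <- min(df$age)
median_age <- median(df$age)
sd_age <- sd(df$age)

print(min_age)
print(median_age)
print(sd_age)

# Name of the person with the maximum age
oldest <- as.character(df$name[which.max(df$age)])
print(oldest)

# Quick overview of the age field
print(summary(df$age))
```

For this data, the minimum age is 23, the median age is 32, the standard deviation is 9, and the oldest person is Bill.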

You can get more info about working with DataFrames in R by reviewing the R documentation.

 
