How to Create a DataFrame in R (with Examples)

You can use the following template to create a DataFrame in R:

first_column <- c("value_1", "value_2", "value_3")
second_column <- c("value_A", "value_B", "value_C")

df <- data.frame(first_column, second_column)

print(df)

Alternatively, you may apply this structure to get the same DataFrame:

df <- data.frame(first_column = c("value_1", "value_2", "value_3"),
                 second_column = c("value_A", "value_B", "value_C")
                 )

print(df)

Next, you’ll see how to apply each of the above templates in practice.

Create a DataFrame in R

Let’s start with a simple example, where the dataset is:

Name   Age
Jon    23
Bill   41
Maria  32
Ben    58
Tina   26

The goal is to capture that data in R using a DataFrame.

Using the first template, the DataFrame would look like this:

Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)

df <- data.frame(Name, Age)

print(df)

Note that it’s necessary to place quotes around text (the values under the Name column), but not around numeric values (the values under the Age column).

Once you run the above code in R, you’ll get this simple DataFrame:

   Name  Age
1   Jon   23
2  Bill   41
3 Maria   32
4   Ben   58
5  Tina   26

The values in R match with those in our dataset.
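
To see what the quotes change, you can check the class of each column. The sketch below is a small illustration rather than part of the original template: it passes stringsAsFactors = FALSE explicitly (an added argument) so that the text column stays a character vector regardless of the R version in use.

```r
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)

# stringsAsFactors = FALSE (added here) keeps text as character in every R version
df <- data.frame(Name, Age, stringsAsFactors = FALSE)

name_class <- class(df$Name)  # quoted values are stored as text
age_class <- class(df$Age)    # unquoted numbers are stored as numeric

print(name_class)
print(age_class)
```

If a number were accidentally placed in quotes, its column would be stored as text and functions such as max() or mean() would no longer treat it as a number.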

You can achieve the same outcome using the second template. Don’t forget the closing parenthesis that ends the data.frame() call, as shown on the third line of the code below:

df <- data.frame(Name = c("Jon", "Bill", "Maria", "Ben", "Tina"),
                 Age = c(23, 41, 32, 58, 26)
                 )

print(df)

Run the above code, and you’ll get the same results:

   Name  Age
1   Jon   23
2  Bill   41
3 Maria   32
4   Ben   58
5  Tina   26
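
If you would like to confirm that the two templates really produce the same DataFrame, one option is to build both and compare them. This is only a sketch: the names df_first and df_second are made up for the comparison.

```r
# First template: build the vectors, then combine them
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
df_first <- data.frame(Name, Age)

# Second template: define the columns inside data.frame()
df_second <- data.frame(Name = c("Jon", "Bill", "Maria", "Ben", "Tina"),
                        Age = c(23, 41, 32, 58, 26)
                        )

# all.equal() returns TRUE when the two objects match
same_result <- isTRUE(all.equal(df_first, df_second))
print(same_result)
```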

Note that you can also create a DataFrame by importing the data into R.

For example, if you stored the original data in a CSV file, you can simply import that data into R, and then assign it to a DataFrame.

For demonstration purposes, let’s assume that a CSV file is stored under the following path:

C:\\Users\\Ron\\Desktop\\Test\\MyData.csv

Where:

  • The file name is: MyData
  • The file extension is: .csv. You must include this extension when importing CSV files into R
  • A double backslash (‘\\’) is used within the path because a single backslash is an escape character in R strings. Forward slashes (‘/’) also work in Windows paths
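
The path notes above can be illustrated with a short sketch. It only manipulates the path as text and does not touch any file; the variable names win_path, slash_path and built_path are made up for this example.

```r
# The example path, written with escaped (double) backslashes
win_path <- "C:\\Users\\Ron\\Desktop\\Test\\MyData.csv"

# cat() prints the string as R stores it, with single backslashes
cat(win_path, "\n")

# The same location written with forward slashes
# (fixed = TRUE treats the pattern as a literal backslash, not a regex)
slash_path <- gsub("\\", "/", win_path, fixed = TRUE)
print(slash_path)

# file.path() joins the pieces with "/" so no escaping is needed
built_path <- file.path("C:", "Users", "Ron", "Desktop", "Test", "MyData.csv")
print(built_path)
```

Either form can be passed to read.csv(); the forward-slash version is often easier to type and read.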

Here is the full code (you’ll need to change the path to reflect the location where the CSV file is stored on your computer):

df <- read.csv("C:\\Users\\Ron\\Desktop\\Test\\MyData.csv")

print(df)
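
If you don’t have a CSV file at hand, the sketch below lets you try the import end to end. It assumes nothing about your computer: it writes the example data to a temporary file (created with tempfile(), standing in for MyData.csv) and then reads it back with read.csv().

```r
# Temporary file used in place of MyData.csv (an assumption for this demo)
csv_path <- tempfile(fileext = ".csv")

original <- data.frame(Name = c("Jon", "Bill", "Maria", "Ben", "Tina"),
                       Age = c(23, 41, 32, 58, 26))

# row.names = FALSE avoids writing an extra column of row numbers
write.csv(original, csv_path, row.names = FALSE)

# Import the CSV file into a DataFrame
df <- read.csv(csv_path)
print(df)

# Remove the temporary file
unlink(csv_path)
```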

In the final section below, you’ll see how to apply some basic stats in R.

Applying Basic Stats in R

Once you have created the DataFrame, you can apply different computations and statistical analyses to it.

For instance, to find the maximum age in our data, simply apply the following code:

Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)

df <- data.frame(Name, Age)

print(max(df$Age))

If you run the code, you’ll get the maximum age of 58.

Similarly, you can compute the mean age by applying:

Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)

df <- data.frame(Name, Age)

print(mean(df$Age))

Once you run the code, you’ll get the mean age of 36.
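
The same pattern extends to other base R functions. As a brief sketch, the code below computes the minimum and median age, looks up the name of the oldest person, and prints an overall summary. The variable names min_age, median_age and oldest are chosen just for this illustration.

```r
df <- data.frame(Name = c("Jon", "Bill", "Maria", "Ben", "Tina"),
                 Age = c(23, 41, 32, 58, 26))

min_age <- min(df$Age)        # smallest value in the Age column
median_age <- median(df$Age)  # middle value once the ages are sorted

# which.max() gives the row position of the largest age
oldest <- as.character(df$Name[which.max(df$Age)])

print(min_age)
print(median_age)
print(oldest)

# summary() reports the minimum, quartiles, mean and maximum in one call
print(summary(df$Age))
```

For this data, the minimum age is 23, the median age is 32, and the oldest person is Ben.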

You can find additional info about creating a DataFrame in R by reviewing the R documentation.
