How to Create a DataFrame in Julia (example included)

The following template can be used to create a DataFrame in Julia:

using DataFrames
df = DataFrame(column_1 = ["value_1", "value_2", "value_3", ...], 
               column_2 = ["value_1", "value_2", "value_3", ...],
               column_3 = ["value_1", "value_2", "value_3", ...],
               ...
               )

In the next section, you’ll see the steps to create a DataFrame in Julia from scratch.

Steps to Create a DataFrame in Julia from Scratch

Step 1: Install the DataFrames package

If you haven’t already done so, install the DataFrames package in Julia:

using Pkg
Pkg.add("DataFrames")

Step 2: Create a DataFrame in Julia

You can then use the following template to create a DataFrame in Julia:

using DataFrames
df = DataFrame(column_1 = ["value_1", "value_2", "value_3", ...], 
               column_2 = ["value_1", "value_2", "value_3", ...],
               column_3 = ["value_1", "value_2", "value_3", ...],
               ...
               )

For example, let’s say that you have the following data, and your goal is to create a DataFrame based on that data:

product_id  product_name  price
1           Oven          800
2           Microwave     250
3           Dishwasher    700
4           Refrigerator  1400
5           Toaster       120

The complete code to create this DataFrame in Julia is as follows:

using DataFrames
df = DataFrame(product_id = [1, 2, 3, 4, 5], 
               product_name = ["Oven", "Microwave", "Dishwasher", "Refrigerator", "Toaster"],
               price = [800, 250, 700, 1400, 120]
               )
print(df)

Note that numeric values do not need quotes, unless you want to store those values as strings.
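To illustrate the note above, here is a minimal sketch that compares a numeric column with a quoted one. The names df_numeric and df_text are made up for this illustration, and only two rows are used to keep it short:

```julia
using DataFrames

# Hypothetical two-row examples; the variable names are illustrative only
df_numeric = DataFrame(price = [800, 250])       # values without quotes
df_text = DataFrame(price = ["800", "250"])      # values with quotes

# The element type of each column shows how the values were stored
println(eltype(df_numeric.price))   # an integer type
println(eltype(df_text.price))      # String

# Arithmetic works on the numeric column
println(sum(df_numeric.price))
```

Calling sum on the quoted column would raise an error, since strings cannot be added together.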

Step 3: Run the code in Julia

Run the code, and you’ll get the following DataFrame:

product_id  product_name  price
1           Oven          800
2           Microwave     250
3           Dishwasher    700
4           Refrigerator  1400
5           Toaster       120
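If you’d like to confirm that the DataFrame was built as expected, a quick check along the following lines may help. This is only a sketch: it assumes a reasonably recent version of the DataFrames package, where names returns the column names as strings.

```julia
using DataFrames

df = DataFrame(product_id = [1, 2, 3, 4, 5],
               product_name = ["Oven", "Microwave", "Dishwasher", "Refrigerator", "Toaster"],
               price = [800, 250, 700, 1400, 120]
               )

println(size(df))       # (number of rows, number of columns)
println(names(df))      # the column names
println(first(df, 2))   # only the first two rows
```

Here, size should report 5 rows and 3 columns for the data above.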

Calculate the maximum value of a DataFrame column

Once you have your DataFrame, you can start performing an assortment of operations and calculations.

For simplicity, let’s say that you want to find the maximum price in the DataFrame.

You can then use the following code to get the maximum value:

using DataFrames
df = DataFrame(product_id = [1, 2, 3, 4, 5], 
               product_name = ["Oven", "Microwave", "Dishwasher", "Refrigerator", "Toaster"],
               price = [800, 250, 700, 1400, 120]
               )

max_value = maximum(df.price)

print(max_value)

Run the code and you’ll get the maximum price of 1400.
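The same approach can be extended to other calculations. As a sketch, the code below gets the minimum price and looks up the name of the most expensive product. The variable names min_value, top_row and top_product are chosen for this illustration only:

```julia
using DataFrames

df = DataFrame(product_id = [1, 2, 3, 4, 5],
               product_name = ["Oven", "Microwave", "Dishwasher", "Refrigerator", "Toaster"],
               price = [800, 250, 700, 1400, 120]
               )

# Smallest value in the price column
min_value = minimum(df.price)

# Position of the row that holds the highest price
top_row = argmax(df.price)

# Look up the product name stored in that row
top_product = df[top_row, :product_name]

println(min_value)
println(top_product)
```

For the data above, this should print the minimum price of 120, followed by Refrigerator.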