How to Create a DataFrame in Julia (example included)

In this short guide, I’ll show you how to create a DataFrame in Julia.

To start, here is a template that you can use to create your DataFrame:

using DataFrames
df = DataFrame(Column1 = ["Value1","Value2","Value3",...], 
               Column2 = ["Value1","Value2","Value3",...],
               Column3 = ["Value1","Value2","Value3",...],
               ...
               )

In the next section, I’ll review the steps to create a DataFrame in Julia from scratch.

Steps to Create a DataFrame in Julia from Scratch

Step 1: Install the DataFrames package

To install the DataFrames package, first open the Julia command line (the REPL). You’ll see the julia> prompt, where you can type commands.

Type the following code at the prompt, and then press ENTER:

using Pkg


Finally, to complete the installation of the DataFrames package, type the code below, and then press ENTER:

Pkg.add("DataFrames")


You’ll need to wait about a minute for the installation to complete.
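If you are not sure whether the package is already installed, a minimal sketch like the one below can check before installing. It assumes that Base.find_package returns nothing when the package cannot be located in the current environment:

```julia
using Pkg

# Base.find_package returns `nothing` when the package cannot be found
# in the current environment, so install only in that case.
if Base.find_package("DataFrames") === nothing
    Pkg.add("DataFrames")
end

# Loads without error once the package is installed
using DataFrames
```

Running this a second time should skip the installation step and simply load the package.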

Step 2: Create a DataFrame in Julia

You can use the following template to create a DataFrame in Julia:

using DataFrames
df = DataFrame(Column1 = ["Value1","Value2","Value3",...], 
               Column2 = ["Value1","Value2","Value3",...],
               Column3 = ["Value1","Value2","Value3",...],
               ...
               )

For example, let’s say that you collected the data below, and your goal is to create a DataFrame based on that data.

Name   Age  Salary
Jon    22   30000
Bill   43   45000
Maria  81   60000
Julia  52   50000
Mark   27   55000

This is the complete code to create the DataFrame in Julia (note that there is no need to use quotes around numeric values, unless you wish to capture those values as strings):

using DataFrames
df = DataFrame(Name = ["Jon","Bill","Maria","Julia","Mark"], 
               Age = [22,43,81,52,27],
               Salary = [30000,45000,60000,50000,55000]
               )

Copy the DataFrame code into Julia and run it. Julia will display a DataFrame with 5 rows and 3 columns (Name, Age and Salary).
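To confirm that the DataFrame was built as expected, you can inspect its shape and column names. The sketch below assumes a recent version of DataFrames, where names returns the column names as strings:

```julia
using DataFrames

df = DataFrame(Name = ["Jon","Bill","Maria","Julia","Mark"],
               Age = [22,43,81,52,27],
               Salary = [30000,45000,60000,50000,55000]
               )

# Number of rows and columns
println(size(df))      # (5, 3)

# Column names, in the order they were defined
println(names(df))

# Only the first two rows (Jon and Bill)
println(first(df, 2))
```

These checks are handy when the DataFrame is too large to read in full on the screen.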

Alternatively, you could run the code in Jupyter Notebook. You will get the same DataFrame.
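Returning to the earlier note about quotes around numeric values, the short sketch below compares the element type of a numeric column with the same values written as strings. The name df_quoted is introduced here purely for illustration:

```julia
using DataFrames

# Ages without quotes are stored as integers
df = DataFrame(Name = ["Jon","Bill","Maria","Julia","Mark"],
               Age = [22,43,81,52,27]
               )

# The same ages in quotes are stored as strings
# (df_quoted is a hypothetical name used only for this comparison)
df_quoted = DataFrame(Name = ["Jon","Bill","Maria","Julia","Mark"],
                      Age = ["22","43","81","52","27"]
                      )

println(eltype(df.Age))         # an integer type (Int64 on 64-bit systems)
println(eltype(df_quoted.Age))  # String
```

Numeric functions such as maximum compare numbers as you would expect only on the unquoted column. On the quoted column, values are compared as text.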

Calculate the maximum value using the DataFrames package

Once you have your DataFrame, you can start performing an assortment of operations and calculations.

For simplicity, let’s say that you want to derive the maximum salary in the DataFrame.

You can then use the following code to derive the maximum value:

using DataFrames

df = DataFrame(Name = ["Jon","Bill","Maria","Julia","Mark"], 
               Age = [22,43,81,52,27],
               Salary = [30000,45000,60000,50000,55000]
               )

max_value = maximum(df.Salary)

print(max_value)

Run the code and you’ll get the maximum salary of 60000.
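The same pattern extends to other summary values. As a minimal sketch, the code below uses minimum and argmax from base Julia and mean from the Statistics standard library. The variable names min_value, avg_value and top_name are chosen here only for illustration:

```julia
using DataFrames
using Statistics  # standard library that provides mean

df = DataFrame(Name = ["Jon","Bill","Maria","Julia","Mark"],
               Age = [22,43,81,52,27],
               Salary = [30000,45000,60000,50000,55000]
               )

# Smallest and average salary
min_value = minimum(df.Salary)
avg_value = mean(df.Salary)

# argmax returns the row index of the largest salary,
# which is then used to look up the matching name
top_name = df[argmax(df.Salary), :Name]

println(min_value)  # 30000
println(avg_value)  # 48000.0
println(top_name)   # Maria
```

Note that maximum, minimum and argmax operate on the column vector df.Salary, so they work the same way on any Julia array.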