SQL to Pandas DataFrame

In this guide, you’ll see how to get from SQL to Pandas DataFrame.

Steps

Step 1: Create a database and table

To start, create a database in Python using the sqlite3 package, where:

The database name is: test_database
The database contains a single table called: products
The “products” table have 3 columns with the following information:

product_id	product_name	price
1	Computer	800
2	Printer	200
3	Tablet	300
4	Desk	450
5	Chair	150

Here is the complete script to create the database and table in Python:

import sqlite3

conn = sqlite3.connect("test_database")
c = conn.cursor()

c.execute(
    """
    CREATE TABLE IF NOT EXISTS products
    ([product_id] INTEGER PRIMARY KEY, [product_name] TEXT, [price] INTEGER)
    """
)

c.execute(
    """
    INSERT INTO products (product_id, product_name, price)

    VALUES
    (1,'Computer',800),
    (2,'Printer',200),
    (3,'Tablet',300),
    (4,'Desk',450),
    (5,'Chair',150)
    """
)

conn.commit()

Once you run the above script in Python, a new database called ‘test_database‘ would be created at the same location where you saved your Python script.

Step 2: Get from SQL to Pandas DataFrame

To get from SQL to Pandas DataFrame using pd.read_sql_query:

import sqlite3
import pandas as pd

conn = sqlite3.connect("test_database")

sql_query = pd.read_sql_query(
    """
    SELECT
    *
    FROM products
    """,
    conn,
)

df = pd.DataFrame(sql_query, columns=["product_id", "product_name", "price"])

print(df)

When applying pd.read_sql_query, don’t forget to place the connection string variable at the end. In our case, the connection string variable is conn.

Once you run the script in Python, you’ll get the following DataFrame:

   product_id  product_name  price
0           1      Computer    800
1           2       Printer    200
2           3        Tablet    300
3           4          Desk    450
4           5         Chair    150

Alternatively, you may use the approach below to get from SQL to a DataFrame:

import sqlite3
import pandas as pd

conn = sqlite3.connect("test_database")
c = conn.cursor()

c.execute(
    """
    SELECT
    *
    FROM products
    """
)

df = pd.DataFrame(c.fetchall(), columns=["product_id", "product_name", "price"])

print(df)

You’ll now get the same DataFrame:

   product_id  product_name  price
0           1      Computer    800
1           2       Printer    200
2           3        Tablet    300
3           4          Desk    450
4           5         Chair    150

Step 3 (optional): Find the maximum value using Pandas

After you got your DataFrame, you may apply different computations.

For example, you can find the maximum price by adding the following syntax:

max_price = df["price"].max()

print(max_price)

Putting everything together:

import sqlite3
import pandas as pd

conn = sqlite3.connect("test_database")
c = conn.cursor()

c.execute(
    """
    SELECT
    *
    FROM products
    """
)

df = pd.DataFrame(c.fetchall(), columns=["product_id", "product_name", "price"])

max_price = df["price"].max()

print(max_price)

As you can see, the maximum price is 800:

Steps

Step 1: Create a database and table

Step 2: Get from SQL to Pandas DataFrame

Step 3 (optional): Find the maximum value using Pandas

Leave a Comment Cancel reply