SQL to Pandas DataFrame

In this guide, you’ll see how to get from SQL to Pandas DataFrame.

Steps

Step 1: Create a database and table

To start, create a database in Python using the sqlite3 package, where:

  • The database name is: test_database
  • The database contains a single table called: products
  • The “products” table have 3 columns with the following information:
product_idproduct_nameprice
1Computer800
2Printer200
3Tablet300
4Desk450
5Chair150

Here is the complete script to create the database and table in Python:

import sqlite3

conn = sqlite3.connect("test_database")
c = conn.cursor()

c.execute(
"""
CREATE TABLE IF NOT EXISTS products
([product_id] INTEGER PRIMARY KEY, [product_name] TEXT, [price] INTEGER)
"
""
)

c.execute(
"""
INSERT INTO products (product_id, product_name, price)

VALUES
(1,'Computer',800),
(2,'Printer',200),
(3,'Tablet',300),
(4,'Desk',450),
(5,'Chair',150)
"
""
)

conn.commit()

Once you run the above script in Python, a new database called ‘test_database‘ would be created at the same location where you saved your Python script.

Step 2: Get from SQL to Pandas DataFrame

To get from SQL to Pandas DataFrame using pd.read_sql_query:

import sqlite3
import pandas as pd

conn = sqlite3.connect("test_database")

sql_query = pd.read_sql_query(
"""
SELECT
*
FROM products
"
"",
conn,
)

df = pd.DataFrame(sql_query, columns=["product_id", "product_name", "price"])

print(df)

When applying pd.read_sql_query, don’t forget to place the connection string variable at the end. In our case, the connection string variable is conn.

Once you run the script in Python, you’ll get the following DataFrame:

   product_id  product_name  price
0           1      Computer    800
1           2       Printer    200
2           3        Tablet    300
3           4          Desk    450
4           5         Chair    150

Alternatively, you may use the approach below to get from SQL to a DataFrame:

import sqlite3
import pandas as pd

conn = sqlite3.connect("test_database")
c = conn.cursor()

c.execute(
"""
SELECT
*
FROM products
"
""
)

df = pd.DataFrame(c.fetchall(), columns=["product_id", "product_name", "price"])

print(df)

You’ll now get the same DataFrame:

   product_id  product_name  price
0           1      Computer    800
1           2       Printer    200
2           3        Tablet    300
3           4          Desk    450
4           5         Chair    150

Step 3 (optional): Find the maximum value using Pandas

After you got your DataFrame, you may apply different computations.

For example, you can find the maximum price by adding the following syntax:

max_price = df["price"].max()

print(max_price)

Putting everything together:

import sqlite3
import pandas as pd

conn = sqlite3.connect("test_database")
c = conn.cursor()

c.execute(
"""
SELECT
*
FROM products
"
""
)

df = pd.DataFrame(c.fetchall(), columns=["product_id", "product_name", "price"])

max_price = df["price"].max()

print(max_price)

As you can see, the maximum price is 800:

800

Leave a Comment