In this guide, you’ll see how to get from SQL to Pandas DataFrame.
Steps
Step 1: Create a database and table
To start, create a database in Python using the sqlite3 package, where:
- The database name is: test_database
- The database contains a single table called: products
- The “products” table have 3 columns with the following information:
product_id | product_name | price |
1 | Computer | 800 |
2 | Printer | 200 |
3 | Tablet | 300 |
4 | Desk | 450 |
5 | Chair | 150 |
Here is the complete script to create the database and table in Python:
import sqlite3
conn = sqlite3.connect("test_database")
c = conn.cursor()
c.execute(
"""
CREATE TABLE IF NOT EXISTS products
([product_id] INTEGER PRIMARY KEY, [product_name] TEXT, [price] INTEGER)
"""
)
c.execute(
"""
INSERT INTO products (product_id, product_name, price)
VALUES
(1,'Computer',800),
(2,'Printer',200),
(3,'Tablet',300),
(4,'Desk',450),
(5,'Chair',150)
"""
)
conn.commit()
Once you run the above script in Python, a new database called ‘test_database‘ would be created at the same location where you saved your Python script.
Step 2: Get from SQL to Pandas DataFrame
To get from SQL to Pandas DataFrame using pd.read_sql_query:
import sqlite3
import pandas as pd
conn = sqlite3.connect("test_database")
sql_query = pd.read_sql_query(
"""
SELECT
*
FROM products
""",
conn,
)
df = pd.DataFrame(sql_query, columns=["product_id", "product_name", "price"])
print(df)
When applying pd.read_sql_query, don’t forget to place the connection string variable at the end. In our case, the connection string variable is conn.
Once you run the script in Python, you’ll get the following DataFrame:
product_id product_name price
0 1 Computer 800
1 2 Printer 200
2 3 Tablet 300
3 4 Desk 450
4 5 Chair 150
Alternatively, you may use the approach below to get from SQL to a DataFrame:
import sqlite3
import pandas as pd
conn = sqlite3.connect("test_database")
c = conn.cursor()
c.execute(
"""
SELECT
*
FROM products
"""
)
df = pd.DataFrame(c.fetchall(), columns=["product_id", "product_name", "price"])
print(df)
You’ll now get the same DataFrame:
product_id product_name price
0 1 Computer 800
1 2 Printer 200
2 3 Tablet 300
3 4 Desk 450
4 5 Chair 150
Step 3 (optional): Find the maximum value using Pandas
After you got your DataFrame, you may apply different computations.
For example, you can find the maximum price by adding the following syntax:
max_price = df["price"].max()
print(max_price)
Putting everything together:
import sqlite3
import pandas as pd
conn = sqlite3.connect("test_database")
c = conn.cursor()
c.execute(
"""
SELECT
*
FROM products
"""
)
df = pd.DataFrame(c.fetchall(), columns=["product_id", "product_name", "price"])
max_price = df["price"].max()
print(max_price)
As you can see, the maximum price is 800:
800