In this short guide, you’ll see how to get from Pandas DataFrame to SQL.
Steps
Step 1: Create a DataFrame
To start, create a DataFrame with the following data about products:
import pandas as pd
data = {
"product_name": ["Computer", "Tablet", "Monitor", "Printer"],
"price": [900, 300, 450, 150],
}
df = pd.DataFrame(data)
print(df)
Run the code, and you’ll get the following DataFrame:
product_name price
0 Computer 900
1 Tablet 300
2 Monitor 450
3 Printer 150
Step 2: Create a Database and a Table
For demonstration purposes, create a simple database and a table using sqlite3, where:
- The database name will be: test_database
- The table name will be: products
The “products” table will be used to store the information from the DataFrame.
Here is the full Python script to create the test_database and the products table:
import sqlite3
conn = sqlite3.connect("test_database")
c = conn.cursor()
c.execute("CREATE TABLE IF NOT EXISTS products (product_name text, price number)")
conn.commit()
Step 3: Get from Pandas DataFrame to SQL
You can use the following syntax to get from Pandas DataFrame to SQL:
df.to_sql("products", conn, if_exists="replace", index=False)
Here is the full Python code to get from Pandas DataFrame to SQL:
import pandas as pd
import sqlite3
conn = sqlite3.connect("test_database")
c = conn.cursor()
c.execute("CREATE TABLE IF NOT EXISTS products (product_name text, price number)")
conn.commit()
data = {
"product_name": ["Computer", "Tablet", "Monitor", "Printer"],
"price": [900, 300, 450, 150],
}
df = pd.DataFrame(data)
df.to_sql("products", conn, if_exists="replace", index=False)
c.execute(
"""
SELECT * FROM products
"""
)
for row in c.fetchall():
print(row)
Run the code, and you’ll get the following results:
('Computer', 900)
('Tablet', 300)
('Monitor', 450)
('Printer', 150)
Going from the DataFrame to SQL and then back to the DataFrame
For this example, you’ll create a new database called test_database_2, and then find the maximum price among all the products:
import pandas as pd
import sqlite3
conn = sqlite3.connect('test_database_2')
c = conn.cursor()
c.execute('CREATE TABLE IF NOT EXISTS products (product_name text, price number)')
conn.commit()
data = {
"product_name": ["Computer", "Tablet", "Monitor", "Printer"],
"price": [900, 300, 450, 150],
}
df = pd.DataFrame(data)
df.to_sql("products", conn, if_exists="replace", index=False)
c.execute(
"""
SELECT * FROM products
WHERE price = (SELECT max(price) FROM products)
"""
)
df = pd.DataFrame(c.fetchall(), columns=["product_name", "price"])
print(df)
Once you run the code in Python, you’ll get the product with the maximum price:
product_name price
0 Computer 900