In this guide, you’ll see how to get from Pandas DataFrame to SQL.
Here are the steps that you may follow.
Steps to get from Pandas DataFrame to SQL
Step 1: Create a DataFrame
To start, let’s create a DataFrame based on the following data about products:
product_name | price |
Computer | 900 |
Tablet | 300 |
Monitor | 450 |
Printer | 150 |
Here is the code to create the DataFrame in Python:
import pandas as pd data = {'product_name': ['Computer','Tablet','Monitor','Printer'], 'price': [900,300,450,150] } df = pd.DataFrame(data, columns= ['product_name','price']) print (df)
Once you run the code, you’ll get the following DataFrame:
product_name price
0 Computer 900
1 Tablet 300
2 Monitor 450
3 Printer 150
Step 2: Create a Database
For demonstration purposes, let’s create a simple database using sqlite3.
To start, you’ll need to import the sqlite3 package:
import sqlite3
Next, create the database. For example, create a database called: ‘test_database‘
conn = sqlite3.connect('test_database') c = conn.cursor()
Finally, create the ‘products‘ table:
c.execute('CREATE TABLE IF NOT EXISTS products (product_name text, price number)') conn.commit()
The ‘products’ table will be used to store the information from the DataFrame.
Step 3: Get from Pandas DataFrame to SQL
You can use the following syntax to get from Pandas DataFrame to SQL:
df.to_sql('products', conn, if_exists='replace', index = False)
Where ‘products’ is the table name created in step 2.
Here is the full Python code to get from Pandas DataFrame to SQL:
import pandas as pd import sqlite3 conn = sqlite3.connect('test_database') c = conn.cursor() c.execute('CREATE TABLE IF NOT EXISTS products (product_name text, price number)') conn.commit() data = {'product_name': ['Computer','Tablet','Monitor','Printer'], 'price': [900,300,450,150] } df = pd.DataFrame(data, columns= ['product_name','price']) df.to_sql('products', conn, if_exists='replace', index = False) c.execute(''' SELECT * FROM products ''') for row in c.fetchall(): print (row)
Run the code, and you’ll get the following results:
('Computer', 900)
('Tablet', 300)
('Monitor', 450)
('Printer', 150)
Going from the DataFrame to SQL and then back to the DataFrame
Now let’s see how to go from the DataFrame to SQL, and then back to the DataFrame.
For this example, you can create a new database called: ‘test_database_2‘
conn = sqlite3.connect('test_database_2') c = conn.cursor()
Then, create the same products table using this syntax:
c.execute('CREATE TABLE IF NOT EXISTS products (product_name text, price number)') conn.commit()
Now, build the DataFrame:
data = {'product_name': ['Computer','Tablet','Monitor','Printer'], 'price': [900,300,450,150] } df = pd.DataFrame(data, columns= ['product_name','price'])
Apply the code to go from the DataFrame to SQL:
df.to_sql('products', conn, if_exists='replace', index = False)
You can then find the maximum price among all the products using this query:
c.execute(''' SELECT * FROM products WHERE price = (SELECT max(price) FROM products) ''')
Finally, get back from SQL to the DataFrame:
df = pd.DataFrame(c.fetchall(), columns=['product_name','price']) print (df)
Putting all the code components together:
import pandas as pd import sqlite3 conn = sqlite3.connect('test_database_2') c = conn.cursor() c.execute('CREATE TABLE IF NOT EXISTS products (product_name text, price number)') conn.commit() data = {'product_name': ['Computer','Tablet','Monitor','Printer'], 'price': [900,300,450,150] } df = pd.DataFrame(data, columns= ['product_name','price']) df.to_sql('products', conn, if_exists='replace', index = False) c.execute(''' SELECT * FROM products WHERE price = (SELECT max(price) FROM products) ''') df = pd.DataFrame(c.fetchall(), columns=['product_name','price']) print (df)
Once you run the code in Python, you’ll get the product with the maximum price:
product_name price
0 Computer 900