How to Create a Database in Python using sqlite3

Need to create a database in Python? If so, I’ll show you an example with the steps to create a database in Python using sqlite3.

But before we begin, here is a template that you can use to create a database in Python using sqlite3:

import sqlite3
sqlite3.connect('Type your DataBase name here.db')

Steps to Create a Database in Python using sqlite3

Step 1: Create the Database and Tables

In this step, you’ll see how to create:

  • A new database called: TestDB
  • 3 tables called: CLIENTS, COUNTRY, and DAILY_STATUS

Here are the fields to be added for each of the 3 tables:

Table NameField NameField Format
CLIENTSgenerated_idInteger – Primary Key
CLIENTSClient_NameText
CLIENTSCountry_IDInteger
CLIENTSDateDate
COUNTRYgenerated_idInteger – Primary Key
COUNTRYCountry_IDInteger
COUNTRYCountry_NameText
DAILY_STATUSClient_NameText
DAILY_STATUSCountry_NameText
DAILY_STATUSDateDate

Below is the code that you can use in order to create the database and 3 tables using sqlite3. Additional comments are available within the code itself to provide further insights.

import sqlite3

conn = sqlite3.connect('TestDB.db')  # You can create a new database by changing the name within the quotes
c = conn.cursor() # The database will be saved in the location where your 'py' file is saved

# Create table - CLIENTS
c.execute('''CREATE TABLE CLIENTS
             ([generated_id] INTEGER PRIMARY KEY,[Client_Name] text, [Country_ID] integer, [Date] date)''')
          
# Create table - COUNTRY
c.execute('''CREATE TABLE COUNTRY
             ([generated_id] INTEGER PRIMARY KEY,[Country_ID] integer, [Country_Name] text)''')
        
# Create table - DAILY_STATUS
c.execute('''CREATE TABLE DAILY_STATUS
             ([Client_Name] text, [Country_Name] text, [Date] date)''')
                 
conn.commit()

# Note that the syntax to create new tables should only be used once in the code (unless you dropped the table/s at the end of the code). 
# The [generated_id] column is used to set an auto-increment ID for each record
# When creating a new table, you can add both the field names as well as the field formats (e.g., Text)

Once you run the above code in Python, a new file, called TestDB, will get created at the same location where you saved your Python script.

Step 2: Import the Data using Pandas

For this step, let’s assume that you have 2 CSV files that you’d like to import into Python:

(1) The first CSV file is called Client_14-JAN-2019 and has the following client data (on 14-Jan-2019):

Client_NameCountry_IDDate
Jon Smith114012019
Bill Martin214012019
Maria Blue314012019
Rita Yu414012019
Jack Mo514012019

(2) The second CSV file is called Country_14-JAN-2019 and has this country data (on 14-Jan-2019):

Country_IDCountry_Name
1Japan
2US
3Canada
4Brazil
5UK
6Spain
7China
8Italy
9Peru
10Russia

The imported client data will be stored in the CLIENTS table. While the imported country data will be stored in the COUNTRY table. You’ll then be able to link the data from the CLIENTS table to the data from the COUNTRY table via the Country_ID field.

The ultimate goal is to display the daily status (of the client and country names for a given date) after connecting the CLIENTS and COUNTRY tables together.

To accomplish this goal, you’ll also need to install the pandas package. The pandas package is an essential component that will be used to:

  • Import the CSV files using the read_csv command
  • Assign the values imported from the CSV files into the tables using the to_sql command
  • Assign the SQL fields into the DataFrame
  • Export the final results into a CSV file using the to_csv command

The 2 files to be imported into Python are currently stored on my machine under the following paths:

C:\Users\Ron\Desktop\Client\Client_14-JAN-2019.csv
C:\Users\Ron\Desktop\Client\Country_14-JAN-2019.csv

You’ll need to adjust the Python code below to the paths where your CSV files are stored on your machine:

import sqlite3
import pandas as pd
from pandas import DataFrame

conn = sqlite3.connect('TestDB.db')  
c = conn.cursor()

read_clients = pd.read_csv (r'C:\Users\Ron\Desktop\Client\Client_14-JAN-2019.csv')
read_clients.to_sql('CLIENTS', conn, if_exists='append', index = False) # Insert the values from the csv file into the table 'CLIENTS' 

read_country = pd.read_csv (r'C:\Users\Ron\Desktop\Client\Country_14-JAN-2019.csv')
read_country.to_sql('COUNTRY', conn, if_exists='replace', index = False) # Replace the values from the csv file into the table 'COUNTRY'

# When reading the csv:
# - Place 'r' before the path string to read any special characters, such as '\'
# - Don't forget to put the file name at the end of the path + '.csv'
# - Before running the code, make sure that the column names in the CSV files match with the column names in the tables created and in the query below
# - If needed make sure that all the columns are in a TEXT format

c.execute('''
INSERT INTO DAILY_STATUS (Client_Name,Country_Name,Date)
SELECT DISTINCT clt.Client_Name, ctr.Country_Name, clt.Date
FROM CLIENTS clt
LEFT JOIN COUNTRY ctr ON clt.Country_ID = ctr.Country_ID
          ''')

c.execute('''
SELECT DISTINCT *
FROM DAILY_STATUS
WHERE Date = (SELECT max(Date) FROM DAILY_STATUS)
          ''')
   
#print(c.fetchall())

df = DataFrame(c.fetchall(), columns=['Client_Name','Country_Name','Date'])
print (df) # To display the results after an insert query, you'll need to add this type of syntax above: 'c.execute(''' SELECT * from latest table ''')

df.to_sql('DAILY_STATUS', conn, if_exists='append', index = False) # Insert the values from the INSERT QUERY into the table 'DAILY_STATUS'

# export_csv = df.to_csv (r'C:\Users\Ron\Desktop\Client\export_list.csv', index = None, header=True) # Uncomment this syntax if you wish to export the results to CSV. Make sure to adjust the path name
# Don't forget to add '.csv' at the end of the path (as well as r at the beg to address special characters)

When you run the code in Python, you’ll get the results that reflect the daily status on 14-Jan-2019:

How to Create a Database in Python using sqlite3

In the code above, you’ll notice that we used an INSERT INTO statement to store the linked data (i.e., linked client and country data) into the DAILY_STATUS table.

We also used a WHERE condition to display the records that correspond to the maximum date. These results will constitute our daily status.

Step 3: Run the code for a Subsequent Date

Now imagine that you have 2 new CSV files for the subsequent date of 15-Jan-2019:

(1) The first CSV file is called Client_15-JAN-2019 and has the following client data (on 15-Jan-2019):

Client_NameCountry_IDDate
Ron Green315012019
Jeff Long815012019
Carrie Lan315012019
Marry Sig515012019
Ben Baker915012019

(2) The second CSV file is called Country_15-JAN-2019 and has this country data (on 15-Jan-2019):

Country_IDCountry_Name
1Japan
2US
3Canada
4Brazil
5UK
6Spain
7China
8Italy
9Peru
10Russia
11Mexico
12Germany

Below is the code that you’ll need to run for the second date.

And as before, you’ll need to adjust the paths to the location where your CSV files are stored on your machine. Make sure that you are using the files for 15-JAN_2019:

import sqlite3
import pandas as pd
from pandas import DataFrame

conn = sqlite3.connect('TestDB.db')
c = conn.cursor()

read_clients = pd.read_csv (r'C:\Users\Ron\Desktop\Client\Client_15-JAN-2019.csv')   
read_clients.to_sql('CLIENTS', conn, if_exists='append', index = False)

read_country = pd.read_csv (r'C:\Users\Ron\Desktop\Client\Country_15-JAN-2019.csv')
read_country.to_sql('COUNTRY', conn, if_exists='replace', index = False)

c.execute('''
INSERT INTO DAILY_STATUS (Client_Name,Country_Name,Date) 
SELECT DISTINCT clt.Client_Name, ctr.Country_Name, clt.Date
FROM CLIENTS clt
LEFT JOIN COUNTRY ctr ON clt.Country_ID = ctr.Country_ID
          ''')

c.execute('''
SELECT DISTINCT *
FROM DAILY_STATUS
WHERE Date = (SELECT max(Date) FROM DAILY_STATUS)
          ''')

df = DataFrame(c.fetchall(), columns=['Client_Name','Country_Name','Date'])
print (df)

df.to_sql('DAILY_STATUS', conn, if_exists='append', index = False)

# export_csv = df.to_csv (r'C:\Users\Ron\Desktop\Client\export_list.csv', index = None, header=True)

Once you run the above code in Python, you’ll get the results that reflect the daily status on 15-Jan-2019:

Create a Database in Python using sqlite3

Addressing Common Errors

After you applied the code to create the tables (see step 1), do not include the syntax to create the tables again. Otherwise, you’ll get the following error:

“OperationalError: table CLIENTS already exists.”

In that case, you may want to start over by creating a new database (e.g., by changing the database name from ‘TestDB.db’ to ‘TestDB1.db’) and then rerunning all the codes from the start.

Also make sure that no formatting is applied when copying the data into the CSV files. You may then save the data in the CSV files using a TEXT format, just in case you run into the following error:

“OperationalError: table CLIENTS has no column named Unnamed: 3.”

Additional Sources

You just saw how to create a database in Python using the sqlite3 package. You may want to check the following articles to learn more about connecting Python with different database applications: