How to Create a Database in Python using sqlite3

Need to create a database in Python? If so, I’ll show you an example with the steps to create a database in Python using sqlite3.

But before we begin, here is the generic syntax that you can use to create a database in Python using sqlite3:

sqlite3.connect('Type your DataBase name here.db')

Example of creating a database in Python using sqlite3

In the following example we will:

(1) Create a new database called ‘TestDB’

(2) Create 4 tables: CLIENTS,  COUNTRY,  ARCHIVE,  and DAILY_STATUS

(3) Import 2 CSV files for the first date (18-FEB-2018):
Client_18-FEB_2018, and Country_18-FEB_2018

(4) Import 2 CSV files for the subsequent date (19-FEB-2018):
Client_19-FEB_2018, and Country_19-FEB_2018

(5) The imported client data from the CSV files will be stored in the CLIENTS table; While, the imported country data will be stored in the COUNTRY table

(6) Write SQL query to link the data from the CLIENTS table with the data from the COUNTRY table

(7) Use an INSERT INTO statement to store the linked data (i.e., linked clients and country data) into the ARCHIVE table

(8) Use an INSERT INTO statement to pass the linked data into the DAILY_STATUS table

(9) Utilize a WHERE condition to display the records that correspond to the maximum date. These results will constitute our daily status

In addition to sqlite3, you’ll need to install the pandas package. The pandas package is an essential component that will be used to:

  1. Import the CSV files using the “read_csv” command
  2. Assign the values imported from the CSV files into the tables, so that we can then execute the SQL queries. For that, we will use the “to_sql” command
  3. Assign the SQL fields into the dataframe
  4. Export the final results into a CSV file using the “to_csv” command

In the next section, I’m going to review the data-set to be used for our example.

The Dataset to be used within the CSV files

As indicated before, we will have 4 CSV files with the following information:

DATE-1 (i.e., 18-FEB-2018):

(1) Client_18-FEB_2018:

Client_NameCountry_IDDate
Jon Smith118022018
Bill Martin218022018
Maria Blue318022018
Rita Yu418022018
Jack Mo518022018

 

(2) Country_18-FEB_2018:

Country_IDCountry_Name
1Japan
2US
3Canada
4Brazil
5UK
6Spain
7China
8Italy
9Peru
10Russia

 

DATE-2 (i.e., 19-FEB-2018):

(3) Client_19-FEB_2018:

Client_NameCountry_IDDate
Ron Green319022018
Jeff Long819022018
Carrie Lan319022018
Marry Sig519022018
Ben Baker919022018

 

(4) Country_19-FEB_2018:

Country_IDCountry_Name
1Japan
2US
3Canada
4Brazil
5UK
6Spain
7China
8Italy
9Peru
10Russia
11Mexico
12Germany

 

The above 4 files to be imported into Python are currently stored on my machine under the following paths:

C:\Users\Doron E\Desktop\Client\Client_18-FEB_2018.csv
C:\Users\Doron E\Desktop\Client\Country_18-FEB_2018.csv

C:\Users\Doron E\Desktop\Client\Client_19-FEB_2018.csv
C:\Users\Doron E\Desktop\Client\Country_19-FEB_2018.csv

 

And here is the path where the exported CSV file will be stored (where the file name is ‘export_list’):

C:\Users\Doron E\Desktop\Client\export_list.csv

 

You’ll need to adjust the Python code (see the next section below) to the paths where your CSV files are located on your machine.

Let’s now review the code to create a database in Python using sqlite3.

The code in Python to create a database using sqlite3

Note that you’ll need to run the code in Python twice:

  • Firstly to create all the tables (i.e., CLIENTS, COUNTRY, ARCHIVE, DAILY_STATUS), and import the 2 CSV files for the first date (i.e., 18-FEB-2018): Client_18-FEB_2018 and Country_18-FEB_2018. The results will then be displayed to reflect the daily status for the first date
  • Secondly to import the 2 CSV files for the second date (i.e., 19-FEB-2018): Client_19-FEB_2018 and Country_19-FEB_2018. The results will then be displayed to reflect the daily status for the second date. Please note that before you run the code for the second time, you’ll need to remove the syntax that is used to create the 4 tables (as you only need to create the tables once)

You may wish to save the data in the CSV files using a TEXT format, just in case you run into the following error:

“OperationalError: table CLIENTS has no column named Unnamed: 3.”

Let’s look at the code that you’ll need to run for the first date (as mentioned earlier, you’ll need to adjust the paths to the location where your CSV files are stored on your machine. Make sure that you are using the files for 18-FEB-2018):

 

import sqlite3
import pandas as pd
from pandas import DataFrame

conn = sqlite3.connect('TestDB.db')  # you can create a new database by changing the name within the quotations (the database will be saved in the location where your 'py' file is saved)

c = conn.cursor()


# Create table - CLIENTS
c.execute('''CREATE TABLE CLIENTS
             (generated_id INTEGER PRIMARY KEY,Client_Name text, Country_ID text, Date text)''') #this syntax should be executed only when you want to create a new table, but it should no be used again when running the code, after the table was created (unless you dropped the table at the end of the code). Notice that here you can create the field names as well as the field format - e.g., Text)
             # use '[]' if the field name has spaces. 'generated_id INTEGER PRIMARY KEY' is column created to set an auto-increment ID for each record
             

# Create table - COUNTRY
c.execute('''CREATE TABLE COUNTRY
             (generated_id INTEGER PRIMARY KEY,Country_ID text, Country_Name text)''') #this syntax should be executed only when you want to create a new table, but it should no be used again when running the code, after the table was created (unless you dropped the table at the end of the code). Notice that here you can create the field names as well as the field format - e.g., Text)
             # use '[]' if the field name has spaces. 'generated_id INTEGER PRIMARY KEY' is column created to set an auto-increment ID for each record
           

# Create table - ARCHIVE
c.execute('''CREATE TABLE ARCHIVE
             (Client_Name text, Country_Name text, Date text)''') #this syntax should be executed only when you want to create a new table, but it should no be used again when running the code, after the table was created (unless you dropped the table at the end of the code). Notice that here you can create the field names as well as the field format - e.g., Text)
             # use '[]' if the field name has spaces. 'generated_id INTEGER PRIMARY KEY' is column created to set an auto-increment ID for each record
           

# Create table - DAILY_STATUS
c.execute('''CREATE TABLE DAILY_STATUS
             (Client_Name text, Country_Name text, Date text)''') #this syntax should be executed everytime you run the code, as this table is dropped each time at the end of the code
             # use '[]' if the field name has spaces. 
           

# Save (commit) the changes
conn.commit()

# when reading the csv, if needed make sure that all the columns are in a TEXT format
read_clients = pd.read_csv (r'C:\Users\Doron E\Desktop\Client\Client_18-FEB_2018.csv')   #read the csv file (put 'r' before the path string to read any special characters, such as '\'). Don't forget the put the file name at the end of the path + '.csv'
read_clients.to_sql('CLIENTS', conn, if_exists='append', index = False) #insert the values from the csv file into the table 'CLIENTS' 


# when reading the csv, if needed make sure that all the columns are in a TEXT format
read_country = pd.read_csv (r'C:\Users\Doron E\Desktop\Client\Country_18-FEB_2018.csv')   #read the csv file (put 'r' before the path string to read any special characters, such as '\'). Don't forget the put the file name at the end of the path + '.csv'
read_country.to_sql('COUNTRY', conn, if_exists='replace', index = False) #replace the values from the csv file into the table 'COUNTRY'


# before running the code, make sure that the column names in the CSV files match with the column names in the tables created and the query below


c.execute('''
  
INSERT INTO ARCHIVE (Client_Name,Country_Name,Date)
       
SELECT DISTINCT

clt.Client_Name,
ctr.Country_Name,
clt.Date

FROM CLIENTS clt
LEFT JOIN COUNTRY ctr 
ON clt.Country_ID = ctr.Country_ID
 
          ''')


c.execute('''
  

INSERT INTO DAILY_STATUS (Client_Name,Country_Name,Date)

        
SELECT DISTINCT

Client_Name,
Country_Name,
Date

FROM ARCHIVE 
 
          ''')


c.execute('''


SELECT DISTINCT

*

FROM 
DAILY_STATUS

WHERE Date = (SELECT max(Date) FROM DAILY_STATUS)

          ''')

    
#print(c.fetchall())

df = DataFrame(c.fetchall(), columns=['Client_Name','Country_Name','Date'])    #convert the SQL fields to a dataframe 

print (df) # to display results after an insert query, you'll need to add above another: 'c.execute(''' SELECT * from latest table ''')

#df.to_sql('DAILY_STATUS', conn, if_exists='append', index = False) #insert the values from the INSERT QUERY into the table 'DAILY_STATUS'
df.to_sql('ARCHIVE', conn, if_exists='append', index = False) #insert the values from the INSERT QUERY into the table 'ARCHIVE'

export_csv = df.to_csv (r'C:\Users\Doron E\Desktop\Client\export_list.csv', index = None, header=True) # you may need to create an empty csv file in the destination folder (e.g.,export_list). Don't forget to add '.csv' at the end of the path (as well as r at the beg to address special characters)

 

When you run the above code in Python, you should get the following results:

 

How to Create a Database in Python using sqlite3

 

Below is the code that you’ll need to run for the second date. This code should NOT contain the syntax to create the 4 tables. Otherwise, you’ll get the following error:

“OperationalError: table CLIENTS already exists.”

In that case, you may want to start over by creating a new database (e.g., by changing the database name from ‘TestDB.db’ to ‘TestDB1.db’) and then rerunning the code, starting from the first date (and then removing the syntax to create the 4 tables, when running the code again for the second date).

And as before, you’ll need to adjust the paths to the location where your CSV files are stored on your machine. Make sure that you are using the files for 19-FEB-2018):

 

import sqlite3
import pandas as pd
from pandas import DataFrame

conn = sqlite3.connect('TestDB.db')  # you can create a new database by changing the name within the quotations (the database will be saved in the location where your 'py' file is saved)

c = conn.cursor()

# when reading the csv, if needed make sure that all the columns are in a TEXT format
read_clients = pd.read_csv (r'C:\Users\Doron E\Desktop\Client\Client_19-FEB_2018.csv')   #read the csv file (put 'r' before the path string to read any special characters, such as '\'). Don't forget the put the file name at the end of the path + '.csv'
read_clients.to_sql('CLIENTS', conn, if_exists='append', index = False) #insert the values from the csv file into the table 'CLIENTS' 

# when reading the csv, if needed make sure that all the columns are in a TEXT format
read_country = pd.read_csv (r'C:\Users\Doron E\Desktop\Client\Country_19-FEB_2018.csv')   #read the csv file (put 'r' before the path string to read any special characters, such as '\'). Don't forget the put the file name at the end of the path + '.csv'
read_country.to_sql('COUNTRY', conn, if_exists='replace', index = False) #replace the values from the csv file into the table 'COUNTRY'

# before running the code, make sure that the column names in the CSV files match with the column names in the tables created and the query below

c.execute('''
  
INSERT INTO ARCHIVE (Client_Name,Country_Name,Date)
     
SELECT DISTINCT

clt.Client_Name,
ctr.Country_Name,
clt.Date

FROM CLIENTS clt
LEFT JOIN COUNTRY ctr 
ON clt.Country_ID = ctr.Country_ID
 
          ''')


c.execute('''
  
INSERT INTO DAILY_STATUS (Client_Name,Country_Name,Date)
      
SELECT DISTINCT

Client_Name,
Country_Name,
Date

FROM ARCHIVE 

          ''')


c.execute('''

SELECT DISTINCT
*
FROM 
DAILY_STATUS

WHERE Date = (SELECT max(Date) FROM DAILY_STATUS)

          ''')

#print(c.fetchall())

df = DataFrame(c.fetchall(), columns=['Client_Name','Country_Name','Date'])    #convert the SQL fields to a dataframe 

print (df) # to display results after an insert query, you'll need to add above another: 'c.execute(''' SELECT * from latest table ''')

#df.to_sql('DAILY_STATUS', conn, if_exists='append', index = False) #insert the values from the INSERT QUERY into the table 'DAILY_STATUS'
df.to_sql('ARCHIVE', conn, if_exists='append', index = False) #insert the values from the INSERT QUERY into the table 'ARCHIVE'

export_csv = df.to_csv (r'C:\Users\Doron E\Desktop\Client\export_list.csv', index = None, header=True) # you may need to create an empty csv file in the destination folder (e.g.,export_list). Don't forget to add '.csv' at the end of the path (as well as r at the beg to address special characters)

 

After you run the above code in Python, you should get the following results:

 

Create a Database in Python using sqlite3

 

That’s it! If you want to learn more about connecting Python with different database applications, you can read more about that in the following articles: