There is more than one way to import a CSV file to SQL Server using Python. In this guide, I’ll show you a technique to import your data using the following 2 Python libraries:
- Pandas – used to import the CSV file into Python and create a DataFrame
- Pyodbc – used to connect Python to SQL Server
Steps to Import a CSV file to SQL Server using Python
Step 1: Prepare the CSV File
To begin, prepare the CSV file that you’d like to import to SQL Server.
For example, I prepared a simple CSV file with the following data:
Where:
- The CSV file name is ‘People’
- The CSV file is stored on my computer under the following path: C:\Users\Ron\Desktop\Test
Step 2: Import the CSV File into the DataFrame
Next, import the CSV file into Python using the pandas library.
Here is the code that I used to import the CSV file, and then create the DataFrame. You’ll need to change the path name to reflect the location where the CSV file is stored on your computer.
import pandas as pd data = pd.read_csv (r'C:\Users\Ron\Desktop\Test\People.csv') df = pd.DataFrame(data, columns= ['Name','Country','Age']) print(df)
This is how the DataFrame would look like in Python:
Step 3: Connect Python to SQL Server
Now, connect Python to SQL Server.
To connect Python to SQL server, you’ll need the:
- Server Name. In my case, the server name is: RON\SQLEXPRESS
- Database Name. In my case, the database name is: TestDB
For our example, the connection string to SQL server would look like this:
import pyodbc conn = pyodbc.connect('Driver={SQL Server};' 'Server=RON\SQLEXPRESS;' 'Database=TestDB;' 'Trusted_Connection=yes;') cursor = conn.cursor()
You may wish to check the following guide that explains the full steps to connect Python to SQL Server using pyodbc.
Step 4: Create a Table in SQL Server using Python
Next, add the syntax to create the table in SQL Server. This table will be used to store the imported data from the CSV file.
For example, you can add the following syntax to create the ‘people_info‘ table:
cursor.execute('CREATE TABLE people_info (Name nvarchar(50), Country nvarchar(50), Age int)')
Note that whenever you run the code to create a table. You should only use that piece of the code once. Otherwise, you’ll get the error below:
ProgrammingError: (’42S01′, “[42S01] [Microsoft][ODBC SQL Server Driver][SQL Server]There is already an object named ‘people_info’ in the database. (2714) (SQLExecDirectW)”)
Step 5: Insert the DataFrame Data into the Table
Here is the syntax to insert the DataFrame data (from step-2) into the people_info table:
for row in df.itertuples(): cursor.execute(''' INSERT INTO TestDB.dbo.people_info (Name, Country, Age) VALUES (?,?,?) ''', row.Name, row.Country, row.Age ) conn.commit()
And here is the entire code to import the CSV file to SQL Server using Python:
import pandas as pd import pyodbc # Import CSV data = pd.read_csv (r'C:\Users\Ron\Desktop\Test\People.csv') df = pd.DataFrame(data, columns= ['Name','Country','Age']) # Connect to SQL Server conn = pyodbc.connect('Driver={SQL Server};' 'Server=RON\SQLEXPRESS;' 'Database=TestDB;' 'Trusted_Connection=yes;') cursor = conn.cursor() # Create Table cursor.execute('CREATE TABLE people_info (Name nvarchar(50), Country nvarchar(50), Age int)') # Insert DataFrame to Table for row in df.itertuples(): cursor.execute(''' INSERT INTO TestDB.dbo.people_info (Name, Country, Age) VALUES (?,?,?) ''', row.Name, row.Country, row.Age ) conn.commit()
Run the code in Python (after making the adjustment to the path where your CSV file is stored, as well as making the change to your database connection info).
You will then see the new table in SQL Server (after few seconds when you perform a refresh in SQL Server):
Step 6: Perform a Test
Let’s run a simple query to check that the values from the CSV file got imported into SQL Server:
select * from TestDB.dbo.people_info
That should be indeed the case: