How to Convert Strings to Datetime in Pandas DataFrame

You may use this template in order to convert strings to datetime in Pandas DataFrame:

df['DataFrame Column'] = pd.to_datetime(df['DataFrame Column'], format=specify your format)

Note that the strings must match the format specified. Later, you’ll see several scenarios for different formats.

Steps to Convert Strings to Datetime in Pandas DataFrame

Step 1: Collect the Data to be Converted

To begin, collect the data that you’d like to convert to datetime.

For example, here is a simple dataset about 3 different dates (with a format of yyyymmdd), when a store might be opened or closed:

DatesStatus
20190902Opened
20190913Opened
20190921Closed

Step 2: Create the DataFrame

Next, create a DataFrame to capture the above data in Python. You can capture the dates as strings by placing quotes around the values under the ‘dates’ column:

import pandas as pd

values = {'dates':  ['20190902','20190913','20190921'],
          'status': ['Opened','Opened','Closed']
          }

df = pd.DataFrame(values, columns = ['dates','status'])

print (df)
print (df.dtypes)

Run the code in Python, and you’ll get this DataFrame:

How to Convert Strings to Datetime in Pandas DataFrame

Notice that the ‘dates’ were indeed stored as strings (represented by object).

Step 3: Convert the Strings to Datetime in the DataFrame

You may then use the template below in order to convert the strings to datetime in Pandas DataFrame:

df['DataFrame Column'] = pd.to_datetime(df['DataFrame Column'], format=specify your format)

Recall that for our example, the date format is yyyymmdd.

This date format can be represented as:

format='%Y%m%d'

Note that the strings data (yyyymmdd) must match the format specified (%Y%m%d). You may refer to the following source for the different formats that you may apply.

For our example, the complete Python code to convert the strings to datetime would be:

import pandas as pd

values = {'dates':  ['20190902','20190913','20190921'],
          'status': ['Opened','Opened','Closed']
          }

df = pd.DataFrame(values, columns = ['dates','status'])

df['dates'] = pd.to_datetime(df['dates'], format='%Y%m%d')

print (df)
print (df.dtypes)

You’ll see that the data type for the ‘dates’ column is now datetime:

How to Convert Strings to Datetime in Pandas DataFrame

Note that when applying pd.to_datetime, the default format is yyyymmdd. So in the above particular example, you could remove the format =’%Y%m%d’ from the code. However, in other scenarios, as you’ll see below, you must specify the correct format to match with the strings data.

Converting Additional Formats

Let’s say that the dates are now formatted as ddmmyyyy:

DatesStatus
02092019Opened
13092019Opened
21092019Closed

In that case, you’ll need to apply the format below (for reference, check the following table to identify the correct format that you should apply):

format='%d%m%Y'

Here is the complete Python code:

import pandas as pd

values = {'dates':  ['02092019','13092019','21092019'],
          'status': ['Opened','Opened','Closed']
          }

df = pd.DataFrame(values, columns = ['dates','status'])

df['dates'] = pd.to_datetime(df['dates'], format='%d%m%Y')

print (df)
print (df.dtypes)

As before, your strings will now get converted to datetime:

Convert Strings to Datetime in Pandas DataFrame

What if your dates have a ddmmmyyyy format (e.g., 02Sep2019)?

You’ll then need to apply the format below (by changing the ‘m’ to ‘b’):

format='%d%b%Y' 

So your complete Python code would look like this:

import pandas as pd

values = {'dates':  ['02Sep2019','13Sep2019','21Sep2019'],
          'status': ['Opened','Opened','Closed']
          }

df = pd.DataFrame(values, columns = ['dates','status'])

df['dates'] = pd.to_datetime(df['dates'], format='%d%b%Y')

print (df)
print (df.dtypes)

You’ll now get the datetime format:

dataframe with dates

Let’s say that your dates now contain dashes (e.g., ’02-Sep-2019′) .

In that case, simply add those dashes as follows:

format='%d-%b-%Y'

Here is the full Python code:

import pandas as pd

values = {'dates':  ['02-Sep-2019','13-Sep-2019','21-Sep-2019'],
          'status': ['Opened','Opened','Closed']
          }

df = pd.DataFrame(values, columns = ['dates','status'])

df['dates'] = pd.to_datetime(df['dates'], format='%d-%b-%Y')

print (df)
print (df.dtypes)

And the result:

dataframe containing dates

Formats with Timestamps

Suppose that your strings contain both the dates and times:

DatesStatus
20190902093000Opened
20190913093000Opened
20190921200000Closed

In that case, the format that should be specified is:

format='%Y%m%d%H%M%S'

So the full Python code would be:

import pandas as pd

values = {'dates':  ['20190902093000','20190913093000','20190921200000'],
          'status': ['Opened','Opened','Closed']
          }

df = pd.DataFrame(values, columns = ['dates','status'])

df['dates'] = pd.to_datetime(df['dates'], format='%Y%m%d%H%M%S')

print (df)
print (df.dtypes)

You’ll now see the datetime format:

datetime format

Now let’s say that the strings contain characters, such as the dash character (“-“) to separate between the date and the time:

DatesStatus
20190902-093000Opened
20190913-093000Opened
20190921-200000Closed

In that scenario, the format should include the dash as well:

format='%Y%m%d-%H%M%S'

Here is the full Python code:

import pandas as pd

values = {'dates':  ['20190902-093000','20190913-093000','20190921-200000'],
          'status': ['Opened','Opened','Closed']
          }

df = pd.DataFrame(values, columns = ['dates','status'])

df['dates'] = pd.to_datetime(df['dates'], format='%Y%m%d-%H%M%S')

print (df)
print (df.dtypes)

And the result:

Example of datetime