8 ways to apply LEFT, RIGHT, MID in Pandas

At times, you may need to extract specific characters from a string. You can apply the concepts of Left, Right, and Mid in Pandas to obtain the characters you need.

In this tutorial, you’ll see the following 8 scenarios that describe how to extract specific characters:

  1. From the left
  2. From the right
  3. From the middle
  4. Before a symbol
  5. Before a space
  6. After a symbol
  7. Between identical symbols
  8. Between different symbols

Reviewing LEFT, RIGHT, MID in Pandas

For each of the above scenarios, the goal is to extract only the digits within the string. For example, for the string ‘55555-abc’, the goal is to extract only the digits 55555.

Let’s now review the first case of obtaining only the digits from the left.

Scenario 1: Extract Characters From the Left

Suppose that you have the following 3 strings:

Identifier
55555-abc
77777-xyz
99999-mmm

You can capture those strings in Python using a Pandas DataFrame.

Since you’re only interested in extracting the five digits from the left, you may then apply the syntax of str[:5] to the ‘Identifier’ column:

import pandas as pd

data = {'Identifier': ['55555-abc', '77777-xyz', '99999-mmm']}
df = pd.DataFrame(data, columns=['Identifier'])

# Take the first five characters of each string
left = df['Identifier'].str[:5]

print(left)

Once you run the Python code, you’ll get only the digits from the left:

0    55555
1    77777
2    99999
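If you’d rather keep the extracted digits next to the original values, the same slice can be written to a new column. The sketch below is only an illustration: the column name 'Left' is an assumption made for this example, and str.slice(stop=5) is used as the method form of str[:5].

```python
import pandas as pd

df = pd.DataFrame({'Identifier': ['55555-abc', '77777-xyz', '99999-mmm']})

# str.slice(stop=5) takes the first five characters, just like str[:5]
# 'Left' is a hypothetical column name chosen for this example
df['Left'] = df['Identifier'].str.slice(stop=5)

print(df)
```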

Scenario 2: Extract Characters From the Right

In this scenario, the goal is to get the five digits from the right:

Identifier
ID-55555
ID-77777
ID-99999

To accomplish this goal, apply str[-5:] to the ‘Identifier’ column:

import pandas as pd

data = {'Identifier': ['ID-55555', 'ID-77777', 'ID-99999']}
df = pd.DataFrame(data, columns=['Identifier'])

# Take the last five characters of each string
right = df['Identifier'].str[-5:]

print(right)

This will ensure that you’ll get the five digits from the right:

0    55555
1    77777
2    99999
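Keep in mind that the slice returns text, so the values above are strings rather than numbers. If you need to do arithmetic on them, one possible approach is to convert the result with astype(int). This is a minimal sketch, and the variable name right_numbers is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({'Identifier': ['ID-55555', 'ID-77777', 'ID-99999']})

# The slice returns text, so each value is still a string at this point
right = df['Identifier'].str[-5:]

# Convert the text to integers (this raises an error if a value is not numeric)
right_numbers = right.astype(int)

print(right_numbers)
```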

Scenario 3: Extract Characters From the Middle

There are cases where you may need to extract the data from the middle of a string:

Identifier
ID-55555-End
ID-77777-End
ID-99999-End

To extract only the digits from the middle, you’ll need to specify the starting and ending points for your desired characters. Positions are counted from 0, and the ending point itself is not included. In this case, the first digit sits at position 3 and the last digit at position 7, so you’ll need to apply str[3:8] as follows:

import pandas as pd

data = {'Identifier': ['ID-55555-End', 'ID-77777-End', 'ID-99999-End']}
df = pd.DataFrame(data, columns=['Identifier'])

# Take the characters at positions 3 through 7 (position 8 is excluded)
mid = df['Identifier'].str[3:8]

print(mid)

Only the five digits within the middle of the string will be retrieved:

0    55555
1    77777
2    99999
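If you’re not sure which starting and ending points to use, it may help to print the position of each character in one sample string. The sketch below does that with plain Python, and then applies the same slice through str.slice, the method form of str[3:8]. The variable names text and identifiers are assumptions made for this example.

```python
import pandas as pd

text = 'ID-55555-End'

# Plain Python slicing: start at position 3, stop before position 8
print(text[3:8])

# Show the position of each character, counted from 0
for position, character in enumerate(text):
    print(position, character)

# The pandas accessor applies the same slice to every row
identifiers = pd.Series(['ID-55555-End', 'ID-77777-End', 'ID-99999-End'])
mid = identifiers.str.slice(start=3, stop=8)

print(mid)
```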

Scenario 4: Before a symbol

Say that you want to obtain all the digits before the dash symbol (‘-’):

Identifier
111-IDAA
2222222-IDB
33-IDCCC

Even if your string length changes, you can still retrieve all the digits from the left by combining the two components below:

  • str.split('-') – splits each string at the symbol placed within the brackets. In our case, it is the dash symbol
  • str[0] – takes the first piece of the split, which is everything to the left of the first dash

And here is the complete Python code:

import pandas as pd

data = {'Identifier': ['111-IDAA', '2222222-IDB', '33-IDCCC']}
df = pd.DataFrame(data, columns=['Identifier'])

# Split each string at the dash, then keep the first piece
before_symbol = df['Identifier'].str.split('-').str[0]

print(before_symbol)

And the result:

0        111
1    2222222
2         33
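If you need both sides of the dash rather than just the left one, a possible variation is to split once and expand the pieces into separate columns. This is a sketch under the assumption that each string should be split at the first dash only (n=1); the column names 'Digits' and 'Code' are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'Identifier': ['111-IDAA', '2222222-IDB', '33-IDCCC']})

# n=1 splits at the first dash only; expand=True returns one column per piece
parts = df['Identifier'].str.split('-', n=1, expand=True)

# 'Digits' and 'Code' are hypothetical column names chosen for this example
parts.columns = ['Digits', 'Code']

print(parts)
```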

Scenario 5: Before a space

What if you have a space within the string?

Identifier
111 IDAA
2222222 IDB
33 IDCCC

In that case, simply leave a blank space within the split: str.split(' ')

import pandas as pd

data = {'Identifier': ['111 IDAA', '2222222 IDB', '33 IDCCC']}
df = pd.DataFrame(data, columns=['Identifier'])

# Split each string at the space, then keep the first piece
before_space = df['Identifier'].str.split(' ').str[0]

print(before_space)

Only the digits from the left will be obtained:

0        111
1    2222222
2         33
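Real data is not always this tidy. If a value starts with a space, or uses a tab or a double space as the separator, splitting on a single ' ' may not give you the digits. One option is to call str.split() with no argument, which splits on any run of whitespace. The messy strings below are made up for this example.

```python
import pandas as pd

# Hypothetical messy values: a leading space, a tab, and a double space
data = {'Identifier': [' 111 IDAA', '2222222\tIDB', '33  IDCCC']}
df = pd.DataFrame(data)

# With no argument, split() treats any run of whitespace as one separator
# and ignores whitespace at the start of the string
before_whitespace = df['Identifier'].str.split().str[0]

print(before_whitespace)
```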

Scenario 6: After a symbol

You may also face situations where you’d like to get all the characters after a symbol (such as the dash symbol) for varying-length strings:

Identifier
IDAA-111
IDB-2222222
IDCCC-33

In this case, you’ll need to adjust the value within the str[] to 1, so that you’ll take the second piece of the split, which holds the digits to the right of the dash:

import pandas as pd

data = {'Identifier': ['IDAA-111', 'IDB-2222222', 'IDCCC-33']}
df = pd.DataFrame(data, columns=['Identifier'])

# Split each string at the dash, then keep the second piece
after_symbol = df['Identifier'].str.split('-').str[1]

print(after_symbol)

Here is the output from Python:

0        111
1    2222222
2         33
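Note that str[1] returns the second piece of the split, which is only the last piece when there is exactly one dash. If some of your strings may contain more than one dash and you want whatever follows the last one, a possible alternative is to split from the right with str.rsplit. The strings below are made up for this example.

```python
import pandas as pd

# Hypothetical values where the number of dashes varies
data = {'Identifier': ['ID-AA-111', 'IDB-2222222', 'ID-C-CC-33']}
df = pd.DataFrame(data)

# rsplit with n=1 splits once, starting from the right,
# so the last piece is everything after the final dash
after_last_symbol = df['Identifier'].str.rsplit('-', n=1).str[-1]

print(after_last_symbol)
```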

Scenario 7: Between identical symbols

Now what if you want to retrieve the values between two identical symbols (such as the dash symbols) for varying-length strings?

Identifier
IDAA-111-AA
IDB-2222222-B
IDCCC-33-CCC

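In that case, set:

  • str.split('-') – to break each string into three pieces at the two dashes
  • str[1] – to take the second piece, which is the one between the dashes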

So your full Python code would look like this:

import pandas as pd

data = {'Identifier': ['IDAA-111-AA', 'IDB-2222222-B', 'IDCCC-33-CCC']}
df = pd.DataFrame(data, columns=['Identifier'])

# Split each string at the dashes, then keep the middle piece
between_two_symbols = df['Identifier'].str.split('-').str[1]

print(between_two_symbols)

You’ll get all the digits between the two dash symbols:

0        111
1    2222222
2         33
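It’s also worth knowing what happens when a string doesn’t contain the symbol at all. In that case the split produces a single piece, so there is no second piece to take, and the result for that row is a missing value (NaN) rather than an error. The value 'NODASH' below is made up for this example.

```python
import pandas as pd

# 'NODASH' is a hypothetical value that contains no dash
data = {'Identifier': ['IDAA-111-AA', 'NODASH', 'IDCCC-33-CCC']}
df = pd.DataFrame(data)

# Rows without a second piece come back as NaN
between_two_symbols = df['Identifier'].str.split('-').str[1]
print(between_two_symbols)

# isna() flags the rows where nothing could be extracted
print(between_two_symbols.isna())
```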

Scenario 8: Between different symbols

For the final scenario, the goal is to obtain the digits between two different symbols (the dash symbol and the dollar symbol):

Identifier
IDAA-111$AA
IDB-2222222$B
IDCCC-33$CCC

To accomplish this goal:

  • First, set the variable (i.e., between_two_different_symbols) to obtain all the characters after the dash symbol
  • Then, set the same variable to obtain all the characters before the dollar symbol

This is how your code would look:

import pandas as pd

data = {'Identifier': ['IDAA-111$AA', 'IDB-2222222$B', 'IDCCC-33$CCC']}
df = pd.DataFrame(data, columns=['Identifier'])

# Step 1: keep everything after the dash
between_two_different_symbols = df['Identifier'].str.split('-').str[1]
# Step 2: from that result, keep everything before the dollar sign
between_two_different_symbols = between_two_different_symbols.str.split('$').str[0]

print(between_two_different_symbols)

And the result:

0        111
1    2222222
2         33
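As an alternative to splitting twice, the same digits can be pulled out in a single step with str.extract and a regular expression. This is a sketch rather than a replacement for the approach above, and it assumes that the value between the two symbols consists of digits only. The variable name between is an assumption made for this example.

```python
import pandas as pd

data = {'Identifier': ['IDAA-111$AA', 'IDB-2222222$B', 'IDCCC-33$CCC']}
df = pd.DataFrame(data)

# -      a literal dash
# (\d+)  one or more digits, captured as the result
# \$     a literal dollar sign (escaped, since $ is special in a regex)
# expand=False returns a Series instead of a one-column DataFrame
between = df['Identifier'].str.extract(r'-(\d+)\$', expand=False)

print(between)
```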

Conclusion – LEFT, RIGHT, MID in Pandas

You just saw how to apply Left, Right, and Mid in Pandas. The concepts reviewed in this tutorial can be applied across a large number of different scenarios.

You can find many more examples of working with text data by visiting the Pandas Documentation.