8 ways to apply LEFT, RIGHT, MID in Pandas

In this short guide, you’ll see how to extract specific characters within a string using Pandas.

The goal, in each of the 8 scenarios below, is to extract only the digits within a string:

  1. From the left
  2. From the right
  3. From the middle
  4. Before a symbol
  5. Before a space
  6. After a symbol
  7. Between identical symbols
  8. Between different symbols

(1) Extract the five digits from the left using str[:5]:

import pandas as pd

data = {"Identifier": ["55555-abc", "77777-xyz", "99999-mmm"]}
df = pd.DataFrame(data)

df["Identifier"] = df["Identifier"].str[:5]

print(df)

The result:

  Identifier
0      55555
1      77777
2      99999

(2) Extract the five digits from the right using str[-5:]:

import pandas as pd

data = {"Identifier": ["ID-55555", "ID-77777", "ID-99999"]}
df = pd.DataFrame(data)

df["Identifier"] = df["Identifier"].str[-5:]

print(df)

The result:

  Identifier
0      55555
1      77777
2      99999

(3) Extract the five digits from the middle using str[3:8]:

import pandas as pd

data = {"Identifier": ["ID-55555-End", "ID-77777-End", "ID-99999-End"]}
df = pd.DataFrame(data)

df["Identifier"] = df["Identifier"].str[3:8]

print(df)

The result:

  Identifier
0      55555
1      77777
2      99999
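The three slices above can also be written with the str.slice method, which takes the same start and stop positions as keyword arguments. The following is a minimal sketch; the column names left_id, right_id and mid_id are made up for illustration and are not part of the examples above:

```python
import pandas as pd

# Hypothetical columns, one per slicing scenario
df = pd.DataFrame({
    "left_id": ["55555-abc", "77777-xyz"],
    "right_id": ["ID-55555", "ID-77777"],
    "mid_id": ["ID-55555-End", "ID-77777-End"],
})

left = df["left_id"].str.slice(stop=5)         # same as .str[:5]
right = df["right_id"].str.slice(start=-5)     # same as .str[-5:]
mid = df["mid_id"].str.slice(start=3, stop=8)  # same as .str[3:8]

print(left.tolist(), right.tolist(), mid.tolist())
```

Both spellings assume the digits always sit at the same positions. If the length of the prefix or suffix varies, the split-based approaches are usually a better fit.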

(4) Extract the digits before a symbol ("-") using str.split("-").str[0]:

import pandas as pd

data = {"Identifier": ["111-AA", "2222222-BB", "33-CC"]}
df = pd.DataFrame(data)

df["Identifier"] = df["Identifier"].str.split("-").str[0]

print(df)

The result:

  Identifier
0        111
1    2222222
2         33

(5) Extract the digits before a space (" ") using str.split(" ").str[0]:

import pandas as pd

data = {"Identifier": ["111 AA", "2222222 BB", "33 CC"]}
df = pd.DataFrame(data)

df["Identifier"] = df["Identifier"].str.split(" ").str[0]

print(df)

The result:

  Identifier
0        111
1    2222222
2         33
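Splitting on a single space works when the data is clean. As a sketch of what may happen otherwise, the example below uses made-up values with a leading space and a tab, and compares str.split(" ") with str.split() called without a pattern, which splits on any run of whitespace:

```python
import pandas as pd

# Hypothetical messy values: a leading space, and a tab instead of a space
s = pd.Series([" 111 AA", "33\tCC"])

by_space = s.str.split(" ").str[0]    # splits on the literal space only
by_whitespace = s.str.split().str[0]  # splits on any run of whitespace

print(by_space.tolist())
print(by_whitespace.tolist())
```

With the literal space, the first value yields an empty string and the second is returned unsplit. Without a pattern, both values yield just the digits.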

(6) Extract the digits after a symbol ("-") using str.split("-").str[1]:

import pandas as pd

data = {"Identifier": ["AA-111", "BB-2222222", "CC-33"]}
df = pd.DataFrame(data)

df["Identifier"] = df["Identifier"].str.split("-").str[1]

print(df)

The result:

  Identifier
0        111
1    2222222
2         33
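The example above assumes every value contains exactly one "-". The sketch below, using made-up values, shows what to expect when the symbol is missing or appears more than once, and how the n parameter of str.split limits the number of splits:

```python
import pandas as pd

# Hypothetical values: one dash, no dash, two dashes
s = pd.Series(["AA-111", "BB222", "CC-33-44"])

after_dash = s.str.split("-").str[1]            # second piece of every split
after_first = s.str.split("-", n=1).str[1]      # split only at the first dash

print(after_dash)
print(after_first)
```

A value without the symbol has no second piece, so the result for that row is NaN rather than an error. With n=1, everything after the first "-" is kept together, so "CC-33-44" yields "33-44" instead of "33".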

(7) Extract the digits between identical symbols ("-") using str.split("-").str[1]:

import pandas as pd

data = {"Identifier": ["AA-111-AA", "BB-2222222-B", "CC-33-CCC"]}
df = pd.DataFrame(data)

df["Identifier"] = df["Identifier"].str.split("-").str[1]

print(df)

The result:

  Identifier
0        111
1    2222222
2         33

(8) Extract the digits between different symbols ("-" and "$") by chaining two splits, str.split("-").str[1] followed by str.split("$").str[0]:

import pandas as pd

data = {"Identifier": ["AA-111$AA", "BB-2222222$B", "CC-33$CCC"]}
df = pd.DataFrame(data)

df["Identifier"] = df["Identifier"].str.split("-").str[1]
df["Identifier"] = df["Identifier"].str.split("$").str[0]

print(df)

The result:

  Identifier
0        111
1    2222222
2         33
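As an alternative to chaining two splits, the same digits can be pulled out in one step with str.extract and a regular expression. This is a sketch of a different technique rather than the method used above; the pattern assumes the digits always sit between a "-" and a "$":

```python
import pandas as pd

data = {"Identifier": ["AA-111$AA", "BB-2222222$B", "CC-33$CCC"]}
df = pd.DataFrame(data)

# "-" then one or more digits (captured) then a literal "$".
# The "$" is escaped because it is a special character in regular expressions.
# expand=False returns a Series instead of a one-column DataFrame.
df["Identifier"] = df["Identifier"].str.extract(r"-(\d+)\$", expand=False)

print(df)
```

Rows that do not match the pattern come back as NaN, which makes malformed values easy to spot. Note that the extracted digits are still strings; pd.to_numeric can convert them if numbers are needed.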

You can find more examples of working with text data in the Pandas Documentation.
