In this short guide, you’ll see the steps to select rows from a Pandas DataFrame based on specified conditions.
Steps to Select Rows from Pandas DataFrame
Step 1: Gather your data
Firstly, you’ll need to gather your data. Here is an example of data gathered about boxes:
Color | Shape | Price |
Green | Rectangle | 10 |
Green | Rectangle | 15 |
Green | Square | 5 |
Blue | Rectangle | 5 |
Blue | Square | 10 |
Red | Square | 15 |
Red | Square | 15 |
Red | Rectangle | 5 |
Step 2: Create a DataFrame
Once you have your data ready, you’ll need to create a DataFrame to capture that data in Python.
For our example, you may use the code below to create a DataFrame:
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)
print(df)
Run the code in Python and you’ll see this DataFrame:
   Color      Shape  Price
0  Green  Rectangle     10
1  Green  Rectangle     15
2  Green     Square      5
3   Blue  Rectangle      5
4   Blue     Square     10
5    Red     Square     15
6    Red     Square     15
7    Red  Rectangle      5
Step 3: Select Rows from Pandas DataFrame
You can use the following logic to select rows from a Pandas DataFrame based on specified conditions:
df.loc[df['column name'] condition]
For example, if you want to get the rows where the color is green, then you’ll need to apply:
df.loc[df['Color'] == 'Green']
Where:
- Color is the column name
- == 'Green' is the condition (the color must be equal to green)
And here is the full Python code for our example:
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)

select_color = df.loc[df['Color'] == 'Green']
print(select_color)
Once you run the code, you’ll get the rows where the color is green:
   Color      Shape  Price
0  Green  Rectangle     10
1  Green  Rectangle     15
2  Green     Square      5
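To see why this selection works, it can help to look at the condition on its own. The sketch below reuses the same DataFrame and stores the condition in a variable before passing it to df.loc. The variable name is_green is only an illustration and is not part of the original example.

```python
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)

# The comparison is evaluated for every row and yields a Series of True/False values
is_green = df['Color'] == 'Green'
print(is_green.tolist())

# Passing that boolean Series to df.loc keeps only the rows marked True
select_color = df.loc[is_green]
print(select_color)
```

The rows that survive are exactly those where the boolean Series is True, and they keep their original index labels.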
Additional Examples of Selecting Rows from Pandas DataFrame
Let’s now review additional examples to get a better sense of selecting rows from a Pandas DataFrame.
Example 1: Select rows where the price is greater than or equal to 10
To get all the rows where the price is greater than or equal to 10, you’ll need to apply this condition:
df.loc[df['Price'] >= 10]
And this is the complete Python code:
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)

select_price = df.loc[df['Price'] >= 10]
print(select_price)
Run the code, and you’ll get all the rows where the price is greater than or equal to 10:
   Color      Shape  Price
0  Green  Rectangle     10
1  Green  Rectangle     15
4   Blue     Square     10
5    Red     Square     15
6    Red     Square     15
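If you only need some of the columns for the matching rows, df.loc also accepts a column selection after a comma. The following is a minimal sketch under that assumption. The choice of the Color and Price columns is just for illustration.

```python
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)

# First argument: the row condition; second argument: the columns to keep
select_price = df.loc[df['Price'] >= 10, ['Color', 'Price']]
print(select_price)
```

The same five rows are returned, but without the Shape column.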
Example 2: Select rows where the color is green AND the shape is rectangle
Now the goal is to select rows based on two conditions:
- Color is green; and
- Shape is rectangle
You may then use the & symbol to apply multiple conditions. In our example, the code would look like this:
df.loc[(df['Color'] == 'Green') & (df['Shape'] == 'Rectangle')]
Putting everything together:
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)

color_and_shape = df.loc[(df['Color'] == 'Green') & (df['Shape'] == 'Rectangle')]
print(color_and_shape)
Run the code and you’ll get the rows with the green color and rectangle shape:
   Color      Shape  Price
0  Green  Rectangle     10
1  Green  Rectangle     15
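Note that each condition is wrapped in parentheses. In Python, & binds more tightly than ==, so leaving the parentheses out changes how the expression is grouped. One way to keep things readable is to store each condition in its own variable first. The sketch below does that. The names is_green and is_rectangle are illustrative and not part of the original example.

```python
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)

# Each condition is a boolean Series with one True/False value per row
is_green = df['Color'] == 'Green'
is_rectangle = df['Shape'] == 'Rectangle'

# & combines the two Series element by element (both must be True)
color_and_shape = df.loc[is_green & is_rectangle]
print(color_and_shape)
```

This should give the same two rows as the one-line version, since the combined condition is identical.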
Example 3: Select rows where the color is green OR the shape is rectangle
You can also select the rows based on one condition or another. For instance, you can select the rows if the color is green or the shape is rectangle.
To achieve this goal, you can use the | symbol as follows:
df.loc[(df['Color'] == 'Green') | (df['Shape'] == 'Rectangle')]
And here is the complete Python code:
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)

color_or_shape = df.loc[(df['Color'] == 'Green') | (df['Shape'] == 'Rectangle')]
print(color_or_shape)
Here is the result, where the color is green or the shape is rectangle:
   Color      Shape  Price
0  Green  Rectangle     10
1  Green  Rectangle     15
2  Green     Square      5
3   Blue  Rectangle      5
7    Red  Rectangle      5
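When the alternatives all apply to the same column, chaining several | conditions can get long. As a possible shorthand, the isin method of a Pandas Series checks each value against a list. The sketch below is an illustration only: selecting the green or blue boxes is an assumed scenario and does not appear in the examples above.

```python
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)

# isin returns True for every row whose color appears in the list
green_or_blue = df.loc[df['Color'].isin(['Green', 'Blue'])]
print(green_or_blue)

# The same selection written with the | symbol, for comparison
same_with_or = df.loc[(df['Color'] == 'Green') | (df['Color'] == 'Blue')]
print(green_or_blue.equals(same_with_or))
```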
Example 4: Select rows where the price is not equal to 15
You can use the combination of symbols != to select the rows where the price is not equal to 15:
df.loc[df['Price'] != 15]
And here is the complete Python code:
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)

not_equal_to = df.loc[df['Price'] != 15]
print(not_equal_to)
Once you run the code, you’ll get all the rows where the price is not equal to 15:
   Color      Shape  Price
0  Green  Rectangle     10
2  Green     Square      5
3   Blue  Rectangle      5
4   Blue     Square     10
7    Red  Rectangle      5
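Another way to express "not" is the ~ symbol, which flips every True/False value in a condition. The following sketch, offered as an alternative rather than as part of the original example, negates the "equal to 15" condition and compares the result with the != version. The variable name negated is hypothetical.

```python
import pandas as pd

data = {'Color': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
        'Shape': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square', 'Square', 'Square', 'Rectangle'],
        'Price': [10, 15, 5, 5, 10, 15, 15, 5]
        }

df = pd.DataFrame(data)

# ~ inverts the boolean Series, so rows where the price equals 15 are dropped
negated = df.loc[~(df['Price'] == 15)]
print(negated)

# For comparison: the same selection using !=
not_equal_to = df.loc[df['Price'] != 15]
print(negated.equals(not_equal_to))
```

The ~ form is mostly useful when the condition being negated is more complex than a single comparison.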
Finally, the Pandas documentation provides additional information about indexing and selecting data.