How to Select Rows from a Pandas DataFrame?

Selecting rows from a Pandas DataFrame is a common task in data analysis and manipulation. Pandas provides various methods to accomplish this, catering to different needs and scenarios. In this blog, we’ll explore different techniques to select rows from a DataFrame with practical examples.

This blog covers the following techniques:

  1. Selecting Rows by Label
  2. Selecting Multiple Rows by Label
  3. Selecting Rows by Position
  4. Selecting Multiple Rows by Position
  5. Selecting Rows Based on Conditions
  6. Combining Multiple Conditions
  7. Selecting Rows Using query() Method
  8. Selecting Rows with isin() Method

Importing Pandas and Creating a Sample DataFrame

First, let’s import the Pandas library and create a sample DataFrame for our examples:

Python
import pandas as pd

# Sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}

df = pd.DataFrame(data)
print(df)

This will create a DataFrame like this:

Markdown
      Name  Age         City
0    Alice   24     New York
1      Bob   27  Los Angeles
2  Charlie   22      Chicago
3    David   32      Houston
4      Eve   29      Phoenix

1. Selecting Rows by Label

You can select rows by their labels using the loc property. In the sample DataFrame the labels are the default integer index (0 through 4), so the label 2 refers to the third row.

Example
Python
# Select row with label 2
row_by_label = df.loc[2]
print(row_by_label)

Output:

Markdown
Name    Charlie
Age          22
City    Chicago
Name: 2, dtype: object

In this example, we select the row with the label (index) 2 using the loc property. Because a single label was passed, the result is a Series whose index holds the column names and whose name, 2, is the row label.
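
Labels do not have to be integers. As a minimal sketch, assuming we promote the 'Name' column to the index (a choice made here only for illustration, with the hypothetical name df_by_name), loc looks rows up by name instead:

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet runs on its own
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# Assumption for illustration: use the 'Name' column as the row labels
df_by_name = df.set_index('Name')

# loc now matches on the string label rather than on an integer
charlie = df_by_name.loc['Charlie']
print(charlie)
```

With this index, df_by_name.loc[2] would raise a KeyError, because 2 is no longer one of the labels.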

2. Selecting Multiple Rows by Label

You can also select multiple rows by providing a list of labels:

Python
# Select rows with labels 1 and 3
rows_by_labels = df.loc[[1, 3]]
print(rows_by_labels)

Output:

Markdown
    Name  Age         City
1    Bob   27  Los Angeles
3  David   32      Houston

This example selects the rows with labels 1 and 3 using the loc property. Passing a list of labels returns a DataFrame containing those rows, in the order the labels were listed.
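
A range of labels can also be selected with a slice. The sketch below, using the same sample data, shows one detail that often surprises people: unlike ordinary Python slices, a loc slice includes both endpoints.

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet runs on its own
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# Label-based slice: both the start label (1) and the end label (3) are included
rows_label_slice = df.loc[1:3]
print(rows_label_slice)
```

The result contains the rows labeled 1, 2, and 3 (Bob, Charlie, and David).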

3. Selecting Rows by Position

To select rows by their integer position, use the iloc property. Positions are zero-based, so position 0 is the first row and position 1 is the second, regardless of what the row labels are.

Example
Python
# Select row at position 1
row_by_position = df.iloc[1]
print(row_by_position)

Output:

Markdown
Name            Bob
Age              27
City    Los Angeles
Name: 1, dtype: object

In this example, we select the row at position 1 (the second row) using the iloc property. With the default index, position and label happen to coincide, which is why the resulting Series is named 1.
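
The difference between position and label becomes visible once the rows are no longer in their original order. The following sketch sorts the sample data by Age (sorted_df is a name introduced here for illustration) and then compares iloc[0] with loc[0]:

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet runs on its own
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# Sorting by Age reorders the rows but keeps their original labels
sorted_df = df.sort_values('Age')

first_by_position = sorted_df.iloc[0]  # first row in the sorted order
row_with_label_0 = sorted_df.loc[0]    # row whose label is 0

print(first_by_position['Name'])  # Charlie, the youngest
print(row_with_label_0['Name'])   # Alice, the original first row
```

After sorting, iloc follows the new row order while loc still follows the labels the rows started with.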

4. Selecting Multiple Rows by Position

Similar to labels, you can select multiple rows by providing a list of positions:

Python
# Select rows at positions 0 and 2
rows_by_positions = df.iloc[[0, 2]]
print(rows_by_positions)

Output:

Markdown
      Name  Age      City
0    Alice   24  New York
2  Charlie   22   Chicago

This example selects the rows at positions 0 and 2 using the iloc property, returning them as a DataFrame.
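
Positions can also be given as a slice. As a short sketch on the same sample data, note that iloc slices behave like regular Python slices: the end position is excluded, and negative positions count from the end.

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet runs on its own
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# Positions 0, 1, and 2 (the end position 3 is excluded)
first_three = df.iloc[0:3]

# The last two rows, counted from the end
last_two = df.iloc[-2:]

print(first_three)
print(last_two)
```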

5. Selecting Rows Based on Conditions

You can use conditional expressions to filter rows. This method is powerful for selecting rows that meet specific criteria.

Example
Python
# Select rows where Age is greater than 25
rows_condition = df[df['Age'] > 25]
print(rows_condition)

Output:

Markdown
    Name  Age         City
1    Bob   27  Los Angeles
3  David   32      Houston
4    Eve   29      Phoenix

In this example, we filtered rows where the ‘Age’ column is greater than 25. The expression df['Age'] > 25 produces a boolean Series, and indexing df with it keeps only the rows where that Series is True.
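
To make the mechanism more concrete, the sketch below stores the condition in a variable (age_mask, a name chosen here for illustration) and passes it to loc, which also lets you pick a subset of columns in the same step:

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet runs on its own
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# The condition on its own is a boolean Series, one value per row
age_mask = df['Age'] > 25
print(age_mask)

# loc accepts the mask for the rows and a list of labels for the columns
names_and_cities = df.loc[age_mask, ['Name', 'City']]
print(names_and_cities)
```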

6. Combining Multiple Conditions

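You can combine multiple conditions using the logical operators & (and), | (or), and ~ (not). Wrap each condition in parentheses, because these operators bind more tightly than comparisons such as > and !=. Python's own and, or, and not keywords do not work element-wise on a Series.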

Python
# Select rows where Age is greater than 25 and City is not 'Los Angeles'
rows_multiple_conditions = df[(df['Age'] > 25) & (df['City'] != 'Los Angeles')]
print(rows_multiple_conditions)

Output:

Markdown
    Name  Age     City
3  David   32  Houston
4    Eve   29  Phoenix

This example combines multiple conditions using logical operators to filter rows where ‘Age’ is greater than 25 and ‘City’ is not ‘Los Angeles’, showing how to use complex conditions.
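
The example above only uses &. As a brief sketch on the same sample data, the other two operators work the same way (the variable names are introduced here for illustration):

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet runs on its own
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# | (or): rows where Age is over 30 or City is 'Chicago'
over_30_or_chicago = df[(df['Age'] > 30) | (df['City'] == 'Chicago')]

# ~ (not): invert a condition to get rows where Age is NOT greater than 25
not_over_25 = df[~(df['Age'] > 25)]

print(over_30_or_chicago)
print(not_over_25)
```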

7. Selecting Rows Using query() Method

The query() method takes the condition as a string in which column names can be used directly. This is often easier to read than nested boolean masks, particularly for complex conditions.

Example
Python
# Select rows where Age is less than 30
rows_query = df.query('Age < 30')
print(rows_query)

Output:

Markdown
      Name  Age         City
0    Alice   24     New York
1      Bob   27  Los Angeles
2  Charlie   22      Chicago
4      Eve   29      Phoenix

This example uses the query() method to filter rows where ‘Age’ is less than 30, providing a more readable way to apply conditions on DataFrame columns.
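
query() strings can also combine conditions with and / or and refer to Python variables by prefixing them with @. The sketch below assumes a variable named max_age, introduced here only for illustration:

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet runs on its own
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# Hypothetical threshold held in a regular Python variable
max_age = 30

# @max_age pulls in the variable; 'and' combines the two conditions
rows_query_combined = df.query('Age < @max_age and City != "Chicago"')
print(rows_query_combined)
```

This is equivalent to df[(df['Age'] < max_age) & (df['City'] != 'Chicago')], written without the extra parentheses and repeated df references.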

8. Selecting Rows with isin() Method

The isin() method allows you to filter rows based on whether a column’s values are in a provided list.

Example
Python
# Select rows where City is either 'Chicago' or 'Houston'
rows_isin = df[df['City'].isin(['Chicago', 'Houston'])]
print(rows_isin)

Output:

Markdown
      Name  Age     City
2  Charlie   22  Chicago
3    David   32  Houston

This example uses the isin() method to select rows where the ‘City’ column matches either ‘Chicago’ or ‘Houston’, showing how to filter rows based on a list of values.
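
Because isin() returns a boolean Series, it can be inverted with ~ to exclude a list of values instead. A minimal sketch on the same sample data:

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet runs on its own
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# Keep every row whose City is NOT 'Chicago' or 'Houston'
rows_not_isin = df[~df['City'].isin(['Chicago', 'Houston'])]
print(rows_not_isin)
```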

Conclusion

Selecting rows from a Pandas DataFrame is a fundamental operation for data manipulation and analysis. Pandas provides several methods to achieve this, each suited for different scenarios. Whether you need to select rows by labels, positions, conditions, or specific values, Pandas has you covered.

By mastering these techniques, you’ll be able to efficiently filter and manipulate your data, making your data analysis tasks more effective and streamlined.


I hope you find this blog post helpful. If you have any questions or suggestions, please leave a comment below.
