Pandas is an essential library in Python for data manipulation and analysis. One of the fundamental aspects of Pandas is the DataFrame, which is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns). To effectively work with data in DataFrames, it’s crucial to understand how to access and manipulate data. One powerful way to do this is by using the iloc property.
What is iloc?
The iloc property in Pandas stands for “integer location” and is used for integer-based indexing. It lets you select data by row and column position (counting from 0) rather than by label. This makes it particularly useful when the DataFrame’s index and column labels are not numeric, or when you simply want “the first row” or “the last column” regardless of what the labels are.
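To make the position-versus-label distinction concrete, here is a minimal sketch. The `scores` frame and its reversed integer labels are made up for illustration and are not part of the examples that follow.

```python
import pandas as pd

# Hypothetical frame whose integer row labels run in reverse order.
scores = pd.DataFrame({"score": [90, 75, 82]}, index=[2, 1, 0])

# iloc counts positions: 0 is the first row in the frame (label 2).
by_position = int(scores.iloc[0, 0])

# loc looks up labels: 0 is the row labelled 0, which is the last row here.
by_label = int(scores.loc[0, "score"])

print(by_position, by_label)  # 90 82
```

When the index is the default 0, 1, 2, ... the two happen to agree for single rows, which is why the difference is easy to overlook.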
Basic Usage
The basic syntax for using iloc is:
df.iloc[row_indexer, column_indexer]
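Here:
row_indexer refers to the rows you want to select.
column_indexer refers to the columns you want to select.
Each indexer can be a single integer, a list of integers, a slice, or a boolean array. If you omit column_indexer, all columns are returned.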
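As a rough sketch of how the indexer form affects what comes back, the snippet below uses a small throwaway frame named `people` (a name chosen here for illustration only).

```python
import pandas as pd

# Small illustrative frame; the name `people` is an assumption of this sketch.
people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [24, 27, 22],
})

cell = people.iloc[1, 0]           # two integers -> a single value
row = people.iloc[1]               # one integer -> a Series
block = people.iloc[[0, 2], :]     # a list plus a slice -> a DataFrame
one_row_frame = people.iloc[[1]]   # a one-element list keeps the DataFrame shape

print(cell)                                       # Bob
print(type(row).__name__, type(block).__name__)   # Series DataFrame
print(one_row_frame.shape)                        # (1, 2)
```

Wrapping a single position in a list is a handy way to keep a DataFrame when later code expects one.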
Selecting Rows and Columns
Let’s dive into some practical examples to illustrate the versatility of the iloc property.
Example DataFrame
First, let’s create a sample DataFrame to work with:
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}
df = pd.DataFrame(data)
print(df)
This will output:
      Name  Age         City
0    Alice   24     New York
1      Bob   27  Los Angeles
2  Charlie   22      Chicago
3    David   32      Houston
4      Eve   29      Phoenix
Selecting a Single Row
To select the first row:
print(df.iloc[0])
Output:
Name       Alice
Age           24
City    New York
Name: 0, dtype: object
Selecting Multiple Rows
To select the first three rows (as with Python slices, the stop position 3 is excluded):
print(df.iloc[0:3])
Output:
      Name  Age         City
0    Alice   24     New York
1      Bob   27  Los Angeles
2  Charlie   22      Chicago
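Row selection is not limited to contiguous ranges. The following sketch rebuilds the sample DataFrame so it runs on its own, then shows a stepped slice and an explicit list of positions. The comparison with loc at the end is an addition for contrast.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [24, 27, 22, 32, 29],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

first_three = df.iloc[0:3]     # positions 0, 1, 2 (stop is excluded)
every_other = df.iloc[::2]     # positions 0, 2, 4
picked = df.iloc[[0, 3, 4]]    # explicit list of positions

print(every_other["Name"].tolist())  # ['Alice', 'Charlie', 'Eve']
print(picked["Name"].tolist())       # ['Alice', 'David', 'Eve']

# Label-based slicing with loc includes the stop label, so it returns one more row.
print(len(first_three), len(df.loc[0:3]))  # 3 4
```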
Selecting a Single Column
To select the first column:
print(df.iloc[:, 0])
Output:
0      Alice
1        Bob
2    Charlie
3      David
4        Eve
Name: Name, dtype: object
Selecting Multiple Columns
To select the first two columns:
print(df.iloc[:, 0:2])
Output:
      Name  Age
0    Alice   24
1      Bob   27
2  Charlie   22
3    David   32
4      Eve   29
Selecting Specific Rows and Columns
To select specific rows and columns (e.g., rows 0, 1 and columns 0, 2):
print(df.iloc[0:2, [0, 2]])
Output:
    Name         City
0  Alice     New York
1    Bob  Los Angeles
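The same row and column positions can also appear on the left side of an assignment, which is one way to modify data by position. This is a minimal sketch that works on a copy (named `updated` here for illustration) so the sample DataFrame stays unchanged. The replacement values are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [24, 27, 22, 32, 29],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

updated = df.copy()                # leave the original frame untouched
updated.iloc[0, 1] = 25            # row 0, column 1 ("Age")
updated.iloc[1:3, 2] = "Unknown"   # rows 1 and 2, column 2 ("City")

print(updated["Age"].tolist())     # [25, 27, 22, 32, 29]
print(updated["City"].tolist())    # ['New York', 'Unknown', 'Unknown', 'Houston', 'Phoenix']
```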
Using Negative Indexing
Negative indexing can be used to select rows or columns from the end. For example, to select the last row:
print(df.iloc[-1])
Output:
Name        Eve
Age          29
City    Phoenix
Name: 4, dtype: object
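Negative positions work in slices and on the column axis too. The sketch below also shows how out-of-range positions behave: a slice is quietly truncated, while a single integer raises IndexError. The variable names are chosen for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [24, 27, 22, 32, 29],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

last_two = df.iloc[-2:]      # last two rows
last_col = df.iloc[:, -1]    # last column ("City")
beyond = df.iloc[3:10]       # slice runs past the end: truncated, no error

print(last_two["Name"].tolist())  # ['David', 'Eve']
print(last_col.name)              # City
print(len(beyond))                # 2

# A single out-of-range position is an error rather than an empty result.
raised = False
try:
    df.iloc[10]
except IndexError:
    raised = True
print(raised)  # True
```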
Conditional Selection with iloc
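While iloc is designed for integer positions, it also accepts a boolean array with one entry per row, so you can combine it with conditions to filter data. It does not accept a boolean Series directly, because a Series carries its own index; that is why the comparison result is converted to a plain NumPy array with .values first. For example, to select rows where the age is greater than 25: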
print(df.iloc[(df['Age'] > 25).values])
Output:
    Name  Age         City
1    Bob   27  Los Angeles
3  David   32      Houston
4    Eve   29      Phoenix
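A boolean mask can be paired with column positions in the same call. The sketch below uses `to_numpy()` as an alternative to `.values` for getting a plain array, and compares the result with the label-based equivalent using loc. The names `mask`, `older`, and `older_by_label` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [24, 27, 22, 32, 29],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

mask = (df["Age"] > 25).to_numpy()   # plain boolean array, no index attached
older = df.iloc[mask, [0, 2]]        # matching rows, columns 0 and 2

# Label-based equivalent: loc accepts the boolean Series directly.
older_by_label = df.loc[df["Age"] > 25, ["Name", "City"]]

print(older["Name"].tolist())        # ['Bob', 'David', 'Eve']
print(older.equals(older_by_label))  # True
```

For purely condition-based filtering, loc or plain `df[condition]` is usually the more direct choice; the iloc form is mainly useful when the columns are known only by position.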
Conclusion
The iloc property is a flexible tool in Pandas for accessing and manipulating data within DataFrames by integer position. Whether you need to select specific rows and columns, filter with a boolean mask, or count from the end with negative indexing, iloc has you covered.
By mastering iloc, you can handle a wide range of data analysis tasks more effectively. Remember, practice is key, so try experimenting with different datasets and scenarios to become proficient in using iloc.
Happy data wrangling!