The truncate() method in Pandas trims a DataFrame by dropping the rows or columns that fall before and after specified index labels. It is useful for narrowing a DataFrame to a contiguous range of labels, such as a date window. Note that it works on index labels rather than integer positions, and it expects the index along the chosen axis to be sorted.
Syntax
DataFrame.truncate(before=None, after=None, axis=None, copy=True)
Parameters
- before: date, str, or int. Truncate all rows or columns before this index value; the value itself is kept.
- after: date, str, or int. Truncate all rows or columns after this index value; the value itself is kept.
- axis: {0 or ‘index’, 1 or ‘columns’}, default 0. The axis to truncate. Truncate rows (0 or ‘index’) or columns (1 or ‘columns’).
- copy: bool, default True. Return a copy of the truncated section.
Returns
- truncated: DataFrame. A new DataFrame with the requested index labels removed.
Examples
Let’s dive into some examples to understand how the truncate() method works.
1. Truncating Rows in a DataFrame
Consider a DataFrame with dates as index:
import pandas as pd
import numpy as np
# Create a date range and a DataFrame
dates = pd.date_range('20230101', periods=10)
df = pd.DataFrame(np.random.randn(10, 4), index=dates, columns=list('ABCD'))
print("Original DataFrame:\n", df)
# Truncate the DataFrame
truncated_df = df.truncate(before='2023-01-03', after='2023-01-07')
print("\nTruncated DataFrame:\n", truncated_df)
Output:
Original DataFrame:
A B C D
2023-01-01 -0.293685 0.859982 0.241670 -0.512058
2023-01-02 0.076887 0.277912 0.152936 0.089904
2023-01-03 0.395732 0.003297 0.715594 1.116074
2023-01-04 0.433142 0.138547 0.157818 0.342425
2023-01-05 -0.602209 0.361783 1.211707 -0.225115
2023-01-06 1.122211 -0.519022 -0.091196 0.178236
2023-01-07 -0.773126 0.270019 -0.562042 0.823486
2023-01-08 -0.379472 1.025029 0.233836 -1.053676
2023-01-09 0.244112 -0.576659 -0.254178 -1.022299
2023-01-10 -0.609700 0.398504 1.499126 -0.731157
Truncated DataFrame:
A B C D
2023-01-03 0.395732 0.003297 0.715594 1.116074
2023-01-04 0.433142 0.138547 0.157818 0.342425
2023-01-05 -0.602209 0.361783 1.211707 -0.225115
2023-01-06 1.122211 -0.519022 -0.091196 0.178236
2023-01-07 -0.773126 0.270019 -0.562042 0.823486
In this example, the truncate() method removes all rows before 2023-01-03 and after 2023-01-07. The result is a DataFrame containing only the rows from 2023-01-03 to 2023-01-07, with both endpoints included.
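On a sorted index, the same rows can be selected with label-based slicing through .loc, which makes the inclusive bounds easy to verify. The following is a minimal sketch under that assumption; the frame name demo and its values are illustrative and are not taken from the example above.

```python
import numpy as np
import pandas as pd

# Illustrative frame: 10 daily rows with predictable (non-random) values
dates = pd.date_range("2023-01-01", periods=10)
demo = pd.DataFrame(np.arange(40).reshape(10, 4), index=dates, columns=list("ABCD"))

truncated = demo.truncate(before="2023-01-03", after="2023-01-07")
sliced = demo.loc["2023-01-03":"2023-01-07"]

# Both endpoints are kept, so five rows remain
print(len(truncated))
# The label-based slice selects the same rows
print(truncated.equals(sliced))
```

Using fixed values instead of np.random.randn keeps the comparison reproducible from run to run.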
2. Handling Non-DateTime Indices
The truncate() method can also work with non-datetime indices. Here’s an example using a simple integer index:
# Create a DataFrame with integer index
df = pd.DataFrame(np.random.randn(10, 4), index=range(10), columns=list('ABCD'))
print("Original DataFrame:\n", df)
# Truncate the DataFrame
truncated_df = df.truncate(before=3, after=7)
print("\nTruncated DataFrame:\n", truncated_df)
Output:
Original DataFrame:
A B C D
0 -0.238380 0.235593 -1.187904 -0.779957
1 1.181031 0.535377 -1.099986 0.182283
2 0.134791 -0.341307 -0.568430 -0.728867
3 0.759418 -1.102792 0.168233 -0.021658
4 -0.422572 -1.536936 -0.242277 0.034944
5 0.053508 -1.255768 -0.434728 0.274743
6 -1.532616 -0.069097 0.103609 -0.085357
7 -0.397131 -1.455381 -0.466245 -0.019519
8 0.032062 -1.113439 0.575993 -0.760727
9 0.589331 0.754479 -0.496957 0.269780
Truncated DataFrame:
A B C D
3 0.759418 -1.102792 0.168233 -0.021658
4 -0.422572 -1.536936 -0.242277 0.034944
5 0.053508 -1.255768 -0.434728 0.274743
6 -1.532616 -0.069097 0.103609 -0.085357
7 -0.397131 -1.455381 -0.466245 -0.019519
In this case, the method removes all rows before index 3 and after index 7.
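Because the index in that example runs from 0 to 9, labels and positions happen to coincide. The sketch below uses a hypothetical frame whose integer labels differ from the row positions, to show that before and after are matched against labels, that either bound can be omitted, and that a bound does not have to be an existing label when the index is sorted.

```python
import pandas as pd

# Illustrative frame whose integer labels differ from row positions
demo = pd.DataFrame({"value": [1, 2, 3, 4, 5]}, index=[10, 20, 30, 40, 50])

# Labels 20 through 40 are kept (there are no positions 20-40 in this frame)
middle = demo.truncate(before=20, after=40)

# Only one bound is needed; the other side is left untouched
tail = demo.truncate(before=30)
head = demo.truncate(after=30)

# A bound that is not an existing label still works on a sorted index
from_25 = demo.truncate(before=25)

print(list(middle.index))
print(list(tail.index))
print(list(head.index))
print(list(from_25.index))
```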
3. Truncating Columns in a DataFrame
You can also truncate columns by setting the axis parameter:
# Create a DataFrame
df = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))
print("Original DataFrame:\n", df)
# Truncate the DataFrame
truncated_df = df.truncate(before='B', after='C', axis=1)
print("\nTruncated DataFrame:\n", truncated_df)
Output:
Original DataFrame:
A B C D
0 0.870720 -0.032728 0.123983 -1.125678
1 -0.091313 1.130923 0.423292 -0.692537
2 0.297814 0.025172 -0.235482 0.468553
3 -0.448503 1.242169 0.177493 -0.319169
4 0.096072 0.029631 -0.101747 -1.273148
5 -1.082981 -0.027579 0.457487 -0.601437
6 -1.405650 -0.382135 -0.463029 1.110195
7 1.342728 -0.380145 0.324506 0.201904
8 0.164034 0.038417 0.424095 0.231002
9 0.369102 -0.847314 -0.108466 -0.369654
Truncated DataFrame:
B C
0 -0.032728 0.123983
1 1.130923 0.423292
2 0.025172 -0.235482
3 1.242169 0.177493
4 0.029631 -0.101747
5 -0.027579 0.457487
6 -0.382135 -0.463029
7 -0.380145 0.324506
8 0.038417 0.424095
9 -0.847314 -0.108466
Here, the truncate() method removes all columns before column B and after column C. The result is a DataFrame containing only the columns B and C.
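The axis can be given either as 1 or as 'columns', and on sorted column labels the result matches a label-based column slice. A small sketch with an illustrative frame (demo and its values are assumptions made for the example):

```python
import numpy as np
import pandas as pd

# Illustrative frame with sorted column labels and predictable values
demo = pd.DataFrame(np.arange(12).reshape(3, 4), columns=list("ABCD"))

by_number = demo.truncate(before="B", after="C", axis=1)
by_name = demo.truncate(before="B", after="C", axis="columns")
sliced = demo.loc[:, "B":"C"]

print(list(by_number.columns))
print(by_number.equals(by_name))
print(by_number.equals(sliced))
```

The column values in the result are taken unchanged from the original frame; only the set of columns shrinks.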
4. Using the copy Parameter
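The copy parameter in the truncate() method determines whether the truncated DataFrame is a copy of the original DataFrame or may share data with it. By default, copy=True, which means the truncated DataFrame is a separate copy. If copy=False, the truncated DataFrame may be a view, and changes to it may affect the original DataFrame. Whether a view is actually returned depends on the pandas version and the data layout; the pandas documentation notes that the copy keyword is ignored when Copy-on-Write is enabled.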
Here’s an example demonstrating the use of the copy parameter:
import pandas as pd
import numpy as np
# Create a DataFrame with integer index
df = pd.DataFrame(np.random.randn(10, 4), index=range(10), columns=list('ABCD'))
print("Original DataFrame:\n", df)
# Truncate the DataFrame with copy=False
truncated_df = df.truncate(before=3, after=7, copy=False)
print("\nTruncated DataFrame (copy=False):\n", truncated_df)
# Modify the truncated DataFrame
truncated_df.iloc[0, 0] = 999
print("\nModified Truncated DataFrame (copy=False):\n", truncated_df)
print("\nOriginal DataFrame after modification (copy=False):\n", df)
# Truncate the DataFrame with copy=True
truncated_df_copy = df.truncate(before=3, after=7, copy=True)
print("\nTruncated DataFrame (copy=True):\n", truncated_df_copy)
# Modify the truncated DataFrame
truncated_df_copy.iloc[0, 0] = -999
print("\nModified Truncated DataFrame (copy=True):\n", truncated_df_copy)
print("\nOriginal DataFrame after modification (copy=True):\n", df)
Output:
Original DataFrame:
A B C D
0 0.265165 0.028710 1.038816 -0.350271
1 0.219786 -1.420820 -0.408030 -0.138795
2 -0.723772 -0.578556 0.859737 0.348882
3 -1.111425 0.602344 0.238018 1.038508
4 0.234110 0.417319 0.290739 -0.624643
5 -0.599482 -0.065902 0.338065 0.100595
6 0.252914 0.259424 0.142389 0.889366
7 0.078312 1.424189 0.601383 0.590601
8 1.185118 -1.003683 -0.344869 -0.149556
9 -0.133129 -0.272161 0.281761 -1.054102
Truncated DataFrame (copy=False):
A B C D
3 -1.111425 0.602344 0.238018 1.038508
4 0.234110 0.417319 0.290739 -0.624643
5 -0.599482 -0.065902 0.338065 0.100595
6 0.252914 0.259424 0.142389 0.889366
7 0.078312 1.424189 0.601383 0.590601
Modified Truncated DataFrame (copy=False):
A B C D
3 999.000000 0.602344 0.238018 1.038508
4 0.234110 0.417319 0.290739 -0.624643
5 -0.599482 -0.065902 0.338065 0.100595
6 0.252914 0.259424 0.142389 0.889366
7 0.078312 1.424189 0.601383 0.590601
Original DataFrame after modification (copy=False):
A B C D
0 0.265165 0.028710 1.038816 -0.350271
1 0.219786 -1.420820 -0.408030 -0.138795
2 -0.723772 -0.578556 0.859737 0.348882
3 999.000000 0.602344 0.238018 1.038508
4 0.234110 0.417319 0.290739 -0.624643
5 -0.599482 -0.065902 0.338065 0.100595
6 0.252914 0.259424 0.142389 0.889366
7 0.078312 1.424189 0.601383 0.590601
8 1.185118 -1.003683 -0.344869 -0.149556
9 -0.133129 -0.272161 0.281761 -1.054102
Truncated DataFrame (copy=True):
A B C D
3 999.000000 0.602344 0.238018 1.038508
4 0.234110 0.417319 0.290739 -0.624643
5 -0.599482 -0.065902 0.338065 0.100595
6 0.252914 0.259424 0.142389 0.889366
7 0.078312 1.424189 0.601383 0.590601
Modified Truncated DataFrame (copy=True):
A B C D
3 -999.000000 0.602344 0.238018 1.038508
4 0.234110 0.417319 0.290739 -0.624643
5 -0.599482 -0.065902 0.338065 0.100595
6 0.252914 0.259424 0.142389 0.889366
7 0.078312 1.424189 0.601383 0.590601
Original DataFrame after modification (copy=True):
A B C D
0 0.265165 0.028710 1.038816 -0.350271
1 0.219786 -1.420820 -0.408030 -0.138795
2 -0.723772 -0.578556 0.859737 0.348882
3 999.000000 0.602344 0.238018 1.038508
4 0.234110 0.417319 0.290739 -0.624643
5 -0.599482 -0.065902 0.338065 0.100595
6 0.252914 0.259424 0.142389 0.889366
7 0.078312 1.424189 0.601383 0.590601
8 1.185118 -1.003683 -0.344869 -0.149556
9 -0.133129 -0.272161 0.281761 -1.054102
In this example:
- With copy=False, modifying the truncated DataFrame also changes the original DataFrame.
- With copy=True, modifying the truncated DataFrame does not affect the original DataFrame, demonstrating that it is indeed a copy.
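The default behaviour can be checked without relying on random data. The sketch below leaves the copy argument unset on purpose, since how copy=False behaves differs between pandas versions; the frame demo and its values are illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative frame with predictable float values; row 3, column A holds 12.0
demo = pd.DataFrame(np.arange(40, dtype=float).reshape(10, 4), columns=list("ABCD"))
original_value = demo.loc[3, "A"]

# Default call: no copy argument is passed
subset = demo.truncate(before=3, after=7)
subset.iloc[0, 0] = 999.0

# The truncated frame was changed
print(subset.iloc[0, 0])
# The original frame still holds its old value
print(demo.loc[3, "A"] == original_value)
```

If an independent result is what you need, relying on the default (or calling .copy() explicitly on the result) avoids depending on view semantics.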
Important Notes
- If the before and after parameters are not specified, nothing is removed and the result contains all of the original data.
- The copy parameter is set to True by default, meaning that the truncated DataFrame is a copy of the original one. If set to False, the truncated DataFrame may be a view of the original one, in which case changes to the truncated DataFrame will affect the original DataFrame.
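One more point worth knowing is that truncate() expects the index along the chosen axis to be sorted and raises a ValueError otherwise. The sketch below, using a hypothetical frame named unsorted, shows the error and the usual workaround of calling sort_index() first.

```python
import pandas as pd

# Illustrative frame with an unsorted integer index
unsorted = pd.DataFrame({"value": [1, 2, 3, 4]}, index=[3, 1, 4, 2])

try:
    unsorted.truncate(before=2, after=3)
    raised = False
except ValueError as exc:
    # truncate() refuses to work on an unsorted index
    raised = True
    print("ValueError:", exc)

# Sorting the index first makes the label range well defined
result = unsorted.sort_index().truncate(before=2, after=3)
print(list(result.index))
print(list(result["value"]))
```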
Conclusion
The truncate() method is a powerful tool for slicing DataFrames by index. Whether working with date indices, integer indices, or column names, this method can help you focus on the subset of your data that matters most. By understanding and utilizing the parameters before, after, and axis, you can efficiently manage your DataFrame subsets for more effective data analysis.