Pandas DataFrame stack() Method – Explained with Examples

The stack() method in Pandas is a powerful tool for reshaping DataFrames. It pivots the columns of a DataFrame into its row index, producing a taller, narrower result. This makes it particularly useful for converting wide DataFrames into long format, which is often easier to work with in data analysis tasks.

In this blog, we will cover the following topics:

  1. Introduction to stack
  2. Basic usage of stack
  3. Using stack with multi-level columns
  4. Handling missing values with stack
  5. Practical examples

1. Introduction to stack

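The stack method pivots the columns of a DataFrame into the innermost level of the row index. It is the inverse of the unstack method, which pivots an index level into columns. It works on any DataFrame: with single-level columns the result is a Series with a MultiIndex, and with hierarchical (MultiIndex) columns the result is a DataFrame unless every column level is stacked.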
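The inverse relationship between stack and unstack can be checked with a small round trip. This is a minimal sketch: the frame `scores` and its labels are made up for illustration, and the labels are already sorted so the result does not depend on how a given pandas version orders the index.

```python
import pandas as pd

# Hypothetical numeric frame; labels are sorted so ordering is stable
scores = pd.DataFrame(
    {"B": [1, 3, 5], "C": [2, 4, 6]},
    index=["x", "y", "z"],
)

stacked = scores.stack()      # columns move into the inner index level
restored = stacked.unstack()  # inner index level moves back to columns

print(stacked)
print(restored)
```

With no missing values involved, `restored` should hold the same labels and values as `scores`.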

Syntax
Python
DataFrame.stack(level=-1, dropna=True)
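  • level: The column level(s) to stack, given as an integer position, a level name, or a list of either. Defaults to the innermost level (-1).
  • dropna: Whether to drop rows in the resulting DataFrame/Series whose values are all missing. Defaults to True.

This is the signature in older pandas releases. pandas 2.1 and later add sort and future_stack parameters, and with future_stack=True missing values are always kept and dropna must be left unspecified, so check the documentation for the version you have installed.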

2. Basic Usage of stack

Let’s start with a simple example to understand the basic usage of stack.

Python
import pandas as pd

# Creating a simple DataFrame
df = pd.DataFrame({
    'A': {0: 'a', 1: 'b', 2: 'c'},
    'B': {0: 1, 1: 3, 2: 5},
    'C': {0: 2, 1: 4, 2: 6}
})

# Applying stack method
stacked_df = df.stack()

print(stacked_df)

Output:

0  A    a
   B    1
   C    2
1  A    b
   B    3
   C    4
2  A    c
   B    5
   C    6
dtype: object

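In this example, the stack method has pivoted the columns A, B, and C into the index, creating a Series with a two-level MultiIndex: the original row label on the outside and the former column name on the inside. The dtype is object because the Series mixes the strings from column A with the integers from B and C.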
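Because the result is indexed by (row label, column name) pairs, values can be looked up with .loc. The sketch below rebuilds the same data with lists instead of nested dicts; the variable names `row_one` and `value` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["a", "b", "c"],
    "B": [1, 3, 5],
    "C": [2, 4, 6],
})
stacked = df.stack()

# The index has two levels: original row label, then former column name
print(stacked.index.nlevels)

# Selecting on the outer level returns every former column for that row
row_one = stacked.loc[1]

# A full (row, column) tuple returns a single value
value = stacked.loc[(1, "B")]
print(row_one)
print(value)
```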

3. Using stack with Multi-Level Columns

The stack method is especially useful when dealing with DataFrames that have MultiIndex columns.

Python
import pandas as pd

# Creating a DataFrame with MultiIndex columns
columns = pd.MultiIndex.from_tuples([('A', 'cat'), ('A', 'dog'), ('B', 'cat'), ('B', 'dog')])
df_multi = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=columns)
print(df_multi)

# Applying stack method
stacked_df_multi = df_multi.stack(level=0)

print(stacked_df_multi)

Output:

     A       B    
  cat dog cat dog
0   1   2   3   4
1   5   6   7   8

stacked_df_multi:
     cat  dog
0 A    1    2
  B    3    4
1 A    5    6
  B    7    8

In this example, level=0 selects the outermost column level (A and B) and pivots it into the index, while the inner level (cat and dog) stays as the columns. The result has twice as many rows and half as many columns.
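The level argument also accepts a level name or a list of levels. The following sketch uses the same data, with level names "group" and "animal" added purely for illustration (they are not part of the example above), and compares the default, a named level, and stacking every level at once.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("A", "cat"), ("A", "dog"), ("B", "cat"), ("B", "dog")],
    names=["group", "animal"],  # hypothetical level names
)
df_named = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=columns)

# Default (level=-1): the innermost level, "animal", moves into the index
inner = df_named.stack()

# A level name works like its position; "group" is level 0
outer = df_named.stack(level="group")

# Stacking every column level leaves no columns, so the result is a Series
both = df_named.stack(level=["group", "animal"])

print(inner)
print(outer)
print(both)
```

Referring to levels by name tends to be easier to read than positions once a frame has more than two column levels.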

4. Handling Missing Values with stack

By default, the stack method drops entries that are missing (for a DataFrame result, rows in which every value is missing). This behavior can be controlled using the dropna parameter.

Python
import pandas as pd

# Creating a DataFrame with missing values
df_missing = pd.DataFrame({
    'A': {0: 'a', 1: None, 2: 'c'},
    'B': {0: 1, 1: 3, 2: None},
    'C': {0: 2, 1: 4, 2: 6}
})

# Applying stack method with dropna=False
stacked_df_missing = df_missing.stack(dropna=False)

print(stacked_df_missing)

Output:

0  A      a
   B      1
   C      2
1  A    NaN
   B      3
   C      4
2  A      c
   B    NaN
   C      6
dtype: object

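In this example, the stack method retains the entries with missing values because dropna is set to False. The exact rendering can vary with the pandas version: a missing string may print as None rather than NaN, and column B may print as 1.0 and 3.0 because a numeric column holding a missing value is stored as float.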
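If you would rather not depend on how your installed pandas version treats missing values during stacking, one option is to drop them explicitly afterwards with Series.dropna(). This is a sketch of that approach on a made-up numeric frame (`df_gaps` is a hypothetical name), not the only way to do it.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two missing values
df_gaps = pd.DataFrame({
    "B": [1.0, np.nan],
    "C": [np.nan, 4.0],
    "D": [5.0, 6.0],
})

stacked = df_gaps.stack()

# Explicitly remove missing entries, whatever stack() did with them
cleaned = stacked.dropna()

print(cleaned)
```

After the explicit dropna() call, `cleaned` contains only the four non-missing values.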

5. Practical Examples

Converting a Wide DataFrame to Long Format
Python
import pandas as pd

# Creating a wide DataFrame
df_wide = pd.DataFrame({
    'ID': [1, 2, 3],
    'Math': [90, 80, 85],
    'Science': [85, 80, 95],
    'English': [78, 88, 92]
})

# Setting 'ID' as the index
df_wide.set_index('ID', inplace=True)

# Applying stack method to convert to long format
df_long = df_wide.stack().reset_index()
df_long.columns = ['ID', 'Subject', 'Score']

print(df_long)

Output:

   ID  Subject  Score
0   1     Math     90
1   1  Science     85
2   1  English     78
3   2     Math     80
4   2  Science     80
5   2  English     88
6   3     Math     85
7   3  Science     95
8   3  English     92

In this practical example, we converted a wide DataFrame into a long format using the stack method, which is useful for various data analysis tasks and visualizations.
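The same conversion can also be written as a single chain that names the index levels before resetting them, which avoids assigning to .columns afterwards. This is an alternative sketch using Series.rename_axis and Series.reset_index(name=...), with the same data as above.

```python
import pandas as pd

df_wide = pd.DataFrame(
    {
        "Math": [90, 80, 85],
        "Science": [85, 80, 95],
        "English": [78, 88, 92],
    },
    index=pd.Index([1, 2, 3], name="ID"),
)

df_long = (
    df_wide.stack()                     # Series indexed by (ID, subject)
    .rename_axis(["ID", "Subject"])     # name both index levels
    .reset_index(name="Score")          # levels become columns, values become "Score"
)

print(df_long)
```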

Conclusion

The stack method in Pandas is a versatile tool for reshaping DataFrames. It pivots columns into the index, turning wide data into a long format that is often easier to filter, group, and plot. Whether you are dealing with simple DataFrames or those with hierarchical columns, the stack method can significantly enhance your data manipulation capabilities.

By mastering the stack method, you can efficiently transform and analyze your data, making your data science workflow more effective and streamlined.
