Pandas append() method – Explained with examples

In the world of data manipulation and analysis, Pandas is a powerful and versatile library in Python. One of the many operations you often need to perform is appending data to a DataFrame. Pandas provides a handy method called append() for this purpose. In this blog post, we will delve into the intricacies of the pandas.DataFrame.append() method, exploring its usage, syntax, and practical examples.

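DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the append() examples in this post run only on pandas versions earlier than 2.0. We still cover the method here for educational purposes: understanding how it behaves is useful when reading, maintaining, or migrating legacy code.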

Introduction to pandas append() method

The append() method in Pandas allows you to concatenate two or more DataFrames or Series along rows. This means you can add new rows to an existing DataFrame easily.
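
Since append() is unavailable in pandas 2.0 and later, here is a minimal sketch of the same row-wise stacking using pd.concat(). The frame names left and right and their contents are illustrative assumptions.

```python
import pandas as pd

# Two small frames with the same columns (illustrative data).
left = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
right = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# pd.concat() stacks the frames along rows (axis=0 is the default),
# which is what left.append(right) did on pandas versions before 2.0.
stacked = pd.concat([left, right])

print(stacked)
```

Without ignore_index=True the original row labels are kept, so the label sequence 0, 1 appears twice in the result.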

Syntax
Python
DataFrame.append(other, ignore_index=False, verify_integrity=False, sort=False)

  • other: DataFrame or Series/dict-like object, or list of these
    • The data to append. It can be another DataFrame, Series, dictionary-like object, or a list containing these objects.
  • ignore_index: bool, default False
    • If True, the resulting DataFrame will have a continuous integer index. If False, the original indices are retained.
  • verify_integrity: bool, default False
    • If True, it checks the resulting index for duplicates and raises a ValueError if any are found.
  • sort: bool, default False
    • If True, it sorts the columns of the result when the columns of the appended objects are not already aligned.
Returns
  • DataFrame: A new DataFrame consisting of the original DataFrame with the other appended to it.
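
The parameters above map directly onto pd.concat(), which accepts the same ignore_index, verify_integrity, and sort keywords. The sketch below wraps that mapping in a helper named append_compat; the helper is hypothetical (it is not part of pandas) and only handles a DataFrame or a list of DataFrames.

```python
import pandas as pd

def append_compat(df, other, ignore_index=False, verify_integrity=False, sort=False):
    """Hypothetical helper: mimic DataFrame.append() for DataFrame inputs via pd.concat()."""
    # Accept either a single DataFrame or a list/tuple of DataFrames.
    others = list(other) if isinstance(other, (list, tuple)) else [other]
    return pd.concat(
        [df] + others,
        ignore_index=ignore_index,
        verify_integrity=verify_integrity,
        sort=sort,
    )

df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'A': [3, 4]})

kept = append_compat(df1, df2)                            # original labels retained
renumbered = append_compat(df1, df2, ignore_index=True)   # fresh 0..n-1 index

print(kept.index.tolist())
print(renumbered.index.tolist())
print(len(df1))  # df1 itself is unchanged: a new DataFrame is returned
```

Like append(), the helper never modifies its inputs, so the result has to be assigned to a variable.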

Practical Examples

Example 1: Appending a DataFrame to another DataFrame
Python
import pandas as pd

# Create two DataFrames
df1 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})

df2 = pd.DataFrame({
    'A': [7, 8, 9],
    'B': [10, 11, 12]
})

# Append df2 to df1
result = df1.append(df2, ignore_index=True)
print(result)

Output:

Bash
   A   B
0  1   4
1  2   5
2  3   6
3  7  10
4  8  11
5  9  12

In this example, we created two DataFrames df1 and df2, and then used the append() method to combine them. By setting ignore_index=True, we ensure the index in the resulting DataFrame is continuous.

Example 2: Appending a Series to a DataFrame
Python
# Create a DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})

# Create a Series
s = pd.Series([7, 8], index=['A', 'B'])

# Append Series to DataFrame
result = df.append(s, ignore_index=True)
print(result)

Output:

Bash
   A  B
0  1  4
1  2  5
2  3  6
3  7  8

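Here, a Series s is appended to the DataFrame df. Notice that the Series is appended as a new row: its index labels (A and B) are matched against the DataFrame's columns. An unnamed Series can only be appended with ignore_index=True; otherwise append() raises a TypeError.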
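
On pandas 2.0 and later, one way to get the same result is to turn the Series into a one-row DataFrame and pass it to pd.concat(). This is a sketch of one possible approach, not the only one; the variable name row is an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
s = pd.Series([7, 8], index=['A', 'B'])

# to_frame() makes a one-column frame; .T transposes it into a single row
# whose columns are the Series labels 'A' and 'B'.
row = s.to_frame().T

# ignore_index=True renumbers the rows 0..3, as in the append() example.
result = pd.concat([df, row], ignore_index=True)
print(result)
```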

Example 3: Appending with Different Columns
Python
# Create two DataFrames with different columns
df1 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})

df2 = pd.DataFrame({
    'A': [7, 8, 9],
    'C': [10, 11, 12]
})

# Append df2 to df1
result = df1.append(df2, ignore_index=True, sort=False)
print(result)

Output:

Bash
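   A    B     C
0  1  4.0   NaN
1  2  5.0   NaN
2  3  6.0   NaN
3  7  NaN  10.0
4  8  NaN  11.0
5  9  NaN  12.0

In this example, df1 and df2 have different columns. The resulting DataFrame contains all columns from both DataFrames, with NaN (Not a Number) filling the positions where a column is missing from one of the inputs. Because NaN is a floating-point value, columns B and C are converted to float, while column A, which is present in both inputs, keeps its integer values.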
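
The sort argument only affects the order of the columns in the result, not the rows. The sketch below illustrates this with pd.concat(), which takes the same argument; the deliberately unordered column names are illustrative assumptions.

```python
import pandas as pd

# Column order is deliberately not alphabetical (illustrative data).
df1 = pd.DataFrame({'B': [4, 5], 'A': [1, 2]})
df2 = pd.DataFrame({'C': [10, 11], 'A': [7, 8]})

# sort=False keeps columns in order of first appearance.
as_seen = pd.concat([df1, df2], ignore_index=True, sort=False)

# sort=True orders the combined columns alphabetically.
alphabetical = pd.concat([df1, df2], ignore_index=True, sort=True)

print(list(as_seen.columns))
print(list(alphabetical.columns))

# Count the gaps that were filled with NaN in each column.
print(alphabetical.isna().sum())
```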

Example 4: Verifying Integrity
Python
# Create two DataFrames with overlapping index
df1 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
}, index=[0, 1, 2])

df2 = pd.DataFrame({
    'A': [7, 8, 9],
    'B': [10, 11, 12]
}, index=[1, 2, 3])

# Append df2 to df1 with verify_integrity=True
try:
    result = df1.append(df2, verify_integrity=True)
except ValueError as e:
    print(e)

Output:

Bash
Indexes have overlapping values: [1, 2]

Setting verify_integrity=True ensures that no duplicate indices are present in the resulting DataFrame. In this case, since df1 and df2 have overlapping indices, a ValueError is raised.
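
Once the overlap is detected, it still has to be resolved. The following sketch, written with pd.concat(), shows two possible ways to do that; which one is appropriate depends on whether the row labels carry meaning in your data. The variable names are assumptions for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({'A': [7, 8, 9]}, index=[1, 2, 3])

# The same integrity check, expressed with pd.concat().
try:
    pd.concat([df1, df2], verify_integrity=True)
    overlap_detected = False
except ValueError:
    overlap_detected = True

# Option 1: discard the old labels and renumber the rows.
renumbered = pd.concat([df1, df2], ignore_index=True)

# Option 2: keep the labels, but drop rows whose label was already seen.
combined = pd.concat([df1, df2])
first_wins = combined[~combined.index.duplicated(keep='first')]

print(overlap_detected)
print(renumbered.index.tolist())
print(first_wins)
```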

Conclusion

The pandas.DataFrame.append() method offered a convenient way to add rows to a DataFrame. It supported appending DataFrames, Series, and dictionary-like objects, aligning mismatched columns, and checking index integrity. Note that append() is not an in-place operation; it returns a new DataFrame, so calling it repeatedly in a loop copies the accumulated data every time. Because the method was removed in pandas 2.0, use pd.concat() in new code, and for large datasets collect the pieces in a list and concatenate them once.
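
The performance advice above can be sketched as follows. The batches list stands in for a hypothetical stream of small result frames; the data is made up for illustration.

```python
import pandas as pd

# Hypothetical stream of small result batches.
batches = [pd.DataFrame({'step': [i], 'value': [i * i]}) for i in range(5)]

# Pattern to avoid: growing a DataFrame inside a loop copies all
# previously accumulated rows on every iteration.
grown = batches[0]
for batch in batches[1:]:
    grown = pd.concat([grown, batch], ignore_index=True)

# Usually preferable: collect the pieces in a list and concatenate once.
built_once = pd.concat(batches, ignore_index=True)

print(built_once)
```

Both approaches produce the same DataFrame, but the single pd.concat() call copies the data only once.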

Understanding how append() behaves makes it easier to read legacy Pandas code and to migrate it to pd.concat(). Happy coding!
