Pandas DataFrame get_value() – Explained with Examples

The get_value() method in Pandas was a convenient way to quickly access a single value from a DataFrame.

Note: The get_value() method was deprecated in Pandas version 0.21.0 and removed in version 1.0.0. In newer versions, the private _get_value() method offers similar behavior, but the recommended way to access scalar values is the at[] and iat[] accessors.

Despite this, understanding how get_value() worked can still be useful for maintaining older codebases. This blog will explain the get_value() method and demonstrate its modern alternatives.

What is get_value()?

The get_value() method allowed you to retrieve a single value from a DataFrame by specifying the row label and the column label. The syntax was:

Python
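DataFrame.get_value(index, col, takeable=False)
  • index: The row label.
  • col: The column label.
  • takeable: If True, index and col are interpreted as integer positions instead of labels (default False).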
Example of get_value()

Let’s create a sample DataFrame that we’ll use in our examples:

Python
import pandas as pd

# Creating a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

The output of the DataFrame is:

Bash
   A  B
0  1  4
1  2  5
2  3  6
Using get_value() (Deprecated)
Python
# Getting the value at row index 1 and column 'A' (runs only on Pandas versions before 1.0.0)
value = df.get_value(1, 'A')
print("The value at index 1, column 'A':", value)

Output:

Bash
The value at index 1, column 'A': 2

In the above example, the deprecated get_value() method retrieves the value at row label 1 and column ‘A’, which is 2.
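On Pandas 1.0.0 or later, the call above raises an AttributeError because the method no longer exists on DataFrame. The following is a minimal sketch of how legacy code might guard against that while it is being migrated; the try/except fallback is an illustrative pattern, not an official migration path.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

try:
    # Works only on Pandas versions before 1.0.0
    value = df.get_value(1, 'A')
except AttributeError:
    # get_value() is gone in newer versions, so fall back to at[]
    value = df.at[1, 'A']

print("The value at index 1, column 'A':", value)
```

In practice, replacing the call with at[] outright is simpler, since at[] also exists in the older versions.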

Using _get_value()

In newer versions, the _get_value() method can still be used to achieve similar functionality. The leading underscore marks it as private, so it is not recommended for regular use: it may change or be removed in future versions without notice. However, it can be useful as a stopgap when maintaining older code.

Python
# Using _get_value() to get the value at row index 1 and column 'A'
value = df._get_value(1, 'A')
print("The value at index 1, column 'A' using _get_value():", value)

Output:

Bash
The value at index 1, column 'A' using _get_value(): 2

Modern Alternatives

As get_value() is no longer available, you should use the at[] or iat[] accessors. They are optimized for scalar access and provide a cleaner syntax.

  • at[]: Access a single value for a row/column label pair.
  • iat[]: Access a single value for a row/column pair by integer position.
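The difference between the two accessors is easiest to see when row labels are not the default integers. The sketch below uses a hypothetical scores DataFrame (the names and numbers are made up for illustration) to show that at[] follows labels while iat[] follows positions, even after the rows are reordered.

```python
import pandas as pd

# Hypothetical data with string row labels
scores = pd.DataFrame(
    {'math': [90, 75, 60], 'art': [85, 95, 70]},
    index=['ann', 'bob', 'cy'],
)

by_label = scores.at['bob', 'art']   # row labelled 'bob', column labelled 'art'
by_position = scores.iat[1, 1]       # 2nd row, 2nd column (same cell here)

# After sorting, positions point at different rows but labels do not
sorted_scores = scores.sort_values('math')
first_row_math = sorted_scores.iat[0, 0]     # lowest math score, now in the first row
ann_math = sorted_scores.at['ann', 'math']   # still ann's score

print(by_label, by_position, first_row_math, ann_math)
```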
1. Using at[]

The at[] method is used for label-based scalar lookups. Here’s how you can use it:

Python
# Using at[] to get the value at row index 1 and column 'A'
value = df.at[1, 'A']
print("The value at index 1, column 'A' using at[]:", value)

Output:

Bash
The value at index 1, column 'A' using at[]: 2

Here, the at[] accessor retrieves the value at row label 1 and column ‘A’, which is again 2.
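Because at[] looks up by label, asking for a label that does not exist raises a KeyError. A minimal sketch of handling that case, assuming a fallback value of None is acceptable for your code:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

try:
    # Row label 5 does not exist (the labels are 0, 1 and 2)
    missing = df.at[5, 'A']
except KeyError:
    missing = None  # assumed fallback, chosen for illustration

print("Value for missing label:", missing)
```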

2. Using iat[]

The iat[] method is used for position-based scalar lookups. Here’s how you can use it:

Python
# Using iat[] to get the value at row index 1 (2nd row) and column 0 (1st column)
value = df.iat[1, 0]
print("The value at index 1, column 0 using iat[]:", value)

Output:

Bash
The value at index 1, column 0 using iat[]: 2

In the above example, the iat[] accessor retrieves the value at row position 1 (2nd row) and column position 0 (1st column), which is the same value, 2.
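Both accessors can also assign a single value, which covers the cases where older code used set_value() alongside get_value(). A short sketch using the same sample data:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Label-based assignment: row label 1, column 'A'
df.at[1, 'A'] = 20

# Position-based assignment: 3rd row, 2nd column
df.iat[2, 1] = 60

print(df)
```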


Performance Considerations

Both at[] and iat[] are optimized for getting and setting individual values, making them faster than general indexing methods like loc[] and iloc[] for scalar access. Here’s a quick comparison:

Python
import time

# Timing at[] (perf_counter() is a high-resolution clock intended for measuring short durations)
start = time.perf_counter()
for _ in range(100000):
    value = df.at[1, 'A']
end = time.perf_counter()
print("Time taken by at[]:", end - start)

# Timing iat[]
start = time.perf_counter()
for _ in range(100000):
    value = df.iat[1, 0]
end = time.perf_counter()
print("Time taken by iat[]:", end - start)

# Timing loc[]
start = time.perf_counter()
for _ in range(100000):
    value = df.loc[1, 'A']
end = time.perf_counter()
print("Time taken by loc[]:", end - start)

# Timing iloc[]
start = time.perf_counter()
for _ in range(100000):
    value = df.iloc[1, 0]
end = time.perf_counter()
print("Time taken by iloc[]:", end - start)

The output will vary slightly each time you run it, but here is an example of what you might see:

Bash
Time taken by at[]: 0.784569501876831
Time taken by iat[]: 0.7029356956481934
Time taken by loc[]: 1.1459879875183105
Time taken by iloc[]: 1.1324458122253418

The timing results show that at[] and iat[] are faster than loc[] and iloc[] for accessing individual values: over 100,000 iterations, at[] took about 0.78 seconds and iat[] about 0.70 seconds, compared with about 1.15 seconds for loc[] and 1.13 seconds for iloc[]. The absolute numbers depend on your machine and Pandas version, so treat the ratio as the meaningful part.
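For a less noisy measurement, the standard library's timeit module can repeat each lookup several times and report the best run. This is a sketch of that approach rather than a rigorous benchmark; the iteration counts are arbitrary and no specific timings are implied.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Each entry is a zero-argument callable performing one scalar lookup
lookups = {
    "at[]": lambda: df.at[1, 'A'],
    "iat[]": lambda: df.iat[1, 0],
    "loc[]": lambda: df.loc[1, 'A'],
    "iloc[]": lambda: df.iloc[1, 0],
}

results = {}
for name, lookup in lookups.items():
    # repeat() returns one total time per run; the minimum is the least disturbed run
    results[name] = min(timeit.repeat(lookup, number=10_000, repeat=3))
    print(f"Time taken by {name}: {results[name]:.4f} s for 10,000 lookups")
```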

Conclusion

While get_value() was a useful method in older versions of Pandas, it has been deprecated and removed in favor of the at[] and iat[] accessors. Using these modern alternatives ensures compatibility with newer versions of Pandas and gives you fast scalar access compared with loc[] and iloc[]. Understanding this change also makes it easier to maintain and migrate older codebases.
