Pandas isnull() and notnull() Methods – Explained with examples

Handling missing data is a critical task in data analysis and manipulation. Pandas, a powerful data manipulation library in Python, provides two essential methods to detect missing values: isnull() and notnull(). In this post, we will explore these methods, understand how they are used, and work through practical examples.

What are Missing Values?

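In a dataset, missing values can be represented in various ways, such as NaN (Not a Number), None, or NaT (Not a Time, for datetime data). Pandas offers the isnull() and notnull() methods to help you detect these values efficiently. Note that an empty string is an ordinary string value to Pandas, so isnull() does not flag it as missing; if your data uses empty strings as placeholders, you need to convert them to NaN first. Identifying and dealing with missing values is crucial for accurate data analysis.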
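
To make the distinction concrete, here is a minimal sketch showing which values isnull() flags. The sample values are made up for illustration and are not part of the dataset used in the rest of this post:

```python
import numpy as np
import pandas as pd

# Hypothetical sample values: a mix of real data and common "missing" markers
values = pd.Series([1.0, np.nan, None, "", "text", pd.NaT], dtype="object")

# NaN, None and NaT are flagged as missing; the empty string is not
flags = values.isnull()
print(flags)
```

Only the NaN, None, and NaT entries come back as True. The empty string and "text" are both reported as present.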

isnull() Method

The isnull() method is used to detect missing values in a DataFrame or Series. It returns a DataFrame or Series of the same shape as the input, where each element is a boolean indicating whether the corresponding element is missing (True if missing, False otherwise).

Syntax

Python
DataFrame.isnull()
Series.isnull()

Example

Let’s consider a simple DataFrame to demonstrate the isnull() method:

Python
import pandas as pd
import numpy as np

data = {
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
}

df = pd.DataFrame(data)
print(df)

Output:

Markdown
     A    B   C
0  1.0  5.0   9
1  2.0  NaN  10
2  NaN  NaN  11
3  4.0  8.0  12

Now, let’s use the isnull() method to identify missing values:

Python
null_df = df.isnull()
print(null_df)

Output:

Markdown
       A      B      C
0  False  False  False
1  False   True  False
2   True   True  False
3  False  False  False

As you can see, True indicates the presence of a missing value.
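
The same method works on a single Series. The sketch below rebuilds the values of column B as a standalone Series (the variable names are illustrative) and also checks that isna(), which Pandas provides as another name for the same method, gives an identical result:

```python
import numpy as np
import pandas as pd

# Same values as column 'B' in the example DataFrame
s = pd.Series([5, np.nan, np.nan, 8], name="B")

# Boolean Series: True where the value is missing
mask = s.isnull()
print(mask)

# isna() is the same method under a different name
same = mask.equals(s.isna())
print(same)
```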

notnull() Method

The notnull() method is the inverse of isnull(). It is used to detect non-missing values in a DataFrame or Series. It returns a DataFrame or Series of the same shape as the input, where each element is a boolean indicating whether the corresponding element is not missing (True if not missing, False otherwise).

Syntax

Python
DataFrame.notnull()
Series.notnull()

Example

Using the same DataFrame, let’s apply the notnull() method:

Python
notnull_df = df.notnull()
print(notnull_df)

Output:

Markdown
       A      B     C
0   True   True  True
1   True  False  True
2  False  False  True
3   True   True  True

Here, True indicates the presence of a non-missing value.
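
Since notnull() is the inverse of isnull(), negating one result with the ~ operator should reproduce the other. A quick check, rebuilding the example DataFrame so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})

# ~ flips every boolean, so ~isnull() should match notnull() exactly
inverse_matches = df.notnull().equals(~df.isnull())
print(inverse_matches)
```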

Practical Use Cases

Filtering Missing Values

You can use isnull() and notnull() to filter rows with missing or non-missing values. For example, to filter rows where column ‘A’ has missing values:

Python
missing_A = df[df['A'].isnull()]
print(missing_A)

Output:

Markdown
    A   B   C
2 NaN NaN  11

To filter rows where column ‘B’ has non-missing values:

Python
non_missing_B = df[df['B'].notnull()]
print(non_missing_B)

Output:

Markdown
     A    B   C
0  1.0  5.0   9
3  4.0  8.0  12
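
The filters above look at one column at a time. As a sketch of one possible extension, the boolean DataFrame can be reduced across columns with any(axis=1) or all(axis=1) to select rows based on every column at once. The DataFrame is rebuilt here so the snippet is self-contained, and the variable names are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})

# Rows where at least one column is missing
rows_with_any_missing = df[df.isnull().any(axis=1)]
print(rows_with_any_missing)

# Rows where every column has a value
complete_rows = df[df.notnull().all(axis=1)]
print(complete_rows)
```

With this data, rows 1 and 2 contain at least one missing value, while rows 0 and 3 are complete.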

Counting Missing Values

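Because True counts as 1 when summed, you can count the missing values in each column by chaining sum() after isnull(). Chaining it after notnull() counts the non-missing values instead: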

Python
# counting missing values using isnull()
missing_counts = df.isnull().sum()
print(missing_counts)

# counting non-missing values using notnull()
non_missing_counts = df.notnull().sum()
print(non_missing_counts)

Output:

Markdown
# counting missing values using isnull()
A    1
B    2
C    0
dtype: int64

# counting non-missing values using notnull()
A    3
B    2
C    4
dtype: int64
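
Two related summaries follow from the same idea. This sketch, which rebuilds the example DataFrame and uses illustrative variable names, computes the total number of missing cells and the fraction of missing values per column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})

# Summing the per-column counts gives the total for the whole DataFrame
total_missing = int(df.isnull().sum().sum())
print(total_missing)

# The mean of a boolean column is the fraction of True values
missing_fraction = df.isnull().mean()
print(missing_fraction)
```

For this DataFrame the total is 3, and the fractions are 0.25 for A, 0.5 for B, and 0.0 for C.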

Conclusion

The isnull() and notnull() methods in Pandas are simple but powerful tools for detecting missing and non-missing values in your data. Understanding how to use them effectively can help you clean and prepare your data for analysis. Whether you need to filter rows or count missing values, isnull() and notnull() provide a solid foundation for handling missing data in your DataFrame or Series.

Explore these methods with your datasets and see how they can simplify your data cleaning process.

Happy coding!
