Pandas drop rows vs filter with examples

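In the Python pandas library, the drop function removes rows or columns from a DataFrame by label, while the filter function selects rows or columns by label. Despite its name, filter does not subset rows based on a condition on their values; that is done with boolean indexing, for example df[df['A'] > 2].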

Pandas is a powerful library in Python for data manipulation and analysis. It provides functions and methods to perform a wide range of operations on DataFrames, including dropping rows or columns and filtering rows based on a condition.

In this post, we will explore the difference between the drop and filter functions in pandas, show how to subset rows based on a condition, and provide examples of each.

Dropping Rows or Columns with the drop Function

The drop function in pandas removes rows or columns from a DataFrame and returns the result as a new DataFrame. It takes a single label or a list of labels (either row labels or column labels) as the first argument, and the axis parameter specifies whether to drop rows (axis=0, the default) or columns (axis=1).

Here is an example of using the drop function to remove rows from a DataFrame:

import pandas as pd
 
# Create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]})
 
# Print the DataFrame
print(df)
 
#    A  B   C
# 0  1  5   9
# 1  2  6  10
# 2  3  7  11
# 3  4  8  12
 
# Drop rows with index 1 and 3
df = df.drop([1, 3])
 
# Print the updated DataFrame
print(df)
 
#    A  B   C
# 0  1  5   9
# 2  3  7  11
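
A few details of drop are easy to miss, so here is a minimal sketch of them using the same sample DataFrame. The variable names (trimmed, safe, to_remove, kept) and the missing label 10 are only illustrative. It shows that drop leaves the original DataFrame unchanged, how a missing label is handled, and how rows can be dropped by value by looking up their labels first.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]})

# drop returns a new DataFrame; df itself is left unchanged
trimmed = df.drop([1, 3])
print(len(df), len(trimmed))  # 4 2

# A label that is not in the index raises KeyError by default
try:
    df.drop([10])
except KeyError:
    print('label 10 not found')

# errors='ignore' skips labels that are missing instead of raising
safe = df.drop([1, 10], errors='ignore')
print(safe.index.tolist())  # [0, 2, 3]

# drop works on labels, so to drop rows by value, look up their labels first
to_remove = df.index[df['A'] > 2]
kept = df.drop(to_remove)
print(kept['A'].tolist())  # [1, 2]
```

Dropping the rows where A is greater than 2 gives the same result as keeping the rows where A is at most 2, so in practice a boolean mask is often the shorter way to write it.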
 

We can also use the drop function to remove columns from a DataFrame. In this case, we need to set the axis parameter to 1.

import pandas as pd
 
# Create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]})
 
# Print the DataFrame
print(df)
 
#    A  B   C
# 0  1  5   9
# 1  2  6  10
# 2  3  7  11
# 3  4  8  12
 
# Drop column B
df = df.drop(['B'], axis=1)
 
# Print the updated DataFrame
print(df)
 
#    A   C
# 0  1   9
# 1  2  10
# 2  3  11
# 3  4  12
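
As an alternative to the axis parameter, drop also accepts index and columns keyword arguments, which some readers find easier to follow. The sketch below reuses the sample DataFrame; the variable names (no_b, no_rows, small) are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]})

# columns= is equivalent to passing the labels with axis=1
no_b = df.drop(columns=['B'])
print(no_b.columns.tolist())  # ['A', 'C']

# index= is equivalent to passing the labels with axis=0
no_rows = df.drop(index=[1, 3])
print(no_rows.index.tolist())  # [0, 2]

# Both keywords can be combined to drop rows and columns in one call
small = df.drop(index=[0], columns=['C'])
print(small.shape)  # (3, 2)
```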
 

Filtering Rows: the filter Function and Boolean Indexing

Despite its name, the filter function in pandas does not subset rows based on a condition on their values. It selects rows or columns by label through its items, like, and regex parameters, and on a DataFrame it works on columns unless axis=0 is passed. To subset rows based on a condition, index the DataFrame with a boolean mask, which specifies which rows to keep (True) and which rows to drop (False).

Here is an example of using a boolean mask to subset rows based on a condition:

import pandas as pd
 
# Create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]})
 
# Print the DataFrame
print(df)
 
#    A  B   C
# 0  1  5   9
# 1  2  6  10
# 2  3  7  11
# 3  4  8  12
 
# Keep rows where column A is greater than 2.
# df.filter(df['A'] > 2) would not do this, because filter matches labels, not values.
df = df[df['A'] > 2]
 
# Print the updated DataFrame
print(df)
 
#    A  B   C
# 2  3  7  11
# 3  4  8  12