If you work with pandas dataframes, possibly very large ones, you will often need to keep only the rows where a column's value equals something specific. This article shows how to do that.
Assume we have the following dataframe:
import pandas as pd

employees_df = pd.DataFrame({
    'salary': [20000, 50000, 100000, 30000, 70000, 50000, 70000, 45000],
    'name': ['John', 'Linda', 'Sam', 'Albert', 'Francis', 'Tara', 'Susan', 'Wayne'],
    'age': [23, 35, 27, 44, 52, 25, 32, 35],
})
# Outputting the dataframe
employees_df
   salary     name  age
0   20000     John   23
1   50000    Linda   35
2  100000      Sam   27
3   30000   Albert   44
4   70000  Francis   52
5   50000     Tara   25
6   70000    Susan   32
7   45000    Wayne   35
Let’s say we want to find all the rows in employees_df where the salary is 70000.
We can do that with the following command:
employees_df[employees_df['salary'] == 70000]
# below is the output we would get
   salary     name  age
4   70000  Francis   52
6   70000    Susan   32
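It may help to see why this works. The comparison inside the brackets is evaluated row by row and produces a boolean Series, and indexing the dataframe with that Series keeps only the rows marked True. A minimal sketch, where `mask` and `high_earners` are just names chosen here for illustration:

```python
import pandas as pd

employees_df = pd.DataFrame({
    'salary': [20000, 50000, 100000, 30000, 70000, 50000, 70000, 45000],
    'name': ['John', 'Linda', 'Sam', 'Albert', 'Francis', 'Tara', 'Susan', 'Wayne'],
    'age': [23, 35, 27, 44, 52, 25, 32, 35],
})

# The comparison is element-wise and returns a boolean Series,
# one True/False value per row (illustrative variable name)
mask = employees_df['salary'] == 70000

# Indexing with the boolean Series keeps only the rows where it is True
high_earners = employees_df[mask]

print(mask.tolist())
print(high_earners)
```

Storing the mask in a variable first is optional, but it can make longer filters easier to read and reuse.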
We could do a variation of the above with:
employees_df[employees_df.salary == 70000]
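And we would have gotten the same output. Either dot notation or bracket notation works for filtering here. Keep in mind that dot notation only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method; bracket notation works for any column name.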
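Dot notation has limits worth knowing about. The sketch below uses a small hypothetical dataframe (not the employees data above) with one column name that contains a space and another that shares its name with the DataFrame.count method. In both cases bracket notation is the safe choice:

```python
import pandas as pd

# Hypothetical dataframe for illustration only: 'job title' contains a space,
# and 'count' has the same name as the DataFrame.count method
df = pd.DataFrame({
    'job title': ['Analyst', 'Manager', 'Analyst'],
    'count': [3, 1, 2],
})

# A name with a space cannot be written with dot notation at all,
# so bracket notation is required
analysts = df[df['job title'] == 'Analyst']

# df.count refers to the DataFrame.count method, not the 'count' column
count_is_method = callable(df.count)

# Bracket notation reaches the column regardless of the name clash
twos = df[df['count'] == 2]

print(analysts)
print(count_is_method)
print(twos)
```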
Now, let’s say that we want to find all the rows in employees_df where age is 35. We can find that with:
employees_df[employees_df['age'] == 35]
We’d get the following output:
   salary   name  age
1   50000  Linda   35
7   45000  Wayne   35
We could have also done
employees_df[employees_df.age == 35]
and gotten the same output.
Basic Pattern For Filtering A Dataframe For A Specific Column Being Equal To Some Value
Let’s call a general dataframe df. The basic pattern for filtering df for some column’s value being equal to something would be:
df[df['column_name'] == some_value]
Or
df[df.column_name == some_value]
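If you use this pattern often, it could be wrapped in a small function. The sketch below is one possible way to do that; `filter_equal` is a hypothetical helper name, not part of pandas, and it uses bracket notation so that it works with any column name:

```python
import pandas as pd

def filter_equal(df, column_name, some_value):
    """Return the rows of df where column_name equals some_value.

    Hypothetical helper for illustration; not a pandas function.
    """
    return df[df[column_name] == some_value]

employees_df = pd.DataFrame({
    'salary': [20000, 50000, 100000, 30000, 70000, 50000, 70000, 45000],
    'name': ['John', 'Linda', 'Sam', 'Albert', 'Francis', 'Tara', 'Susan', 'Wayne'],
    'age': [23, 35, 27, 44, 52, 25, 32, 35],
})

# Same filter as the age example earlier, expressed through the helper
aged_35 = filter_equal(employees_df, 'age', 35)
print(aged_35)
```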