[concept]Grouping & Combining
GroupBy Basics
# theory
split-apply-combine
GroupBy works in three steps:
- Split the data into groups based on a key
- Apply a function to each group
- Combine the results
df.groupby("category")["price"].mean()
# Split by category
# Apply mean() to price column
# Combine into result Series
groupby basics
# Single aggregation
df.groupby("category")["amount"].sum()
df.groupby("category")["amount"].mean()
df.groupby("category")["amount"].count()
df.groupby("category")["amount"].min()
df.groupby("category")["amount"].max()
# Multiple columns
df.groupby("category")[["amount", "quantity"]].sum()
agg() for multi-aggregations
df.groupby("category")["amount"].agg(["sum", "mean", "count"])
This gives you sum, mean, and count for each category in one table.
reset_index after groupby
By default, the group key becomes the index. To get it back as a column:
result = df.groupby("category")["amount"].sum().reset_index()
# Now 'category' is a regular column again
See also
This is the pandas equivalent of SQL's GROUP BY. Same idea, different syntax. Cross-reference on damato-sql at /learn/data-analysis/group-by.
# examples [3]
# example 01 · basic GroupBy
Group and aggregate data
1
2
3
4
5
6
7
8
9
🐍
# example 02 · multiple aggregations
Get several stats at once
1
2
3
4
🐍
# example 03 · reset index
Turn grouped result back into a regular DataFrame
1
2
3
4
5
🐍
# challenges [2]
# challenge 01/02todo
Calculate the total revenue (price * quantity) for sales, then group by category and sum. Print the result.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
🐍
# challenge 02/02todo
Group students by grade and find the max score in each grade. Print the result.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
🐍
# project
# project-challenge
thread: Survey Insights Report · reward: 50 xp
# brief
Your recruiter report needs a salary benchmark section. Calculate the average salary for each country to help set competitive compensation packages for international hiring.
# task
Average Salary by Country
# your code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
🐍