python-mastery

# theory

pivot_table

If you've used pivot tables in Excel, this is the same idea but way more powerful.

pd.pivot_table(
    df,
    values="sales",        # what to aggregate
    index="region",        # rows
    columns="product",     # columns
    aggfunc="sum"          # how to aggregate
)

Multiple aggregations:

pd.pivot_table(
    df,
    values="sales",
    index="region",
    aggfunc=["sum", "mean", "count"]
)

groupby + unstack

This is another way to get pivot-style output. Sometimes it's cleaner.

# Group then unstack
df.groupby(["region", "product"])["sales"].sum().unstack()

The result looks exactly like a pivot table. I actually prefer this approach sometimes because it's more explicit about what's happening.

melt (wide to long)

Sometimes data comes in wide format and you need it long. melt fixes that.

# Wide format:
#   name  Jan  Feb  Mar
#   Alice  100  120  130

# Convert to long:
pd.melt(df, id_vars=["name"], var_name="month", value_name="sales")

# Result:
#   name  month  sales
#   Alice   Jan    100
#   Alice   Feb    120
#   Alice   Mar    130

This is super useful when you need to plot data or do time series analysis.

stack / unstack

These are for reshaping multi-index data.

# unstack; move inner index to columns
grouped = df.groupby(["region", "year"])["sales"].sum()
grouped.unstack()  # years become columns

# stack; move columns to index (opposite of unstack)
pivoted.stack()

fill_value for missing combos

When pivoting, some combinations might not exist in your data. You get NaN.

pd.pivot_table(
    df,
    values="sales",
    index="region",
    columns="product",
    fill_value=0  # fill missing with 0 instead of NaN
)

margins (totals)

pd.pivot_table(
    df,
    values="sales",
    index="region",
    columns="product",
    aggfunc="sum",
    margins=True,      # add row/column totals
    margins_name="Total"
)

Pivot tables with margins are useful for reports. You get row and column totals in one call.

# examples [3]

# example 01 · basic pivot table

Summarize sales by region and product

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

🐍

Loading PythonSetting up pandas & numpy...

# example 02 · using melt to reshape

Convert wide data to long format; happens a lot with imported spreadsheets

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

🐍

Loading PythonSetting up pandas & numpy...

# example 03 · pivot with margins (totals)

Add row and column totals to your pivot table

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

🐍

Loading PythonSetting up pandas & numpy...

# challenges [2]

# challenge 01/02todo

Create a pivot table from sales showing the sum of 'price' for each 'category'. Print the result.

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

🐍

Loading PythonSetting up pandas & numpy...

# challenge 02/02todo

Use groupby and unstack to show total quantity sold by category. This gives the same result as a pivot table but using a different approach.

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

# Using the sales data
print("Sales data:")
print(sales)

# Simple pivot: total quantity by category
pivot = pd.pivot_table(
    sales,
    values="quantity",
    index="category",
    aggfunc="sum"
)
print("\nQuantity by category:")
print(pivot)


# Use groupby and unstack to show total quantity sold by category. This gives the same result as a pivot table but using a different approach.
# Your code here:

🐍

Loading PythonSetting up pandas & numpy...

Pivot Tables & Reshaping

pivot_table

groupby + unstack

melt (wide to long)

stack / unstack

fill_value for missing combos

margins (totals)