[concept] Pandas Fundamentals
DataFrames & Series
# theory
DataFrames
A DataFrame is a 2D table with rows and columns, like an Excel spreadsheet or SQL table. It's the core data structure in pandas.
import pandas as pd
# Create from dictionary
data = {
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, 30, 28],
    "city": ["NYC", "LA", "Chicago"],
}
df = pd.DataFrame(data)
Series
A Series is a single column, that is, a 1D array whose values carry labels (the index):
ages = df["age"] # This is a Series
print(type(ages)) # <class 'pandas.core.series.Series'>
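To make the "labels" part concrete, here is a small sketch that looks at a Series' index and values separately. The shortened DataFrame and the relabeling by `name` are illustrative assumptions, not part of the lesson's data.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, 30, 28],
})

ages = df["age"]
print(ages.index.tolist())   # default labels: [0, 1, 2]
print(ages.values.tolist())  # underlying values: [25, 30, 28]

# Illustrative relabeling: use the "name" column as the index instead
ages_by_name = df.set_index("name")["age"]
print(ages_by_name.loc["Bob"])  # 30
```

With a meaningful index, values can be looked up by label via `.loc` rather than by position.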
Exploring
df.head() # First 5 rows
df.head(10) # First 10 rows
df.tail(3) # Last 3 rows
df.info() # Column types, non-null counts
df.describe() # Summary statistics (count, mean, std, min, quartiles, max) for numeric columns
df.shape # (rows, columns) tuple
df.columns # Column names
df.dtypes # Data type of each column
Data types
| dtype | Meaning |
|---|---|
| int64 | Integer |
| float64 | Floating-point (decimal) number |
| object | Text strings (usually), or mixed Python objects |
| bool | True/False |
| datetime64 | Date/time |
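The table can be checked against a small DataFrame. This is a sketch with made-up columns (`score`, `active`, `joined` are assumptions added for illustration); the exact dtype labels printed can differ between pandas versions, especially for text and datetime columns.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],        # text
    "age": [25, 30, 28],                      # integers
    "score": [88.5, 92.0, 79.25],             # floating-point numbers
    "active": [True, False, True],            # booleans
    "joined": pd.to_datetime(                 # dates parsed from strings
        ["2021-01-15", "2022-06-01", "2023-03-20"]
    ),
})

# One dtype per column
print(df.dtypes)
```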
# examples [3]
# example 01 · creating a DataFrame
Build a DataFrame from a dictionary
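The original editor contents for this example are not shown, so the following is a minimal sketch that reuses the dictionary from the theory section: each key becomes a column name and each list supplies that column's values.

```python
import pandas as pd

# Each key is a column name; each list holds one value per row
data = {
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, 30, 28],
    "city": ["NYC", "LA", "Chicago"],
}

df = pd.DataFrame(data)
print(df)
print(df.shape)  # (3, 3): three rows, three columns
```

All lists must have the same length, since every row needs a value in every column.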
# example 02 · DataFrame vs Series
Understanding the difference between 1D and 2D structures
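One way to see the 1D versus 2D difference is to select the same column in two ways. This is a sketch on assumed sample data: single brackets return a Series, while a list of column names returns a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, 30, 28],
})

age_series = df["age"]    # single brackets -> Series (1D)
age_frame = df[["age"]]   # list of names   -> DataFrame (2D)

print(type(age_series).__name__, age_series.shape)  # Series (3,)
print(type(age_frame).__name__, age_frame.shape)    # DataFrame (3, 1)
print(age_series.ndim, age_frame.ndim)              # 1 2
```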
# example 03 · describe statistics
Get quick statistics for numeric columns
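A possible version of this example, using assumed sample data with one text column and two numeric columns. By default `describe()` summarizes only the numeric columns when any are present, and the result is itself a DataFrame that can be indexed.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": [25, 30, 28],
    "score": [88.5, 92.0, 79.5],
})

stats = df.describe()
print(stats)  # rows: count, mean, std, min, 25%, 50%, 75%, max

# Pull a single statistic out of the summary table
print(stats.loc["mean", "age"])
```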
# challenges [2]
# challenge 01/02 · todo
Print the shape of the 'sales' DataFrame and then print all column names.
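The challenge's starter data is not shown here, so this sketch uses a made-up `sales` table as a stand-in; only the last two lines are the part the task asks for.

```python
import pandas as pd

# Hypothetical stand-in for the challenge's `sales` DataFrame
sales = pd.DataFrame({
    "product": ["Widget", "Gadget", "Gizmo", "Doohickey"],
    "units": [10, 4, 7, 2],
    "price": [2.5, 10.0, 4.25, 8.0],
})

print(sales.shape)             # (4, 3)
print(sales.columns.tolist())  # ['product', 'units', 'price']
```

Note that `shape` and `columns` are attributes, not methods, so they take no parentheses.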
# challenge 02/02 · todo
Print the last 3 rows of the students DataFrame using tail().
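As with the previous challenge, the real `students` data is not shown, so the rows below are invented for illustration. The sketch shows one way `tail()` could be applied.

```python
import pandas as pd

# Hypothetical stand-in for the challenge's `students` DataFrame
students = pd.DataFrame({
    "name": ["Ana", "Ben", "Chen", "Dara", "Eli"],
    "grade": [91, 78, 85, 88, 95],
})

last_three = students.tail(3)  # rows keep their original index labels
print(last_three)
```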
# project
# project-challenge
thread: SF Permits Analysis · reward: 50 xp
# brief
You just received the SF building permits dataset. Before any analysis, explore its structure to understand what you're working with.
# task
Explore Permits Data
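A first structural pass might look like the sketch below. The helper `summarize_structure` is a hypothetical name, and the three-row `permits` frame is a made-up stand-in, since the real dataset's file and columns are not given here.

```python
import pandas as pd


def summarize_structure(df: pd.DataFrame) -> dict:
    """Collect basic structural facts about a DataFrame (hypothetical helper)."""
    return {
        "shape": df.shape,
        "columns": df.columns.tolist(),
        "dtypes": df.dtypes.astype(str).to_dict(),
        "null_counts": df.isna().sum().to_dict(),
    }


# Made-up stand-in rows; the real permits columns may differ entirely
permits = pd.DataFrame({
    "permit_id": [101, 102, 103],
    "status": ["issued", "filed", None],
    "estimated_cost": [15000.0, None, 4200.0],
})

summary = summarize_structure(permits)
for key, value in summary.items():
    print(key, value)
```

Once the real data is loaded into a DataFrame, the same helper could be called on it alongside `head()` and `info()`.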
# your code