[concept] String & File Ops
File I/O
# theory
reading
# Read entire file
with open("file.txt", "r") as f:
    content = f.read()
# Read lines into list
with open("file.txt", "r") as f:
    lines = f.readlines()
# Read line by line (memory efficient)
with open("file.txt", "r") as f:
    for line in f:
        print(line.strip())
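To see how the three reading styles differ, here is a small self-contained sketch. It first writes a throwaway file into a temporary directory (the name sample.txt and its contents are made up for illustration), then reads it back each way.

```python
import tempfile
from pathlib import Path

# Hypothetical sample file, created in a temporary directory so the demo is self-contained
tmp_dir = tempfile.TemporaryDirectory()
path = Path(tmp_dir.name) / "sample.txt"
path.write_text("alpha\nbeta\ngamma\n", encoding="utf-8")

# read() returns the whole file as a single string
with open(path, "r", encoding="utf-8") as f:
    content = f.read()

# readlines() returns a list of lines, each keeping its trailing newline
with open(path, "r", encoding="utf-8") as f:
    lines = f.readlines()

# Iterating over the file object yields one line at a time
stripped = []
with open(path, "r", encoding="utf-8") as f:
    for line in f:
        stripped.append(line.strip())

print(repr(content))
print(lines)
print(stripped)

tmp_dir.cleanup()
```

Note that readlines() keeps the newline characters, which is why the line-by-line loop calls strip() before using each line.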
writing
# Write (overwrites existing)
with open("output.txt", "w") as f:
    f.write("Hello, World!\n")
# Append to existing
with open("output.txt", "a") as f:
    f.write("Another line\n")
# Write multiple lines
lines = ["Line 1", "Line 2", "Line 3"]
with open("output.txt", "w") as f:
    f.writelines(line + "\n" for line in lines)
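The same write calls also work on a file-like object. As a minimal sketch, assuming an in-memory io.StringIO buffer stands in for a real file, the example below writes to the buffer and then reads back everything that was written.

```python
import io

# io.StringIO is an in-memory text buffer that behaves like an open text file
buffer = io.StringIO()
buffer.write("Hello, World!\n")
buffer.writelines(line + "\n" for line in ["Line 1", "Line 2"])

# getvalue() returns everything written to the buffer so far
text = buffer.getvalue()
print(text)
```

This is handy in environments without a writable disk, and in tests, because nothing is left behind on the filesystem.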
the with statement
Always use with so the file is closed as soon as the block exits, even if an exception is raised inside it:
# Good - file automatically closes
with open("file.txt") as f:
    data = f.read()
# Bad - you might forget to close
f = open("file.txt")
data = f.read()
f.close()  # Easy to forget!
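The closing behavior can be checked directly through the file object's closed attribute. The sketch below uses a temporary file with made-up contents and a simulated error to show that the file is closed both after a normal exit and after an exception.

```python
import tempfile
from pathlib import Path

# Hypothetical file in a temporary directory, just for the demonstration
tmp_dir = tempfile.TemporaryDirectory()
path = Path(tmp_dir.name) / "file.txt"
path.write_text("some data\n", encoding="utf-8")

with open(path, encoding="utf-8") as f:
    data = f.read()
    open_inside = not f.closed   # still open inside the block
closed_after_with = f.closed     # closed once the block exits

# The file is closed even when the block raises
try:
    with open(path, encoding="utf-8") as g:
        raise ValueError("simulated failure")
except ValueError:
    closed_after_error = g.closed

print(open_inside, closed_after_with, closed_after_error)

tmp_dir.cleanup()
```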
file modes
| Mode | Meaning |
|---|---|
| "r" | Read (default) |
| "w" | Write (overwrites) |
| "a" | Append |
| "x" | Create (fails if exists) |
| "b" | Binary mode (combined with another mode, e.g. "rb") |
| "r+" | Read and write (file must already exist) |
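A short sketch of how several of these modes behave in practice. The file name modes.txt and its contents are made up, and everything happens inside a temporary directory.

```python
import tempfile
from pathlib import Path

tmp_dir = tempfile.TemporaryDirectory()
path = Path(tmp_dir.name) / "modes.txt"  # hypothetical file name

# "x" creates the file and fails if it already exists
with open(path, "x", encoding="utf-8") as f:
    f.write("first\n")

try:
    open(path, "x")
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

# "a" keeps the existing content and adds to the end
with open(path, "a", encoding="utf-8") as f:
    f.write("second\n")
after_append = path.read_text(encoding="utf-8")

# "w" discards the existing content before writing
with open(path, "w", encoding="utf-8") as f:
    f.write("replaced\n")
after_write = path.read_text(encoding="utf-8")

# Adding "b" returns raw bytes instead of decoded text
with open(path, "rb") as f:
    raw = f.read()

print(exclusive_failed, repr(after_append), repr(after_write), type(raw))

tmp_dir.cleanup()
```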
the csv module
import csv
# Read CSV (newline="" lets the csv module handle line endings itself)
with open("data.csv", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)  # row is a list of strings
# Read as dictionaries (keys come from the header row)
with open("data.csv", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row["column_name"])
# Write CSV
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Name", "Age"])
    writer.writerow(["Alice", 25])
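The dictionary-based reader has a writing counterpart, csv.DictWriter. The sketch below round-trips a couple of illustrative rows (the second row is made up) through an in-memory buffer instead of a file on disk, and shows that every value comes back as a string.

```python
import csv
import io

# Illustrative rows; the keys must match the fieldnames given to DictWriter
rows = [{"Name": "Alice", "Age": 25}, {"Name": "Bob", "Age": 31}]

# newline="" plays the same role here as in open(..., newline="")
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["Name", "Age"])
writer.writeheader()
writer.writerows(rows)

# Rewind the buffer and read the same data back as dictionaries
buffer.seek(0)
parsed = list(csv.DictReader(buffer))
print(parsed)
```

The csv module does no type conversion, so the ages are read back as "25" and "31". Convert them with int() when numbers are needed.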
loading real-size data
Code running in the browser can't read a local path such as /Users/you/data.csv. It can, however, pyfetch a URL and pass the downloaded text to pd.read_csv or csv.DictReader through io.StringIO. That's how the examples below move from 8-row toy CSVs to real datasets with hundreds to thousands of rows.
import io
import pandas as pd
from pyodide.http import pyfetch
URL = "https://raw.githubusercontent.com/plotly/datasets/master/diabetes.csv"
resp = await pyfetch(URL)
df = pd.read_csv(io.StringIO(await resp.string()))
print(df.shape) # (768, 9)
Any raw.githubusercontent.com URL is CORS-friendly. Plotly and Vega both maintain large dataset repos that work without auth.
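The same fetched text can go to csv.DictReader instead of pandas. As a minimal sketch, the hypothetical helper parse_csv_text below takes CSV text and returns a list of dictionaries. A made-up two-row string stands in for the result of `await resp.string()`, so the snippet also runs outside the browser.

```python
import csv
import io


def parse_csv_text(text):
    """Parse CSV text (for example, a fetched response body) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Made-up stand-in for `await resp.string()`; not taken from any real dataset
sample_text = "region,revenue\nNorth,100\nSouth,250\n"
records = parse_csv_text(sample_text)
print(len(records), records[0])
```

In Pyodide, the call would look like `parse_csv_text(await resp.string())`, with resp coming from pyfetch as shown above.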
# examples [4]
Process each line individually
Write text to a file-like object
Read and parse CSV data
Fetch a 768-row public dataset and load it straight into pandas. It is the same code you'd write against a local file; just swap open() for pyfetch + StringIO.
# challenges [2]
# project
# project-challenge
thread: Sales Performance Dashboard · reward: 50 xp
# brief
Management wants a text summary of sales by region. Generate a report showing total revenue per region and write it to a file-like object for export.
# task