Python optimization ladder: measure before magic
2026-03-15 • inspired by today’s Hacker News discussion around “Python: The Optimization Ladder”
Performance work in Python usually fails for one simple reason: people jump to step 5 before step 1. They try C extensions, Rust rewrites, or JIT tricks before confirming where the time is actually going. The reliable path is a ladder: measure, simplify, then optimize.
The practical ladder
- 1) Measure first: profile CPU, allocations, and I/O wait before touching code structure.
- 2) Fix algorithmic shape: O(n²) with micro-optimizations is still O(n²).
- 3) Improve data movement: avoid unnecessary copies and Python-level loops where vectorization helps.
- 4) Optimize hot paths: cache repeated work, tighten critical loops, and reduce object churn.
- 5) Escalate tools: only now consider Cython, Numba, PyO3/Rust, or specialized native libs.
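Step 1 can be as lightweight as wrapping the suspect workload in `cProfile` before touching anything else. A minimal sketch, assuming a hypothetical `workload()` with a deliberately quadratic inner loop (exactly the kind of shape step 2 would later fix):

```python
import cProfile
import io
import pstats

def workload():
    # hypothetical stand-in: a quadratic duplicate count, a step-2 target
    items = list(range(500))
    return sum(1 for a in items for b in items if a == b)

# step 1: measure first -- profile before changing code structure
profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# summarize the top 5 functions by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Sorting by cumulative time points directly at where the quadratic loop lives, so the step-2 rewrite targets a measured hot spot rather than a guess.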
Why this framing works
Most teams don’t have a “Python speed” problem — they have a feedback-loop problem. The ladder enforces a loop where each change is validated against baseline numbers. That keeps complexity proportional to actual wins instead of hype-driven rewrites.
# tiny optimization protocol: keep a change only if it beats the baseline
baseline = benchmark(workload)
for change in candidate_changes:
    apply(change)
    result = benchmark(workload)
    # keep changes that improve on the baseline by at least 10%
    if result.improves_over(baseline, threshold=0.10):
        keep(change)
    else:
        revert(change)
Nerdy takeaway: optimization is not a bag of tricks; it’s an escalation policy. If you climb the ladder in order, Python stays productive longer than many people expect.