Segment Tree — Senior Level¶
Audience: Engineers selecting data structures for production systems, planning capacity for analytics services, or comparing range-query primitives across alternatives. Prerequisites:
junior.md,middle.md.
This document is about engineering: where segment trees actually show up in deployed code, when to pick them over alternatives, how they compare with B-tree augmentation and Fenwick trees in real workloads, and why the iterative form is the right default in competitive programming and most internal systems.
Table of Contents¶
- Why Segment Trees Dominate Flexible Range Queries
- Production Use Cases
- Comparison: Segment Tree vs Fenwick vs B-tree Augmentation
- Merging Segment Trees in Dynamic Connectivity
- Iterative vs Recursive — Engineering Reasoning
- Capacity Planning
1. Why Segment Trees Dominate Flexible Range Queries¶
In competitive programming, when a problem says "range query + range or point update + arbitrary monoid" you reach for a segment tree. Fenwick is faster but only handles invertible monoids (sum, xor). Sparse tables are O(1) per query but immutable. Sqrt decomposition is O(√n) per op — worse than segment trees for n > 10⁴. B-tree augmentation is the right answer in databases but overkill in-memory.
The segment tree wins on expressive power per line of code:
| Need | Segment Tree | Fenwick | Sparse Table | B-tree |
|---|---|---|---|---|
| Range sum, point update | yes (O(log n)) | yes, smaller code | no — static | yes, but heavy |
| Range min, point update | yes | no — min not invertible | yes if static | yes |
| Range gcd, point update | yes | no | yes if static | yes |
| Range update + range query | yes (lazy) | partial (diff trick) | no | yes |
| Persistent / versioned | yes | hard | hard | yes (B-link tree) |
| Range k-th smallest | yes (persistent + merge-sort) | no | no | yes |
| 2D range queries | yes (segment-of-segment, O(log² n)) | yes (BIT 2D) | no | yes (R-tree) |
The trade is: segment trees use 4× the memory of Fenwick and have 2–3× the constant, but they handle non-invertible operations and range updates with lazy propagation — both impossible (or contortionist) in Fenwick.
2. Production Use Cases¶
2.1 Time-series and OLAP hierarchical aggregation¶
Time-series databases (InfluxDB, CrateDB, VictoriaMetrics, TimescaleDB) pre-aggregate at multiple granularities — second, minute, hour, day — and use a hierarchy that is morally a segment tree over the timeline. Querying "avg CPU over the last 13 minutes" reads at most O(log T) pre-aggregated buckets instead of T raw points. The CrateDB internal "hyperloglog over time" index uses segment-tree-style bucket reuse for cardinality estimation.
2.2 Real-time leaderboards¶
A game with 5M players supports two operations: "player X's score changed by Δ" (point update by score bucket) and "how many players have score ≥ S?" (range count). A segment tree keyed by score buckets gives both in O(log score_range). Add lazy propagation for "all players in season X lose 10% rating" (range multiply).
2.3 Computational geometry — sweep lines¶
Segment trees were invented (Bentley 1977) for geometric problems. Modern uses inside CGAL, Boost.Geometry, and CAD software: - Klee's measure (length of union of intervals) — segment tree over coordinate-compressed endpoints. - Skyline problem (LeetCode 218 / mining city outlines) — segment tree with lazy max. - Rectangle union area — sweep line + segment tree (LeetCode 850). - 1D point location in arrangements of n line segments — interval tree, conceptually a segment tree variant.
2.4 Quantitative finance¶
A backtest engine processes a 10-billion-tick history. Strategy code asks "max price in window [t1, t2]" tens of thousands of times per second. A segment tree on the (immutable) tick array beats binary-search-on-prefix-tables because it generalizes to "max in a window weighted by volume" with one combine change. Jane Street has published OCaml libraries (Bonsai?) that use segment trees for time-window aggregation.
2.5 Online judges and contest infrastructure¶
Codeforces, Codechef, AtCoder all expose segment-tree problems weekly. Their judge backends use segment trees internally for ranking ("how many submissions before mine got AC?").
2.6 Databases — tablesample and query planners¶
Postgres EXPLAIN uses a segment-tree-shaped data structure to choose between scan plans based on hierarchical row count estimates. This is not the leaf-level access path — that is a B+ tree — but the planner's statistical histogram is segment-tree-like.
2.7 Anti-cheat and fraud detection¶
A pattern-matching service watches a stream of events and asks "have we seen more than 100 of pattern X in any 1-hour window over the last 24h?" — a sliding range max with point inserts. Segment tree → 80 ns per question. Fenwick can't answer max.
3. Comparison: Segment Tree vs Fenwick vs B-tree Augmentation¶
| Aspect | Segment Tree | Fenwick (BIT) | B-tree augmentation |
|---|---|---|---|
| Memory | 4n words | n words | branching-factor × n / B |
| Build | O(n) | O(n log n) standard, O(n) with the trick | O(n log n) inserts |
| Point update | O(log n) | O(log n), 2× smaller constant | O(log n), but page-cache-aware |
| Range query (sum) | O(log n) | O(log n), 2× faster | O(log n) + page reads |
| Range query (min/max) | O(log n) | not supported (non-invertible) | yes, with augmented summaries |
| Range update | O(log n) with lazy | partial — diff trick for range-add only | yes, with per-page summaries |
| Persistent | yes (path copying) | hard | yes (B-link, copy-on-write) |
| Cache locality | poor with recursion, good with iterative | good (linear array, small constants) | excellent (page-aligned) |
| Concurrency | needs per-path locking | needs per-path locking | row/page locking, MVCC |
| Code size | ~60 lines basic, ~150 with lazy | ~10 lines | thousands |
| Best for | in-memory, generic monoid, contest | sums/xor in-memory | on-disk, transactional |
The decision is usually: - Sums only, in-memory, low-latency: Fenwick. - Min/max/gcd or range updates, in-memory: Segment tree. - Persistent on-disk, transactional: B-tree (typically a B+ tree variant; see 09-08-b-tree).
4. Merging Segment Trees in Dynamic Connectivity¶
In link-cut trees and small-to-large merging on trees, each connected component carries its own segment tree of contained vertices (indexed by some statistic). When components join, the smaller segment tree is merged into the larger in O(small · log n) per merge.
Amortized argument: each element is "absorbed into a larger tree" at most O(log n) times (because its component at least doubles each merge), so the total cost across all merges is O(n log² n). This is the "small-to-large" technique, also called DSU on tree / Sack — covered in the graph and trees chapters.
Applied in: - Heavy-light decomposition on a tree, where each chain holds a segment tree of vertex weights. - Euler tour + segment tree for "subtree sum" and "path sum" — pure segment-tree applications layered on tree traversals. - Dynamic connectivity via Holm-Lichtenberg-Thorup — segment trees per Euler tour for subtree operations.
This is where competitive programmers gain a real edge: segment trees are the glue that connects graph problems with range-query problems.
5. Iterative vs Recursive — Engineering Reasoning¶
The iterative bottom-up form (middle.md §4) is preferred in competitive programming and in most performance-critical systems. Reasons:
5.1 Cache locality¶
Recursive segment tree descends from root (index 1) to a leaf at index ~2n + idx. Each step jumps by a factor of ~2 in array index — meaning the CPU rarely prefetches the next access. Iterative form walks upward, with index i and parent i/2 — both close enough that the cache line containing i often also contains its sibling i ^ 1.
5.2 No recursive frames¶
Each recursive call costs ~50 ns of frame setup in JVM, ~30 ns in Go (due to stack growth checks), ~80 ns in CPython. Iterative form runs the equivalent body in ~5 ns per level.
5.3 Branch prediction¶
The iterative form has a clean while loop with predictable l < r exit. The recursive form has branches inside the function body that get re-evaluated per call.
5.4 When NOT to go iterative¶
- Lazy propagation is significantly cleaner recursive. The push-down step needs to happen before descending; iterative versions must push down both endpoint paths before processing.
- Persistent segment trees are recursive by nature (each call returns a new node).
- 2D segment trees combine an outer recursion with an inner — at least one of the two should be recursive for readability.
A standard production segment tree is: - Iterative, sum or min, no lazy, ~25 lines — hot path in the analytics engine. - Recursive with lazy, ~80 lines — cold path for range updates, called at < 10 kHz.
6. Capacity Planning¶
Numbers a senior engineer should know without looking up:
| Aspect | Value | Note |
|---|---|---|
| Memory per node | 8 bytes (int64) | + 8 for lazy tag if lazy tree |
| Total tree memory | 4n × 8 = 32n bytes (sum tree) | 10⁶ items → 32 MB |
| Build throughput | ~10 ns per element | 10⁶ build in ~10 ms |
| Query latency (iterative, hot cache) | ~50 ns | 10⁵ items L2-resident |
| Query latency (recursive, cold cache) | ~300 ns | 10⁷ items DRAM-resident |
| QPS single-core | ~10–20 M | depends on locality |
| Lazy push-down cost | 2× a normal node visit | branch mispredict on the lazy check |
| Lock contention break-even | none up to ~5 M QPS single tree | shard by index range above that |
For a multi-tenant leaderboard at 100 M QPS, shard the segment tree by score-bucket prefix across N CPU cores (or N machines), keeping each shard ~10 M QPS. Use consistent hashing on the prefix to avoid resharding.
The data structure is small and well-understood; the engineering work is picking it correctly and scaling it horizontally when one core is no longer enough. Continue with professional.md for cache-conscious layouts, SIMD, and the wavelet-tree generalization that subsumes "k-th smallest in range" queries.