Union by Rank — Mathematical Foundations and Complexity Theory¶
Table of Contents¶
- Formal Definition
- Correctness — Partition Invariant and the Rank-Height Bound
- The
O(log n)Height Bound for Union-by-Rank Alone - The
O(α(n))Combined Bound (Tarjan & van Leeuwen) - The Tight Lower Bound
Ω(α(n)) - Cache & Memory Layout
- Average-Case / Randomized Linking
- Space–Time Trade-offs: rank byte vs size int
- Comparison with Alternatives
- Open Problems
- Summary
1. Formal Definition¶
Let U = {0, 1, …, n−1} be a universe of elements. A disjoint-set forest maintains a partition of U into blocks, represented by a parent function parent : U → U. Each block is a rooted tree; x is a root iff parent(x) = x. Define
Definition 1.1 (Operations). - makeset(x): parent(x) := x. - find(x): return the root of x's tree. - union(x, y): let rx = find(x), ry = find(y); if rx ≠ ry, set parent of one root to the other (a link).
Definition 1.2 (Rank). Maintain rank : U → ℕ, with rank(x) := 0 at makeset. The union-by-rank link rule for distinct roots rx, ry:
if rank(rx) < rank(ry): parent(rx) := ry
elif rank(rx) > rank(ry): parent(ry) := rx
else: parent(ry) := rx; rank(rx) := rank(rx) + 1
Definition 1.3 (Size). Maintain size : U → ℕ₊, size(x) := 1 at makeset. The union-by-size link rule for distinct roots, with size(rx) ≥ size(ry) (swap otherwise):
Definition 1.4 (Path compression). During find(x), after locating the root r, set parent(z) := r for every z on the path from x to r. (Variants: path halving / path splitting set each parent(z) := parent(parent(z)).)
The standard optimized DSU uses Definition 1.2 (or 1.3) together with Definition 1.4.
Convention on rank under compression. Ranks are never decreased. After a compression, rank(x) may strictly exceed the true height of the subtree at x. We henceforth treat rank as an abstract label satisfying the invariants of §2, not as the exact height.
2. Correctness — Partition Invariant and the Rank-Height Bound¶
2.1 Partition invariant¶
Invariant P. At all times the relation x ≡ y ⟺ find(x) = find(y) is an equivalence relation whose classes are exactly the maintained blocks.
Proof. makeset creates singleton classes (reflexive). find follows a functional parent chain in a finite forest with self-loops only at roots, so it terminates at a unique root; thus ≡ is well-defined and an equivalence relation (reflexive, symmetric, transitive via the shared root). union(x,y) links two roots, merging exactly the two classes of x and y and leaving all others untouched; a no-op when find(x)=find(y). Path compression re-points nodes to their existing root, so find values are unchanged and P is preserved. ∎
2.2 Rank monotonicity invariants¶
The following are maintained by union-by-rank with or without path compression. Let t index time.
Invariant R1 (root ranks only rise). For every element x, rank(x) is non-decreasing over time, and once x becomes a non-root it never becomes a root again, after which rank(x) is frozen.
Proof. rank is written only by the tie case of Def. 1.2, which increments. A node becomes a non-root only when linked under another root; the forest only ever adds links between roots, so a non-root stays a non-root, and its rank is never touched again. ∎
Invariant R2 (strict increase along parent edges). If parent(x) ≠ x then rank(parent(x)) > rank(x), and this strict inequality persists.
Proof. When x is linked under y it is because rank(y) ≥ rank(x); in the equal case rank(y) is then incremented, giving rank(y) > rank(x) at the moment of linking. By R1, rank(x) is frozen thereafter while rank(parent-side ancestors) can only rise, so the inequality persists. Path compression sets parent(x) to an ancestor of the old parent, which by transitivity of R2 has strictly larger rank than x; hence the inequality is preserved (indeed strengthened). ∎
Corollary 2.3 (Rank upper-bounds height). Along any root-to-leaf path, ranks strictly decrease by R2. Hence the height of a tree is at most the rank of its root. (Without compression this is tight up to the bound of §3; with compression heights can be far smaller, but the inequality height ≤ rank(root) still holds.) ∎
This corollary is why we never repair ranks: every analysis below needs only R1, R2, and the counting bound of §3 — all of which survive compression untouched.
3. The O(log n) Height Bound for Union-by-Rank Alone¶
We bound rank, hence height, in the compression-free forest.
Lemma 3.1 (Subtree-size lower bound). In a union-by-rank forest without path compression, any node x with rank(x) = r is the root of a subtree containing at least 2^r nodes.
Proof (induction on time / on r). Base. At makeset, rank = 0 and the subtree has 1 = 2^0 node. Step. rank(x) becomes r only via the tie case: two roots x and y, both of rank r−1, are linked (y under x), and rank(x) is set to r. By the inductive hypothesis each subtree had ≥ 2^{r−1} nodes, so the merged subtree at x has ≥ 2^{r−1} + 2^{r−1} = 2^r nodes. A non-tie link does not change rank(x) and only adds nodes beneath x, preserving the bound. ∎
Theorem 3.2 (Maximum rank). In a forest of n elements built by union-by-rank, every rank satisfies r ≤ ⌊log₂ n⌋.
Proof. By Lemma 3.1 a node of rank r commands ≥ 2^r distinct nodes, all in U, so 2^r ≤ n, giving r ≤ ⌊log₂ n⌋. ∎
Corollary 3.3 (Operation cost). Without compression, every tree has height ≤ ⌊log₂ n⌋ (Cor. 2.3 + Thm 3.2), so find and therefore union run in O(log n) worst case. The symmetric statement holds for union-by-size, replacing "rank r ⇒ ≥ 2^r nodes" with "a node at depth d lies in a tree whose size at least doubled d times," again giving height ≤ ⌊log₂ n⌋. ∎
Remark 3.4. Lemma 3.1's exact form (size ≥ 2^rank) holds only without compression. Compression can shrink a subtree while leaving the root's rank fixed, so afterward size ≥ 2^rank may fail. What survives is the weaker, sufficient fact used everywhere downstream: ranks are bounded by ⌊log₂ n⌋ and strictly increase along parent edges (R2, Thm 3.2). These two facts are time-stable because Thm 3.2's argument can be re-run at the moment each rank value is created (a tie among compression-free-or-not roots still merges two rank-(r−1) roots, each of which — by R1 — acquired rank r−1 at a time when it dominated ≥ 2^{r−1} distinct elements that were assigned to it and never reassigned). The standard treatment (CLRS Lemma 21.x) formalizes this as: at most ⌊n / 2^r⌋ nodes ever attain rank r.
Lemma 3.5 (Rank-class population, CLRS). For each r ≥ 0, at most ⌊n / 2^r⌋ elements ever have rank exactly r, including under path compression.
Proof idea. When an element first reaches rank r, "charge" to it the ≥ 2^r elements that were in its subtree at that instant (Lemma 3.1 applied at creation time, before any later compression). By R1 an element's rank, once frozen as a non-root, never changes, and distinct rank-r creations charge disjoint element sets. Hence the number of rank-r elements times 2^r is at most n. ∎
Lemma 3.5 is the bridge that lets the O(α(n)) analysis of §4 tolerate compression.
4. The O(α(n)) Combined Bound (Tarjan & van Leeuwen)¶
4.1 The inverse Ackermann function¶
Define the Ackermann-style hierarchy (one common normalization):
A_k grows explosively in k: A_1(j) ≈ 2j, A_2(j) ≈ 2^{j}·(j+1), A_3 is tower-of-twos territory, A_4 is beyond astronomical. The inverse Ackermann function is
Because A_4(1) already exceeds 2^{2^{2^{2^{16}}}}, we have α(n) ≤ 4 for every n ≤ 2^{2^{2^{2^{16}}}} — i.e. for any n that could be represented in the observable universe. Thus α(n) is, for engineering purposes, a constant ≤ 4.
4.2 The theorem¶
Theorem 4.1 (Tarjan 1975; Tarjan & van Leeuwen 1984). A sequence of m find/union operations on n elements, using union by rank (or size) together with path compression (or path halving / splitting), runs in total time
i.e. O(α(n)) amortized per operation. The same holds for path halving and path splitting; it holds for "union by rank" and "union by size," and for any linking-by-index rule that keeps ranks satisfying R1,R2,Lemma 3.5.
4.3 Proof structure: the rank / level / iter potential¶
We sketch the canonical amortized argument (CLRS / Tarjan), which is the part to internalize.
Setup. Fix the final maximum rank; by Thm 3.2 / Lemma 3.5 all ranks are in [0, ⌊log₂ n⌋]. Each find(x) walks x → parent(x) → … → root, then compresses. We pay O(1) for each node on the path; total work is O(path length) summed over all finds. The whole game is bounding Σ path length.
Two functions on non-root nodes (with p(x) := parent(x)), defined using the frozen ranks (legitimate by R1):
level(x)= the largest integerksuch thatrank(p(x)) ≥ A_k(rank(x)). It measures "how explosively bigger the parent's rank is."level(x) ∈ {0, 1, …, α(n)}, because oncekreachesα(n)we haveA_k(rank(x)) ≥ A_{α}(1) ≥ n > rank(p(x))is impossible, cappinglevel.iter(x)= the largestisuch that applyingA_{level(x)}itimes torank(x)still does not exceedrank(p(x)). It refineslevel.iter(x) ∈ {1, …, rank(x)}.
Potential function. Assign each non-root node x the potential
Roots and rank-0 nodes get a fixed Φ = α(n)·rank(x). One verifies 0 ≤ Φ(x) ≤ α(n)·rank(x).
Key monotonicity lemma. Whenever a node x is not the root and not a child of the root, and a find passes through it and compresses it, p(x) is replaced by a strictly higher-rank ancestor. This either (a) strictly increases level(x), or (b) keeps level(x) fixed but strictly increases iter(x). Either way Φ(x) strictly decreases by at least 1. (This is the heart of the proof: each compressed node "pays for itself" out of its own potential.)
Charging / bucketing argument. Consider one find of path length L. Mark the nodes on the path: - The root and the child of the root: at most 2 nodes — charge to the operation, O(1) each. - For the remaining L − 2 interior nodes, bucket them by level(x) ∈ {0,…,α(n)}. Within each level bucket, at most one node per distinct rank value can be charged to the operation directly; all the others have the property above and hence each pays one unit of its own Φ, which strictly drops. Since there are α(n)+1 levels and ranks lie in [0, log n], the number of nodes charged to the operation is O(α(n)); every other node on the path is charged to its own potential decrease.
Accounting. Total cost = Σ_ops O(α(n)) (direct charges) + Σ_nodes (total decrease in Φ over all time). The initial total potential is O(n · α(n) · log n)... and here the bucketing by level is exactly what tightens this to O((m+n)·α(n)): the careful version (CLRS Thm 21.14) shows the amortized cost per operation is O(α(n)), not O(α(n) log n), because Φ decreases are bounded by the number of level/iter refinements, of which there are O(α(n)) per node across its lifetime. Summing gives
Reader's takeaway. The proof needs three time-stable facts about ranks: they start at 0, strictly increase along parent edges (R2), and obey the population bound (Lemma 3.5). All three survive path compression because we never decrease ranks — which is precisely the justification for the "don't fix ranks" rule stated informally at junior/middle level.
5. The Tight Lower Bound Ω(α(n))¶
The O(α(n)) bound is not an artifact of a weak analysis — it is essentially optimal.
Theorem 5.1 (Tarjan 1979; Cell-probe / pointer-machine lower bound). In the separable pointer machine model, any algorithm for the disjoint-set-union problem requires Ω(m · α(m, n)) time on some sequence of m operations over n elements. Hence amortized Ω(α(n)) per operation; no pointer-machine DSU can be amortized O(1).
Theorem 5.2 (Fredman & Saks 1989; cell-probe model). In the cell-probe model with words of O(log n) bits, the union-find problem (and the closely related list-splitting / partial-sums problems) requires Ω(α(n)) amortized time per operation. This strengthens Theorem 5.1 to a model that permits arbitrary computation on machine words, ruling out clever bit-tricks as a route to O(1).
Consequences. - The Tarjan–van Leeuwen O(α(n)) upper bound is tight to within constant factors in both the pointer-machine and cell-probe models. - O(1) amortized union-find is impossible in these standard models; the inverse Ackermann factor is intrinsic to the problem, not to the algorithm. - (Caveat) In restricted settings — e.g. the incremental / interval union-find of Gabow & Tarjan (1985), where the union structure is known in advance to be a tree of intervals — O(1) is achievable, because the lower bound's adversary cannot be realized. The general bound stands.
6. Cache & Memory Layout¶
DSU is dominated by random access into parent[], so memory behavior matters as much as the α(n) factor.
- Array-of-structs vs struct-of-arrays. Keeping
parent[],rank[]/size[]as separate arrays (SoA) is usually better:findtouches onlyparent[]on the hot path, so it streams one array;uniontouches the balancing array only at the two roots. Interleaving{parent, rank}per element (AoS) wastes a cache line during compression-heavyfind. - Compression improves locality over time. Path halving/splitting shorten chains, so later
finds touch fewer, hotter cache lines. This is a practical reason halving often beats full compression: it compresses incrementally with one pass and no second walk. - Rank as a byte.
rank ≤ ⌊log₂ n⌋ ≤ 64, so auint8rank[]is exact for any conceivablen, packing 64 ranks per cache line — negligible footprint and great locality. - False sharing in concurrent DSU. Two cores updating ranks/sizes of roots that share a cache line contend. Pad or shard hot roots; or use union-by-random (§7) to avoid mutable balancing fields entirely.
7. Average-Case / Randomized Linking¶
Union by random index ("union by rank-less linking"). Assign each element an independent uniform priority π(x) ∈ [0,1) at makeset; on union, the root with the larger π becomes the parent. No mutable rank/size is needed.
Theorem 7.1. With random-priority linking (no compression), the expected height of any tree on k nodes is O(log k); more precisely the structure is distributed like a random binary-search-tree / treap, giving expected depth ≈ 2 ln k and O(log k) with high probability.
Why it matters. Random linking removes the read-modify-write on rank/size during union, which is the chief source of CAS contention in lock-free DSU (§ senior). The cost is that the height bound is expected, not worst-case, and you lose free component sizes. Combined with path compression, randomized linking also achieves O(α(n)) expected amortized (Goel, Khanna, Larkin, Tarjan, 2014, analyze randomized linking + splitting and show near-optimal bounds).
Average-case under random unions. If the union sequence itself is random (uniform pair each step), even naive linking yields O(log n) expected height; balancing's worst-case guarantee is what protects against adversarial sequences, which is what the Ω(α(n)) lower bound exploits.
8. Space–Time Trade-offs: rank byte vs size int¶
| Field | Width that suffices | Bytes for n=10⁹ | Extra capability |
|---|---|---|---|
parent | int32 (n < 2³¹) | 4 GB | required |
rank | uint8 (max ~64) | 1 GB | none beyond balancing |
size | int32 (n < 2³¹) or int64 | 4–8 GB | component cardinality, "largest component," percolation, weighted merges |
- Union by rank total ≈
5nbytes; union by size ≈8n(or12nwith 64-bit sizes). - Both yield identical asymptotics (
O(α(n))): the choice is purely memory-vs-features. - Overflow note.
ranknever overflows auint8.sizesums can overflowint32oncen > 2³¹; useint64for huge universes. - Time is identical to first order — the balancing array is read at two roots per union and written at one; neither dominates the
findcost.
Rule: prefer size unless you are memory-bound, since the marginal 3n bytes buy a metric you frequently need anyway.
9. Comparison with Alternatives¶
| Scheme | find worst case | find amortized | union | Extra space | Bound type |
|---|---|---|---|---|---|
| Naive link, no compress | O(n) | O(n) | O(n) | 0 | worst |
| Union by rank/size, no compress | O(log n) | O(log n) | O(log n) | n (byte/int) | worst |
| Path compression, no balancing | O(n) | O(log n) | O(log n) am. | 0 | amortized |
| Rank/size + compression | O(log n) | O(α(n)) | O(α(n)) am. | n | amortized, optimal |
| Random link + splitting | O(log n) exp. | O(α(n)) exp. | O(α(n)) exp. | 0 | expected, optimal |
| Interval union-find (Gabow–Tarjan) | — | O(1) | O(1) | O(n) | amortized (restricted) |
Path compression without balancing already gives amortized O(log n) (Tarjan & van Leeuwen show Θ(log n) amortized for compression-only); it is the combination with rank/size that drops the exponent to α(n).
10. Open Problems¶
- Exact constants. The precise constant in front of
α(n)differs between full compression, path halving, and path splitting; pinning down the optimal linking+compression pairing's constant factor is still refined in the literature (Patwary, Blelloch, et al., experimental; Goel–Khanna–Larkin–Tarjan, theoretical). - Persistence with optimal bounds. Fully-persistent or confluently-persistent union-find with
O(α(n))per operation and small space is not fully resolved; the rollback variant givesO(log n)without compression. - Deletions. Union-find with element/edge deletions (true dynamic connectivity with deletions) cannot use plain DSU; the best general structures are
O(log² n)-ish (Holm–de Lichtenberg–Thorup). Whether DSU-likeα(n)bounds are reachable with deletions in restricted models is open. - Parallel / concurrent optimality. The exact span/work trade-off for wait-free concurrent union-find matching the sequential
α(n)work bound is an active area (Jayanti–Tarjan–Boix-Adserà give wait-free analyses; optimality questions remain). - Lower bounds in richer models. Whether
Ω(α(n))is unavoidable in models stronger than cell-probe (with structured parallelism or nonstandard word operations) is not fully settled.
11. Summary¶
- Rank is a non-decreasing label satisfying: starts at 0; strictly increases along parent edges (
R2); at most⌊n/2^r⌋elements ever reach rankr(Lemma 3.5). These are the only facts the complexity proofs use, and all three survive path compression — which is exactly why ranks are never decremented even though they over-estimate height after compression. - Alone, union by rank/size bounds height by
⌊log₂ n⌋(Lemma 3.1: a rank-rroot commands≥ 2^rnodes), giving worst-caseO(log n). - Combined with path compression, the Tarjan–van Leeuwen theorem gives amortized
O(α(n))per operation, proved by the rank/level/iter potential with a level-bucketing charging argument — effectively constant sinceα(n) ≤ 4for any physicaln. - The bound is tight: Tarjan (1979, pointer machine) and Fredman–Saks (1989, cell-probe) prove
Ω(α(n))amortized, soO(1)union-find is impossible in standard models. - Rank vs size are asymptotically identical; size costs ~
3nmore bytes but yields component cardinalities and merges cleanly in distributed settings. Random linking trades worst-case for expected bounds and removes mutable balancing fields, easing concurrency.
Key references: Tarjan (1975) "Efficiency of a good but not linear set union algorithm"; Tarjan & van Leeuwen (1984) "Worst-case analysis of set union algorithms"; Tarjan (1979) lower bound; Fredman & Saks (1989) "The cell probe complexity of dynamic data structures"; CLRS Chapter 21; Goel, Khanna, Larkin & Tarjan (2014) on randomized linking.