Skip to content

Tarjan's SCC — Mathematical Foundations and Complexity Theory

Table of Contents

  1. Formal Definition
  2. Correctness Proof — the Lowlink Invariant and Root-Pop Correctness
  3. Complexity O(V + E)
  4. Worked Trace — DFS Tree with disc/low
  5. Comparison with Kosaraju and Gabow (constants, passes)
  6. Parallel SCC Complexity
  7. Cache Behavior
  8. Average-Case and Random-Graph Analysis
  9. Space–Time Trade-offs
  10. Comparison with Alternatives (asymptotics + constants)
  11. Reference Implementations — Gabow, Iterative Tarjan, Condensation
  12. Condensation-DAG Properties
  13. Open Problems (parallel and dynamic SCC)
  14. Summary

1. Formal Definition

Let G = (V, E) be a finite directed graph with n = |V| and m = |E|.

Definition 1.1 (Reachability). Write u ⇝ v if there is a directed path (possibly of length 0) from u to v. Reachability is reflexive and transitive — a preorder on V.

Definition 1.2 (Mutual reachability). Define u ≈ v iff u ⇝ v and v ⇝ u.

Proposition 1.3 ( is an equivalence relation). is reflexive (u ⇝ u by the length-0 path), symmetric (by definition), and transitive (if u ≈ v and v ≈ w, concatenating paths gives u ⇝ w and w ⇝ u). ∎

Definition 1.4 (Strongly Connected Component). An SCC is an equivalence class of . The SCCs partition V uniquely. The SCC of v is [v] = { u ∈ V : u ≈ v }.

Definition 1.5 (Condensation). The condensation G^{SCC} = (V^{SCC}, E^{SCC}) has one node per SCC, with an edge [u] → [w] iff [u] ≠ [w] and some edge x → y ∈ E has x ∈ [u], y ∈ [w].

Proposition 1.6 (Condensation is a DAG). Suppose G^{SCC} had a directed cycle C_1 → C_2 → ... → C_k → C_1 with k ≥ 2 distinct SCCs. Then every C_i is reachable from every other along the cycle, so for any x ∈ C_1, y ∈ C_2 we get x ⇝ y ⇝ x, i.e. x ≈ y, forcing C_1 = C_2 — contradiction. ∎

Definition 1.7 (DFS classification). Run DFS on G. Assign disc[v] = the time v is first discovered (a strictly increasing counter). For a non-tree edge u → v at the moment it is examined: - it is a back edge if v is an ancestor of u in the DFS forest (v is on the recursion path), - a forward edge if v is a proper descendant already finished, - a cross edge otherwise (v finished, not an ancestor/descendant).

In Tarjan, the only distinction that matters is onStack[v]: a vertex is on the explicit stack iff it has been discovered and its SCC has not yet been output.


Definition 2.1 (Lowlink). For vertex v, let T(v) be the set of vertices in the DFS subtree rooted at v. Define

low[v] = min ( { disc[v] }
             ∪ { disc[w] : ∃ x ∈ T(v) and a non-tree edge x → w
                           with w on the stack at the time x → w is examined } ).

Equivalently, low[v] is the smallest discovery index reachable from v using zero or more tree edges within T(v) followed by at most one non-tree edge that targets an on-stack vertex. The algorithm computes exactly this value via the two update rules.

2.2 The update rules compute low correctly

The algorithm initializes low[v] = disc[v] and applies, while scanning edges v → w:

  • (R1) tree edge (w unvisited): recurse on w, then low[v] ← min(low[v], low[w]).
  • (R2) edge to on-stack w: low[v] ← min(low[v], disc[w]).
  • (R3) edge to off-stack w (finished SCC): no update.

Lemma 2.2. After v finishes, the computed low[v] equals the value in Definition 2.1.

Proof. The minimum in Definition 2.1 ranges over x ∈ T(v). Decompose by where x lies: - x = v itself: its qualifying non-tree edges v → w with on-stack w contribute disc[w], captured exactly by (R2). The base value disc[v] is the initialization. - x in the subtree of a child c of v (so x ∈ T(c)): by the inductive hypothesis on c (which finishes before v because DFS is post-order on tree edges), low[c] already equals the Definition-2.1 minimum over T(c). Rule (R1) folds low[c] into low[v].

One subtlety justifies (R3): if v → w targets an off-stack w, then w's SCC was already popped, so w is in a different SCC that is a strict predecessor of [v] in the condensation order; w is not an ancestor on the current path and cannot be in T(v)'s mutual-reachability closure with v. Including disc[w] would not correspond to any vertex reachable-back-to within open components, so it is correctly ignored. Taking the minimum over all three cases yields exactly Definition 2.1. ∎

2.3 Roots and the pop test

Definition 2.3 (Root). The root of an SCC C is the vertex r ∈ C with the smallest disc value (the first member of C discovered).

Lemma 2.4 (Subtree containment). Let r be the root of SCC C. Then every u ∈ C satisfies u ∈ T(r); that is, the whole component lies in r's DFS subtree, and disc[r] ≤ disc[u].

Proof. Take u ∈ C. Since r ≈ u, there is a path r ⇝ u and a path u ⇝ r. At the time r is discovered, every vertex of C is still unvisited (because r has the minimum disc in C). By the white-path theorem for DFS, since there is a path from r to u consisting entirely of currently-white (unvisited) vertices — such a path exists inside C because all of C is white when r is discovered — u becomes a descendant of r in the DFS forest, i.e. u ∈ T(r). Hence disc[r] ≤ disc[u]. ∎

Theorem 2.5 (Root characterization). When DFS finishes vertex u, low[u] == disc[u] iff u is the root of its SCC.

Proof.

(⇐) Suppose u is the root r of C. We show low[r] ≥ disc[r] (the reverse is automatic since low only ever decreases from disc). Take any x ∈ T(r) with a qualifying non-tree edge x → w to an on-stack w, contributing disc[w] to low[r]. We claim disc[w] ≥ disc[r]. Since w is on the stack, w's SCC C' has not been popped; w is reachable from r (through x ∈ T(r)). Suppose for contradiction disc[w] < disc[r]. Then w ∉ C (as r is the minimum-disc member of C), and disc[w] < disc[r] means w was discovered before r. Because w is still on the stack and w ⇝ r would close mutual reachability... more carefully: w on the stack with disc[w] < disc[r] implies w is a proper ancestor of r on the DFS path (the stack, ordered by disc, is exactly the ancestor chain plus open siblings; an on-stack vertex reachable from T(r) with smaller disc must be an ancestor of r). An ancestor w reachable from r's subtree gives r ⇝ w (via x) and w ⇝ r (ancestor reaches descendant via tree edges), so w ≈ r, i.e. w ∈ C. That contradicts disc[w] < disc[r] = min disc over C. Hence every contribution disc[w] ≥ disc[r], so low[r] = disc[r].

(⇒) Suppose u is not the root; let r be the root of [u], so disc[r] < disc[u] and u ∈ T(r) (Lemma 2.4). Since u ≈ r, there is a path u ⇝ r. Walk this path from u; let x → w be the first edge that leaves T(u)... we instead argue directly: u ⇝ r means starting from u we can reach r. Consider the path u ⇝ r. Because r is an ancestor of u and remains on the stack while u is processed (an SCC is popped only at its root, and r finishes after u), the path from u to the ancestor r must contain a non-tree edge x → w with x ∈ T(u) and w an on-stack ancestor with disc[w] ≤ disc[r] < disc[u] (the first edge of the path that points to a vertex with discovery time < disc[u] targets such a w). That edge contributes disc[w] ≤ disc[r] < disc[u] to low[u] via (R2) or via inheritance (R1). Therefore low[u] ≤ disc[w] < disc[u], so low[u] ≠ disc[u]. ∎

Theorem 2.6 (Pop correctness). When u is a root and the algorithm pops the stack down to and including u, the popped set is exactly [u].

Proof. By Lemma 2.4, all of [u] = C lies in T(u), and every member has disc ≥ disc[u], so each was pushed onto the stack after u and has not yet been popped (no member's SCC has closed, since u is being finished now and u is their root). Thus all of C sits at or above u on the stack. Conversely, any vertex y currently above u on the stack was discovered after u, lies in T(u), and is not yet assigned to any SCC. We show y ∈ C. Since y is on the stack and not yet popped while u (its DFS ancestor) is finishing as a root, y's own root r_y has not been reached/finished, and r_y must be u itself: if r_y were a proper descendant of u, it would have finished and popped y before u finishes (DFS post-order), contradicting y still on the stack; if r_y were a proper ancestor of u, then disc[r_y] < disc[u] and r_y ≈ y ≈(through reachability)… but u's being a root means low[u] = disc[u], so no vertex of T(u) reaches an on-stack vertex older than u; hence y cannot belong to an older component. So r_y = u and y ∈ C. The popped set equals C = [u]. ∎

This establishes total correctness: every SCC is output exactly once, at the moment its root finishes, and the popped set is precisely the component.

2.4 Why (R2) must use disc[w], not low[w]

The lowlink definition counts paths with at most one non-tree edge. Rule (R2) adds that one edge x → w and records disc[w] — the true discovery time of the endpoint. If we used low[w] instead, we would credit v with reaching whatever w can reach back to, i.e. paths with two or more non-tree edges routed through w. Such a path need not correspond to a back edge from v's subtree, and low[w] may still be mutating (if w's subtree is not finished). Both effects can lower low[v] below disc[r_v], falsely demoting a root and merging [v] with an unrelated component. The conservative disc[w] is exactly the value the invariant in Definition 2.1 requires.


3. Complexity O(V + E)

Theorem 3.1. Tarjan's algorithm runs in Θ(V + E) time and Θ(V) auxiliary space (excluding the input graph).

Proof.

Time. DFS visits each vertex once: discovery sets disc, low, pushes to the stack, marks onStackO(1) per vertex, O(V) total. Each edge u → v is examined exactly once when scanning adj[u]; the work per edge is O(1) (one of R1/R2/R3). Tree edges trigger a recursive call, but the number of tree edges is < V. The total pushes to the explicit stack is V (each vertex pushed once), and total pops is V (each popped once when its SCC is output); the inner pop loop, summed over all root events, runs V times total — amortized O(1) per vertex, not O(V) per root. Summing: O(V) + O(E) = O(V + E).

Space. Arrays disc, low, onStack are Θ(V). The explicit stack holds at most V vertices. The recursion (or simulated) stack has depth at most V. Total auxiliary space Θ(V). ∎

Remark 3.2. Each vertex is pushed onto and popped from the explicit stack exactly once over the whole run; each edge is relaxed exactly once. There is no hidden logarithmic factor — unlike Dijkstra, Tarjan has no priority queue.

3.3 The amortization argument, made precise

The only super-constant-looking step is the inner while that pops the stack at a root. Let p_r be the number of pops performed when root r finishes. Then

Σ_r p_r = Σ_r |[r]| = Σ over all SCCs of their sizes = |V| = n,

because the SCCs partition V and each vertex is popped exactly in the pop-burst of its own root. So although a single root event can pop Θ(n) vertices (one giant SCC), the sum over the whole execution is exactly n pops. Charging each pop O(1), the total pop work is O(n), not O(n) per root. This is a textbook aggregate-amortization argument: a potential function Φ = |stack| rises by 1 on each push (n pushes total) and falls by 1 on each pop (n pops total), giving amortized O(1) per vertex for all stack manipulation. Edge work is O(1) per edge by direct counting. Hence Θ(V + E). ∎


4. Worked Trace — DFS Tree with disc/low

Concrete instance (two interlocking cycles joined by a bridge edge):

adjacency:  0:[1]   1:[2]   2:[0,3]   3:[4]   4:[5]   5:[3]
cycles:     {0,1,2} via 0→1→2→0       {3,4,5} via 3→4→5→3
bridge:     2 → 3   (the only cross-component edge)

DFS tree rooted at 0; ‑‑> marks a non-tree edge to an on-stack vertex; the annotation shows the (R2) contribution it makes:

   0  disc=0  low=0
   │  (tree)
   1  disc=1  low=0          low[1] ← min via child = low[2]=0
   │  (tree)
   2  disc=2  low=0          2‑‑>0 back edge: low[2] ← min(2, disc[0]=0)=0
   │  (tree 2→3)             2‑‑>3 is a TREE edge (3 unvisited when scanned)
   3  disc=3  low=3          ROOT of {3,4,5}: low[3]==disc[3]==3
   │  (tree)
   4  disc=4  low=3          low[4] ← min via child = low[5]=3
   │  (tree)
   5  disc=5  low=3          5‑‑>3 back edge: low[5] ← min(5, disc[3]=3)=3

Event log (D = discover, F = finish, the stack grows left→right):

t event disc low SCC stack onStack emit
0 D 0 d0=0 l0=0 0 {0}
1 D 1 d1=1 l1=1 0 1 {0,1}
2 D 2 d2=2 l2=2 0 1 2 {0,1,2}
2→0 (R2) l2=min(2,0)=0 0 1 2
3 D 3 (tree 2→3) d3=3 l3=3 0 1 2 3 +3
4 D 4 d4=4 l4=4 0 1 2 3 4 +4
5 D 5 d5=5 l5=5 0 1 2 3 4 5 +5
5→3 (R2) l5=min(5,3)=3
6 F 5 l5=3≠d5 (no pop)
7 F 4 (R1: l4←l5) l4=3≠d4 (no pop)
8 F 3 (R1: l3←l4) l3=3==d3 pop 5,4,3 −5,−4,−3 {3,4,5} id=0
9 F 2 (R1: l2←0) l2=0≠d2 0 1 2 (no pop)
10 F 1 (R1: l1←l2) l1=0≠d1 0 1 2 (no pop)
11 F 0 (R1: l0←l1) l0=0==d0 pop 2,1,0 empty {0,1,2} id=1

Invariant check. At every F event, the proof's claim holds: low[v] == disc[v] fires exactly at the two true roots (3 and 0) and never at the four non-roots (5,4,2,1) — each of which had its low pulled strictly below its disc by a path leading to an older on-stack vertex (Theorem 2.5, ⇒ direction). The emission order {3,4,5} then {0,1,2} is the reverse topological order of the condensation id1 → id0 (Theorem 2.6 + Prop. 1.6): the sink component gets the smaller id. Total pops across both bursts = 3 + 3 = 6 = |V|, confirming the amortization of §3.3.


5. Comparison with Kosaraju and Gabow

Algorithm DFS passes Aux structures Transpose Comparisons/edge Worst-case time Space
Tarjan (1972) 1 disc, low, onStack, stack no disc/low min per edge Θ(V+E) Θ(V)
Kosaraju (1978) 2 visited, finish stack, compID yes none (pure reachability) Θ(V+E) Θ(V+E)
Gabow (2000) / Cheriyan–Mehlhorn (1996) 1 disc, two stacks (S, B) no stack-pop comparisons Θ(V+E) Θ(V)

Constants. All three are linear, but:

  • Tarjan does one DFS and O(1) min-arithmetic per edge. No graph duplication.
  • Kosaraju does two full DFS traversals plus building the transpose — effectively touching every edge ~3 times (forward DFS, transpose construction, transpose DFS) and doubling edge memory. In practice ~1.5–2× slower than Tarjan.
  • Gabow's path-based algorithm eliminates the low[] array, replacing it with a boundary stack B. On an edge to an on-stack vertex w it pops B while disc[B.top] > disc[w]. The amortized number of B-pops is O(V). It performs slightly fewer comparisons than Tarjan on some inputs and is often considered the "cleanest" single-pass formulation, with the same Θ(V) space and no transpose.

Output ordering. Tarjan and Gabow emit SCCs in reverse topological order of the condensation. Kosaraju emits them in topological order of the condensation when the second pass processes vertices in decreasing finish time. Either way the ordering is obtained for free.


6. Parallel SCC Complexity

Tarjan is inherently sequential: its correctness depends on a total DFS order and a single shared stack, and DFS is P-complete (lexicographically-first DFS is not known to parallelize). So parallel SCC uses different algorithms.

Forward–Backward (Fleischer–Hendrickson–Pınar 2000). Pick a pivot p; compute forward set F = Desc(p) and backward set B = Anc(p) via parallel BFS; then F ∩ B is the SCC of p, and the three remainders recurse independently.

  • Span (parallel depth): O(D · log V) expected for graphs of diameter D, since each reachability is a parallel BFS of depth O(D) and the recursion depth is O(log V) expected on many graph classes.
  • Work: O((V + E) · log V) expected — not work-optimal; it pays a log V factor over sequential Tarjan.
  • Worst case (e.g. a single long path) the work degrades toward O(V · (V + E)).

Coloring (Orzan 2004). Propagate maximum vertex labels forward to fixpoint, then peel SCCs by backward reachability from each label's source. Work O(V · (V + E)) worst case; fully vertex-centric (Pregel/BSP), so it suits distributed and GPU settings despite higher total work.

Theoretical NC question. Whether SCC (equivalently, transitive closure structure) admits a work-optimal polylog-depth parallel algorithm is tied to the complexity of reachability. Reachability is in NL and the parallel-reachability/transitive-closure bounds (O(log² V) depth, O(V³) work via repeated squaring; or M(V) work via matrix methods) govern what is achievable. No O(V+E)-work polylog-depth SCC algorithm is known — this is an open problem (Section 10).


7. Cache Behavior

SCC is a graph traversal, so its memory pattern is dominated by pointer-chasing through the adjacency structure.

  • Adjacency-list (lists of lists): each neighbor visit may miss cache; poor locality, Θ(E) largely-random accesses.
  • CSR (compressed sparse row): offsets[] + neighbors[] flat arrays. Neighbors of a vertex are contiguous, giving sequential reads within a vertex's edge list; the indirection into disc[neighbor] is still random. CSR typically gives 2–4× speedups over lists-of-lists on large graphs purely from layout.
  • disc/low/onStack arrays are indexed by vertex id; their access pattern follows the (random) traversal order, so they incur cache misses proportional to distinct vertices touched, Θ(V).
  • Explicit stack is accessed LIFO — excellent locality (top of stack stays hot).

The external-memory cost of DFS-based SCC is essentially that of DFS itself: Θ(V + E/B) I/Os in the naive model is not achievable — DFS is notoriously cache-hostile, and Θ(V + E) I/Os in the worst case for general graphs. Cache-efficient SCC for massive graphs therefore favors the BFS-based FB/coloring algorithms (BFS is far more cache- and I/O-friendly than DFS) even at higher total work.


8. Average-Case and Random-Graph Analysis

SCC structure of G(n, p) (directed Erdős–Rényi). A sharp threshold governs the emergence of a giant SCC:

  • If n·p < 1 (sub-critical), all SCCs are O(log n) in size; almost surely no giant component. The condensation is "wide and flat."
  • If n·p > 1 (super-critical), a unique giant SCC of size Θ(n) emerges almost surely, plus many small ones. This mirrors the undirected giant-component threshold (Karp 1990, The transitive closure of a random digraph).

Implications for the algorithm. Tarjan's runtime is Θ(V+E) regardless of structure, so the random-graph result does not change asymptotics — but it predicts the shape of the output: in real-world digraphs (web graph, call graphs) one observes exactly this "bow-tie" structure — one giant SCC plus many trivial ones — which is why trimming trivial SCCs (Section 3 of the Senior file) is so effective: the super-critical regime leaves a single big core after most vertices peel off.

Expected recursion depth. On G(n,p) the DFS tree height is Θ(log n) in the sub-critical regime but can be Θ(n) in the super-critical regime (the giant SCC yields a long DFS path), which is precisely why iterative Tarjan is needed in practice.


9. Space–Time Trade-offs

Variant Time Space Trade-off
Recursive Tarjan Θ(V+E) Θ(V) + native stack O(V) Simplest; native stack risks overflow at depth O(V)
Iterative Tarjan Θ(V+E) Θ(V) + heap call stack O(V) Same bounds; depth bounded by heap, not native stack
Kosaraju Θ(V+E) Θ(V+E) Conceptually simplest; pays transpose memory
Gabow (two stacks) Θ(V+E) Θ(V) Drops low[]; uses boundary stack
FB / coloring super-linear work Θ(V+E) Parallelizable; trades work for span

The low[] array (one int per vertex) is the price Tarjan pays to do one pass. Gabow shows that price can be paid in a stack instead, but the total Θ(V) space is the same. There is no known SCC algorithm using o(V) auxiliary space without revisiting the graph many times (sublinear-space reachability relates to L vs NL).


10. Comparison with Alternatives (asymptotics + constants)

Task Best approach Complexity
All SCCs, single machine Tarjan / Gabow Θ(V+E)
All SCCs, teaching/auditing Kosaraju Θ(V+E), 2 passes
All SCCs, parallel/distributed Forward–Backward (+ trim) O((V+E) log V) expected work
Just "is there a cycle?" DFS with color marks Θ(V+E), no low[] needed
Just "is G strongly connected?" one DFS from s + one on transpose from s Θ(V+E) (cheaper than full SCC)
2-SAT satisfiability Tarjan on implication graph Θ(V+E)
Mutual reachability for all pairs transitive closure Θ(V·E) or O(V^ω)

For the special question "is the whole graph one SCC?", you do not need full Tarjan: a single DFS from any s reaching all vertices, plus a single DFS from s on the transpose reaching all vertices, suffices (G is strongly connected iff both succeed). This has the same Θ(V+E) bound but smaller constants and no low[]/stack machinery.


11. Reference Implementations — Gabow, Iterative Tarjan, Condensation

Three implementations that exercise the theory above: Gabow's path-based SCC (no low[] array — the boundary stack is the lowlink, §5), iterative Tarjan (overflow-proof, §3.3 amortization), and a condensation builder (§12). Go, then Java, then Python.

11.1 Gabow's path-based SCC

Gabow keeps two stacks: S (all open vertices) and B (candidate SCC boundaries, holding disc indices). On an edge to an on-stack vertex w, pop B while its top exceeds disc[w] — collapsing every tentative boundary newer than w. When v finishes and B.top == disc[v], v is a root: pop S down to v.

Go

package main

import "fmt"

func gabowSCC(adj [][]int) (compID []int, numComps int) {
    n := len(adj)
    preorder := make([]int, n) // 1-based discovery index; 0 = unvisited
    compID = make([]int, n)
    for i := range compID {
        compID[i] = -1
    }
    var S, B []int // S: vertices, B: boundary preorder indices
    counter := 1

    var dfs func(v int)
    dfs = func(v int) {
        preorder[v] = counter
        counter++
        S = append(S, v)
        B = append(B, preorder[v]) // open a new tentative boundary
        for _, w := range adj[v] {
            if preorder[w] == 0 {
                dfs(w)
            } else if compID[w] == -1 { // w still open: collapse boundaries
                for len(B) > 0 && B[len(B)-1] > preorder[w] {
                    B = B[:len(B)-1]
                }
            }
        }
        if B[len(B)-1] == preorder[v] { // v is an SCC root
            B = B[:len(B)-1]
            for {
                w := S[len(S)-1]
                S = S[:len(S)-1]
                compID[w] = numComps
                if w == v {
                    break
                }
            }
            numComps++
        }
    }
    for v := 0; v < n; v++ {
        if preorder[v] == 0 {
            dfs(v)
        }
    }
    return compID, numComps
}

func main() {
    adj := [][]int{{1}, {2}, {0, 3}, {4}, {5}, {3}}
    comp, k := gabowSCC(adj)
    fmt.Println("compID:", comp, "numComps:", k)
}

Java

import java.util.*;

public class GabowSCC {
    static List<List<Integer>> adj;
    static int[] preorder, compID;
    static Deque<Integer> S = new ArrayDeque<>();   // vertices
    static Deque<Integer> B = new ArrayDeque<>();    // boundary preorders
    static int counter = 1, numComps = 0;

    static void dfs(int v) {
        preorder[v] = counter++;
        S.push(v);
        B.push(preorder[v]);
        for (int w : adj.get(v)) {
            if (preorder[w] == 0) dfs(w);
            else if (compID[w] == -1)             // w open: collapse boundaries
                while (B.peek() > preorder[w]) B.pop();
        }
        if (B.peek() == preorder[v]) {            // root
            B.pop();
            int w;
            do { w = S.pop(); compID[w] = numComps; } while (w != v);
            numComps++;
        }
    }

    public static int[] run(List<List<Integer>> g) {
        adj = g;
        int n = g.size();
        preorder = new int[n];
        compID = new int[n];
        Arrays.fill(compID, -1);
        for (int v = 0; v < n; v++) if (preorder[v] == 0) dfs(v);
        return compID;
    }

    public static void main(String[] args) {
        List<List<Integer>> adj = List.of(
            List.of(1), List.of(2), List.of(0, 3),
            List.of(4), List.of(5), List.of(3));
        System.out.println(Arrays.toString(run(adj)) + " comps=" + numComps);
    }
}

Python

import sys


def gabow_scc(adj):
    sys.setrecursionlimit(1 << 25)
    n = len(adj)
    preorder = [0] * n          # 1-based; 0 means unvisited
    comp_id = [-1] * n
    S, B = [], []               # S: vertices, B: boundary preorder indices
    counter = [1]
    num_comps = [0]

    def dfs(v):
        preorder[v] = counter[0]
        counter[0] += 1
        S.append(v)
        B.append(preorder[v])
        for w in adj[v]:
            if preorder[w] == 0:
                dfs(w)
            elif comp_id[w] == -1:          # w open: collapse tentative boundaries
                while B and B[-1] > preorder[w]:
                    B.pop()
        if B[-1] == preorder[v]:            # v is a root
            B.pop()
            while True:
                w = S.pop()
                comp_id[w] = num_comps[0]
                if w == v:
                    break
            num_comps[0] += 1

    for v in range(n):
        if preorder[v] == 0:
            dfs(v)
    return comp_id, num_comps[0]


if __name__ == "__main__":
    adj = [[1], [2], [0, 3], [4], [5], [3]]
    print(gabow_scc(adj))   # ([1, 1, 1, 0, 0, 0], 2)

Gabow on the trace graph produces the identical partition {0,1,2} (id 1) and {3,4,5} (id 0) as the disc/low trace in §4 — but without ever materializing a low[] array. The boundary stack B ends each root-burst by popping a single boundary entry, mirroring the moment low[v] == disc[v] fired for Tarjan.

11.2 Iterative Tarjan (overflow-proof, single pass)

The recursion in §11.1 risks native-stack overflow at depth Θ(V) (the giant-SCC regime of §8). Below is an explicit-stack Tarjan; each frame stores the next neighbor index to resume from.

Go

func tarjanIter(adj [][]int) (compID []int, numComps int) {
    n := len(adj)
    disc := make([]int, n)
    low := make([]int, n)
    onStack := make([]bool, n)
    compID = make([]int, n)
    for i := range disc {
        disc[i], compID[i] = -1, -1
    }
    timer := 0
    var stk []int
    type frame struct{ v, i int }
    for s := 0; s < n; s++ {
        if disc[s] != -1 {
            continue
        }
        call := []frame{{s, 0}}
        for len(call) > 0 {
            f := &call[len(call)-1]
            v := f.v
            if f.i == 0 {
                disc[v], low[v] = timer, timer
                timer++
                stk = append(stk, v)
                onStack[v] = true
            }
            descend := false
            for f.i < len(adj[v]) {
                w := adj[v][f.i]
                f.i++
                if disc[w] == -1 {
                    call = append(call, frame{w, 0})
                    descend = true
                    break
                } else if onStack[w] && disc[w] < low[v] {
                    low[v] = disc[w]
                }
            }
            if descend {
                continue
            }
            if low[v] == disc[v] {
                for {
                    w := stk[len(stk)-1]
                    stk = stk[:len(stk)-1]
                    onStack[w] = false
                    compID[w] = numComps
                    if w == v {
                        break
                    }
                }
                numComps++
            }
            call = call[:len(call)-1]
            if len(call) > 0 {
                if p := &call[len(call)-1]; low[v] < low[p.v] {
                    low[p.v] = low[v]
                }
            }
        }
    }
    return compID, numComps
}

Java

static int[] tarjanIter(List<List<Integer>> adj, int[] out) {
    int n = adj.size();
    int[] disc = new int[n], low = new int[n], comp = new int[n], idx = new int[n];
    boolean[] on = new boolean[n];
    Arrays.fill(disc, -1); Arrays.fill(comp, -1);
    int timer = 0, k = 0;
    Deque<Integer> stk = new ArrayDeque<>(), call = new ArrayDeque<>();
    for (int s = 0; s < n; s++) {
        if (disc[s] != -1) continue;
        call.push(s);
        while (!call.isEmpty()) {
            int v = call.peek();
            if (idx[v] == 0) { disc[v] = low[v] = timer++; stk.push(v); on[v] = true; }
            boolean descend = false;
            while (idx[v] < adj.get(v).size()) {
                int w = adj.get(v).get(idx[v]++);
                if (disc[w] == -1) { call.push(w); descend = true; break; }
                else if (on[w]) low[v] = Math.min(low[v], disc[w]);
            }
            if (descend) continue;
            if (low[v] == disc[v]) {
                int w; do { w = stk.pop(); on[w] = false; comp[w] = k; } while (w != v);
                k++;
            }
            call.pop();
            if (!call.isEmpty()) { int p = call.peek(); low[p] = Math.min(low[p], low[v]); }
        }
    }
    out[0] = k;
    return comp;
}

Python

def tarjan_iter(adj):
    n = len(adj)
    disc = [-1] * n
    low = [0] * n
    on = [False] * n
    comp = [-1] * n
    timer = k = 0
    stk = []
    for s in range(n):
        if disc[s] != -1:
            continue
        call = [(s, 0)]
        while call:
            v, i = call[-1]
            if i == 0:
                disc[v] = low[v] = timer
                timer += 1
                stk.append(v)
                on[v] = True
            descend = False
            while i < len(adj[v]):
                w = adj[v][i]
                i += 1
                if disc[w] == -1:
                    call[-1] = (v, i)
                    call.append((w, 0))
                    descend = True
                    break
                elif on[w]:
                    low[v] = min(low[v], disc[w])
            if descend:
                continue
            call[-1] = (v, i)
            if low[v] == disc[v]:
                while True:
                    w = stk.pop()
                    on[w] = False
                    comp[w] = k
                    if w == v:
                        break
                k += 1
            call.pop()
            if call:
                p = call[-1][0]
                low[p] = min(low[p], low[v])
    return comp, k

11.3 Condensation builder (deduplicated cross-edges)

Given compID from any SCC routine, contract each component and emit the simple DAG. Because Tarjan/Gabow number components in reverse topological order, the resulting adjacency only ever points from higher ids to lower ids — a free topological order (§12).

Go

func condense(adj [][]int, compID []int, k int) [][]int {
    dag := make([][]int, k)
    seen := make(map[[2]int]bool)
    for u := range adj {
        for _, v := range adj[u] {
            a, b := compID[u], compID[v]
            if a != b && !seen[[2]int{a, b}] {
                seen[[2]int{a, b}] = true
                dag[a] = append(dag[a], b)
            }
        }
    }
    return dag
}

Java

static List<List<Integer>> condense(List<List<Integer>> adj, int[] comp, int k) {
    List<List<Integer>> dag = new ArrayList<>();
    for (int i = 0; i < k; i++) dag.add(new ArrayList<>());
    Set<Long> seen = new HashSet<>();
    for (int u = 0; u < adj.size(); u++)
        for (int v : adj.get(u)) {
            int a = comp[u], b = comp[v];
            if (a != b && seen.add((long) a << 32 | (b & 0xffffffffL))) dag.get(a).add(b);
        }
    return dag;
}

Python

def condense(adj, comp, k):
    dag = [set() for _ in range(k)]
    for u in range(len(adj)):
        for v in adj[u]:
            if comp[u] != comp[v]:
                dag[comp[u]].add(comp[v])
    return [sorted(s) for s in dag]

12. Condensation-DAG Properties

Let G^{SCC} be the condensation (Def. 1.5), with k nodes and m' deduplicated edges.

Property 12.1 (Acyclic). G^{SCC} is a DAG (Prop. 1.6). Therefore it has a topological order, a well-defined set of sources (in-degree 0) and sinks (out-degree 0), and supports linear-time DAG-DP.

Property 12.2 (Tarjan/Gabow id = reverse topological order). If components are numbered in the order their roots finish, then every condensation edge [u] → [w] satisfies id([u]) > id([w]). Proof. An edge [u] → [w] comes from some x → y with x ∈ [u], y ∈ [w], [u] ≠ [w]. Since [w] is reachable from [u] but not vice versa (acyclicity), DFS finishes the root of [w] before the root of [u] (the sink-side component closes first), so it receives a smaller id. ∎ Hence iterating components in increasing id is a topological order on the reverse condensation, and decreasing id is a topological order on G^{SCC} itself — no separate topological sort needed.

Property 12.3 (Size bounds). 1 ≤ k ≤ |V|; k = |V| iff G is a DAG (every SCC trivial); k = 1 iff G is strongly connected. The condensation has m' ≤ |E| edges and m' ≤ k(k−1)/2 after dedup.

Property 12.4 (Strong-connectivity augmentation). For a condensation with k ≥ 2, let s = number of source nodes and t = number of sink nodes. The minimum number of edges to add to G to make it strongly connected is max(s, t) (Eswaran–Tarjan 1976). For k = 1 the answer is 0. This is a direct, classic application of the condensation structure.

Property 12.5 (Unique-sink reachability). A vertex r can reach all of G iff [r] is the unique source of G^{SCC}; symmetrically all of G can reach r iff [r] is the unique sink. This drives "is there a universal vertex / mother vertex" queries in Θ(V+E).

Condensation feature Graph-level meaning Linear-time use
node one maximal mutual-reachability class quotient of G
edge [u]→[w] some original edge crosses [u] into [w] DAG-DP transition
source (in-deg 0) SCC reached by nothing else candidate "mother" SCC
sink (out-deg 0) SCC reaching nothing else absorbing set / deadlock terminus
longest path longest dependency chain DAG longest-path DP
max(sources, sinks) augmentation deficiency Eswaran–Tarjan strong-connectivity

13. Open Problems

  1. Work-optimal parallel SCC. No algorithm achieves O(V+E) work with polylog(V) span. FB and coloring pay extra log or worse factors. Closing this gap (or proving it cannot be closed) is open and tied to whether reachability/DFS has an efficient parallel solution.

  2. Fully dynamic SCC. Maintaining SCCs under both edge insertions and deletions with o(m) amortized update time is open in general. Incremental-only (insertions) admits O(m·α)-style total bounds (Bender–Fineman–Gilbert–Tarjan for topological order), and decremental SCC has near-linear total-time results (Łącki; Roditty–Zwick), but a single algorithm fast for arbitrary updates remains elusive.

  3. Cache-oblivious / external-memory SCC. DFS is hard to make I/O-efficient; whether SCC admits O(sort(V+E)) I/Os (matching BFS-style bounds) in the cache-oblivious model is not fully resolved for general digraphs.

  4. Sublinear-space SCC. Reachability in O(log² n) space (Savitch) and the L = NL question bound how little working memory SCC could use; deterministic near-log-space SCC with near-linear time is open.

  5. Streaming SCC. In the semi-streaming model (O(V·polylog V) space, few passes) the achievable pass/space trade-off for exact SCC is not tight.


14. Summary

  • SCCs are the equivalence classes of mutual reachability ; they partition V uniquely and the condensation is always a DAG (Prop. 1.6).
  • Correctness rests on two facts: low[u] computed by rules (R1)–(R3) equals the lowlink of Definition 2.1 (Lemma 2.2), and low[u] == disc[u] holds iff u is its SCC's root (Theorem 2.5), via the white-path subtree-containment lemma. Popping at a root yields exactly the component (Theorem 2.6). The rule (R2) must use disc[w], not low[w], to count exactly one back edge and avoid merging components.
  • Complexity is Θ(V+E) time, Θ(V) space: each vertex pushed/popped once, each edge relaxed once, no logarithmic factor (Theorem 3.1).
  • Versus Kosaraju (2 passes + transpose, Θ(V+E) space) and Gabow (1 pass, two stacks, no low[]): all linear; Tarjan and Gabow win on constants and memory; Kosaraju wins on conceptual simplicity.
  • Parallel SCC abandons DFS for Forward–Backward / coloring, paying super-linear work for parallel span because DFS is P-complete.
  • Random digraphs exhibit a giant-SCC threshold at np = 1, explaining why trimming trivial SCCs is so effective and why recursion depth can reach Θ(V) — mandating the iterative form in practice.

Tarjan (1972) introduced the lowlink single-pass method; Kosaraju (unpublished, via Aho–Hopcroft–Ullman) and Sharir (1981) gave the two-pass version; Gabow (2000) and Cheriyan–Mehlhorn (1996) gave the path-based single-pass variant; Fleischer–Hendrickson–Pınar (2000) opened the parallel line. The algorithm is over fifty years old, runs in optimal linear time, and remains the textbook standard for decomposing directed graphs.