Suffix Arrays — Prefix-Doubling Construction & the LCP Array

Sort all suffixes by their first 1, 2, 4, … characters using rank pairs (rank[i], rank[i+k]). Watch the order refine each round, then read off SA and the Kasai LCP array.

k = 1 step 0

Suffixes sorted by first 1 char(s)

current sort key compared prefix (2^t chars) header

Final Suffix Array & LCP

Press Step to begin building the suffix array by prefix doubling.
We sort suffixes by their first character, then double the compared length each round, sorting by the pair (rank of first k chars, rank of next k chars). When every rank is distinct, the order is the suffix array.

How prefix doubling works

Round k sorts suffixes by the pair (rank[i], rank[i+k]).
• The first component encodes the order by the first k characters.
• The second component encodes the next k characters (via suffix i+k's rank).
• Together the pair orders by the first 2k characters — in O(1) per comparison, because we compare ranks, never re-scan the string.
After each round, ranks are recomputed (equal pairs share a rank) and k doubles: 1 → 2 → 4 → … Once all ranks are distinct, the order is final. With a comparison sort this is O(n log² n); with a radix sort on the integer pairs, O(n log n).

What SA + LCP unlock

Pattern search: two binary searches bracket the contiguous occurrence range — O(m log n).
Distinct substrings: n(n+1)/2 − Σ LCP, in O(n) once SA and LCP exist.
Longest repeated substring: the maximum entry of the LCP array.
Longest common substring of two strings: build SA + LCP of A # B, take the max LCP over adjacent suffixes from opposite sides of the separator.
BWT / FM-index: BWT[i] = s[SA[i]-1], the spine of compressed full-text search.
rank columns rank pair (sort key) LCP column
Self-contained visualization. Construction is prefix doubling (O(n log² n) here; O(n log n) with a radix sort). The LCP array is filled by Kasai's O(n) algorithm. Distinct substrings = n(n+1)/2 − Σ LCP. Edit the string above (lowercase letters, up to 16 chars) and press Reset to rebuild. See junior.md, middle.md, and professional.md.