Radix Sort — Junior Level¶

Table of Contents¶

Introduction
Prerequisites
Glossary
Core Concepts
Big-O Summary
Real-World Analogies
Pros & Cons
Step-by-Step Walkthrough
Code Examples
Coding Patterns
Error Handling
Performance Tips
Best Practices
Edge Cases & Pitfalls
Common Mistakes
Cheat Sheet
Visual Animation
Summary
Further Reading

Introduction¶

Focus: "What is Radix Sort?" and "How can a sort beat O(n log n)?"

Radix Sort is a non-comparison sort that orders numbers (or fixed-length strings) by looking at one digit at a time. Instead of comparing two values like every other sort you've seen so far — Bubble, Insertion, Selection, Merge, Quick, Heap — Radix Sort treats each key as a sequence of digits and uses the digits as bucket addresses. There is no if (a < b) line anywhere in the inner loop.

This single change has a surprising consequence: Radix Sort runs in O(d · (n + k)) time, where d is the number of digits and k is the radix (the base — usually 10 or 256). When d is small and fixed (say, sorting 32-bit integers as 4 byte-digits), the algorithm is effectively O(n). It beats the Ω(n log n) lower bound that applies to all comparison-based sorts, because that lower bound only applies when the algorithm's only operation is "compare two keys."

There are two main variants:

LSD (Least Significant Digit) Radix Sort — process digits right to left, using a stable inner sort (almost always Counting Sort) for each digit pass. After all d passes, the array is sorted. This is the version you'll use 99% of the time for fixed-width keys.
MSD (Most Significant Digit) Radix Sort — process digits left to right, recursively partitioning the input. Useful for variable-length keys like strings, and structurally similar to Quick Sort (you're partitioning, not merging).

Radix Sort isn't magic — it trades comparison cost for digit extraction cost and bucket memory. But when the data is right (fixed-width integers, IP addresses, fixed-format strings, dates), it is by far the fastest sort in practice. GPU sorting libraries (cub::DeviceRadixSort, Thrust), database batch sorts, and high-performance distributed sorters (TeraSort) all rely on radix variants.

Prerequisites¶

Required: Arrays/slices/lists — indexing, iteration, length
Required: Integer arithmetic — division, modulo, bit-shifts
Required: Stability of a sort — what "stable" means and why it matters
Required: 07-counting-sort/junior.md — Radix Sort uses Counting Sort as its inner subroutine
Helpful: Bases and place-value notation (base 10 vs base 2 vs base 256)
Helpful: Familiarity with one comparison sort (Merge Sort or Insertion Sort) so you can contrast

Glossary¶

Term	Definition
Radix	Another word for "base." Base 10 has radix 10; base 256 has radix 256
Digit	One position in a number's representation. `123` in base 10 has digits `1, 2, 3`
LSD	Least Significant Digit — the rightmost digit (the ones place in base 10)
MSD	Most Significant Digit — the leftmost digit (highest place value)
Pass	One full scan of the array sorting by one digit position
Bucket	A holding area for elements that share the same digit value at the current position
Stable inner sort	A sort that preserves relative order of equal keys — required so earlier passes' ordering is not destroyed
Counting Sort	The standard stable O(n + k) inner sort used by Radix Sort
Non-comparison sort	Sort that does not use `<` or `>` on whole keys — accesses digit positions directly
Comparison lower bound	The theoretical Ω(n log n) bound that comparison sorts cannot beat. Radix Sort is not subject to this
d	Number of digits in the largest key (in the chosen radix)
k	The radix itself — how many distinct digit values exist (10, 256, etc.)
n	The number of elements being sorted
Byte-radix	Using radix 256 so each digit is exactly one byte — usually the fastest choice
In-place	Sort that uses O(1) extra memory. Standard Radix Sort is NOT in-place

Core Concepts¶

Concept 1: Sort by One Digit at a Time¶

The core trick of LSD Radix Sort is breathtakingly simple. Given numbers like [170, 45, 75, 90, 802, 24, 2, 66]:

Pass 1 — sort by ones digit: [170, 90, 802, 2, 24, 45, 75, 66]
Pass 2 — sort by tens digit: [802, 2, 24, 45, 66, 170, 75, 90] — wait that looks wrong. Let me redo.

Let me redo it carefully on [170, 45, 75, 90, 802, 24, 2, 66]:

Pass	Sort by	Result
1	ones (last digit)	`[170, 90, 802, 2, 24, 45, 75, 66]`
2	tens (middle digit)	`[802, 2, 24, 45, 66, 170, 75, 90]`
3	hundreds (first digit)	`[2, 24, 45, 66, 75, 90, 170, 802]` ✓

Why does this work? Because each pass must use a stable sort. When pass 2 sees two numbers with the same tens digit (e.g., 802 and 2 both have a tens digit of 0), it preserves the order pass 1 gave them — which is the correct order by ones digit. By the time the final (most significant) pass runs, lower digits already encode their relative tie-breaks.

Concept 2: Why Stability Is Non-Negotiable¶

If the inner sort is unstable, Radix Sort produces wrong answers. Suppose pass 1 sorted [12, 22] to [12, 22] (both end in 2, original order preserved). Pass 2 sorts by tens digit. Both have tens digit equal: 1 vs 2. A stable sort keeps 12 before 22 — correct. An unstable sort could swap them, giving [22, 12] — which is wrong overall.

So Radix Sort is a stable sort using a stable inner sort — a property that follows directly from the proof by induction on passes.

Concept 3: Counting Sort Is the Inner Engine¶

Almost every Radix Sort implementation uses Counting Sort as its per-digit subroutine. Counting Sort is naturally stable when written correctly (iterate right-to-left in the placement loop), runs in O(n + k) time, and works perfectly when keys are bounded (which they are — each digit is bounded by the radix).

See 07-counting-sort/junior.md for the full algorithm. In this file we just call it as a black box that "stably sorts an array by a key function."

Concept 4: Choosing the Radix Is a Trade-off¶

A key has total bit-width w (e.g., 32 bits for int32). Suppose we pick a radix of k digit values. Then the number of digits is:

d = ceil(w / log₂(k))

Radix k	log₂ k	d for 32-bit ints	Buckets per pass
2	1	32 passes	2
10	~3.3	10 passes (decimal)	10
16	4	8 passes	16
256	8	4 passes	256
65536	16	2 passes	65536

The total cost is O(d · (n + k)). As k grows, d shrinks but k itself grows. The sweet spot is around k = 256 (one byte per pass) for general data: 4 passes for 32-bit integers, 8 passes for 64-bit integers, and bucket arrays still fit comfortably in cache.

Concept 5: LSD vs MSD — Two Directions¶

Variant	Direction	Stable inner sort?	Best for
LSD	Right to left (ones → tens → hundreds)	Required	Fixed-width keys, integers
MSD	Left to right (highest → lowest)	Recursive partition	Variable-length keys, strings

LSD is iterative: d passes, each touching all n elements. MSD is recursive: it partitions into k buckets at the most-significant level, then recurses into each bucket on the next-lower digit — like Quick Sort but with k-way partition instead of 2-way.

MSD has the nice property of early termination: if a bucket only has one element (or zero), no further recursion is needed. This makes it ideal for variable-length string keys.

Concept 6: Space — Where Does the Memory Go?¶

Radix Sort needs: - O(n) for the output buffer (you can't sort in place during the bucket-place step). - O(k) for the counting array (one slot per digit value).

Total: O(n + k). For typical k = 256, the extra space beyond the output buffer is trivial. The output buffer dominates and is the same n cost you pay for Merge Sort.

Concept 7: When Radix Sort Is Faster Than Comparison Sorts¶

Radix Sort wins when: - Keys are fixed-width (32/64-bit ints, fixed-length strings, IPv4 addresses, dates). - d is small (e.g., 4-8 passes). - The radix k is small enough to fit count tables in L1/L2 cache. - Memory bandwidth (not CPU) is your bottleneck — radix sorts are linear scans that prefetch well.

For 1 million random 32-bit integers, byte-radix LSD typically beats std::sort (introsort) by 2-3×. For sorted-ish data, TimSort or Pdqsort can pull ahead. Always benchmark.

Big-O Summary¶

Operation / Case	Complexity	Notes
Time — Best	O(d · (n + k))	No best-case shortcut. Same for all inputs
Time — Average	O(d · (n + k))
Time — Worst	O(d · (n + k))
Time — fixed-width ints, byte radix	O(n) effectively (d, k constants)	4 passes for 32-bit, 8 for 64-bit
Space — Auxiliary	O(n + k)	Output buffer + count table
Stable (LSD)	Yes	Requires stable inner sort
Stable (MSD)	Yes if written carefully	Some implementations are unstable
In-place	No	Standard versions allocate an output buffer
Adaptive	No	Same time on any input shape
Comparison-based?	No	Beats the Ω(n log n) lower bound
Best for	Fixed-width integers, IPs, dates, fixed strings

Real-World Analogies¶

Concept	Analogy
LSD pass by digit	Sorting a deck of cards first by suit, then by rank — but only if the rank sort is stable
Buckets	Post office sorting mail: 10 bins by last-digit of ZIP, then 10 bins by second-to-last, etc.
MSD recursion	Library card catalog: first cabinet by first letter (A-Z), then within "S" drawer by second letter
Stability requirement	Two students with the same exam score keep their original alphabetical order between passes
Radix choice	Choosing a base — bigger base means fewer passes but more bins per pass
Beating the comparison bound	A judge skimming case numbers digit by digit, not comparing whole numbers — fewer "thinks" per case

Where the analogy breaks: real postal sorters don't need O(n) extra bins because they have physical pigeonholes; in software we pay the buffer memory.

Pros & Cons¶

Pros	Cons
Linear-time for fixed-width keys — beats O(n log n)	Requires fixed-width (or bounded-width) keys
Stable (LSD form)	Not in-place — needs O(n + k) extra memory
Excellent for integers, IPs, dates, fixed strings	Comparison sorts beat it on variable, complex types
Cache-friendly linear scans, low branching	Bucket scattering can cause cache misses if k is large
Trivially parallelizable (MSD partitions independently)	Implementing for floats/negatives needs care
Dominant on GPU workloads (Thrust, CUB)	Code is non-trivial — easy to get wrong on first try

When to use: - Sorting millions of fixed-width integers, especially in batch ETL pipelines. - IPv4 address sorting, packet processing, network analytics. - Building suffix arrays for string algorithms (see 17-string-algorithms/04-suffix-arrays/ — once that section exists). - GPU and vectorized SIMD sorts — radix dominates here. - Sorting fixed-length keys after hashing.

When NOT to use: - Sorting complex objects via a custom comparator (use Merge/Tim Sort). - Very small arrays (n < 100) — the constants in d × (n + k) lose to Insertion Sort. - Variable-length data with high digit-count variance (use Quick Sort). - Floating-point sorting unless you handle the IEEE-754 sign/exponent bits carefully. - Memory-constrained environments where O(n) auxiliary is too much.

Step-by-Step Walkthrough¶

Sorting [170, 45, 75, 90, 802, 24, 2, 66] with LSD Radix Sort, base 10 (3 digits each, leading zeros conceptually):

Input:        [170, 045, 075, 090, 802, 024, 002, 066]

Pass 1 — sort by ones digit (rightmost):

Value	Ones digit
170	0
45	5
75	5
90	0
802	2
24	4
2	2
66	6

Bucket 0: [170, 90] Bucket 2: [802, 2] Bucket 4: [24] Bucket 5: [45, 75] Bucket 6: [66]

Collect: [170, 90, 802, 2, 24, 45, 75, 66]

Pass 2 — sort by tens digit:

Value	Tens digit
170	7
90	9
802	0
2	0
24	2
45	4
75	7
66	6

Bucket 0: [802, 2] Bucket 2: [24] Bucket 4: [45] Bucket 6: [66] Bucket 7: [170, 75] Bucket 9: [90]

Collect: [802, 2, 24, 45, 66, 170, 75, 90]

Pass 3 — sort by hundreds digit:

Value	Hundreds digit
802	8
2	0
24	0
45	0
66	0
170	1
75	0
90	0

Bucket 0: [2, 24, 45, 66, 75, 90] Bucket 1: [170] Bucket 8: [802]

Collect: [2, 24, 45, 66, 75, 90, 170, 802] ✓

Three passes. n = 8 elements touched per pass. Total work ≈ 3 × (8 + 10) = 54 operations, compared to Merge Sort's ~24 comparisons + 24 moves = ~48. Radix Sort wins as n grows because passes stay constant while comparison sorts scale with log n.

Code Examples¶

Example 1: LSD Radix Sort for Non-Negative Integers (Base 10)¶

Go¶

package main

import "fmt"

// LSDRadixSort sorts non-negative ints ascending using base-10 LSD radix.
// Time: O(d · (n + k)) where d = digits in max, k = 10. Space: O(n + k).
func LSDRadixSort(arr []int) {
    if len(arr) <= 1 {
        return
    }
    maxVal := arr[0]
    for _, v := range arr {
        if v > maxVal {
            maxVal = v
        }
    }
    for exp := 1; maxVal/exp > 0; exp *= 10 {
        countingSortByDigit(arr, exp)
    }
}

// countingSortByDigit stably sorts arr by the digit at position `exp`.
// exp = 1 → ones, exp = 10 → tens, exp = 100 → hundreds, ...
func countingSortByDigit(arr []int, exp int) {
    n := len(arr)
    output := make([]int, n)
    count := make([]int, 10)

    // Count occurrences of each digit
    for _, v := range arr {
        digit := (v / exp) % 10
        count[digit]++
    }
    // Prefix sums → end positions for each digit bucket
    for i := 1; i < 10; i++ {
        count[i] += count[i-1]
    }
    // Place stably: walk right-to-left so equal-digit items keep input order
    for i := n - 1; i >= 0; i-- {
        digit := (arr[i] / exp) % 10
        count[digit]--
        output[count[digit]] = arr[i]
    }
    copy(arr, output)
}

func main() {
    data := []int{170, 45, 75, 90, 802, 24, 2, 66}
    LSDRadixSort(data)
    fmt.Println(data) // [2 24 45 66 75 90 170 802]
}

Java¶

import java.util.Arrays;

public class LSDRadixSort {
    public static void sort(int[] arr) {
        if (arr.length <= 1) return;
        int max = arr[0];
        for (int v : arr) if (v > max) max = v;
        for (int exp = 1; max / exp > 0; exp *= 10) {
            countingSortByDigit(arr, exp);
        }
    }

    private static void countingSortByDigit(int[] arr, int exp) {
        int n = arr.length;
        int[] output = new int[n];
        int[] count = new int[10];
        for (int v : arr) count[(v / exp) % 10]++;
        for (int i = 1; i < 10; i++) count[i] += count[i - 1];
        for (int i = n - 1; i >= 0; i--) {
            int digit = (arr[i] / exp) % 10;
            output[--count[digit]] = arr[i];
        }
        System.arraycopy(output, 0, arr, 0, n);
    }

    public static void main(String[] args) {
        int[] data = {170, 45, 75, 90, 802, 24, 2, 66};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}

Python¶

def lsd_radix_sort(arr):
    """Sort non-negative ints ascending. Base 10. Stable."""
    if len(arr) <= 1:
        return
    max_val = max(arr)
    exp = 1
    while max_val // exp > 0:
        _counting_sort_by_digit(arr, exp)
        exp *= 10

def _counting_sort_by_digit(arr, exp):
    n = len(arr)
    output = [0] * n
    count = [0] * 10
    for v in arr:
        count[(v // exp) % 10] += 1
    for i in range(1, 10):
        count[i] += count[i - 1]
    # walk right-to-left for stability
    for i in range(n - 1, -1, -1):
        digit = (arr[i] // exp) % 10
        count[digit] -= 1
        output[count[digit]] = arr[i]
    arr[:] = output

if __name__ == "__main__":
    data = [170, 45, 75, 90, 802, 24, 2, 66]
    lsd_radix_sort(data)
    print(data)  # [2, 24, 45, 66, 75, 90, 170, 802]

What it does: For each digit position from least significant up, stably bucket-sort the array by that digit. After the last (most significant) pass, the array is fully sorted. Run: go run main.go | javac LSDRadixSort.java && java LSDRadixSort | python radix.py

Example 2: LSD Radix Sort with Byte Radix (Base 256)¶

Go¶

package main

import "fmt"

// LSDByteRadix sorts uint32 values using 4 passes of base-256 radix.
// Faster than base-10 in practice because the count table fits in cache
// and digit extraction is a fast bit-shift + mask.
func LSDByteRadix(arr []uint32) {
    n := len(arr)
    if n <= 1 {
        return
    }
    buf := make([]uint32, n)
    for shift := uint(0); shift < 32; shift += 8 {
        var count [257]int // [0..256], extra slot for prefix-sum convenience
        for _, v := range arr {
            count[((v>>shift)&0xFF)+1]++
        }
        for i := 1; i < 257; i++ {
            count[i] += count[i-1]
        }
        for _, v := range arr {
            digit := (v >> shift) & 0xFF
            buf[count[digit]] = v
            count[digit]++
        }
        arr, buf = buf, arr
    }
}

func main() {
    data := []uint32{170, 45, 75, 90, 802, 24, 2, 66, 1<<20, 1<<31}
    LSDByteRadix(data)
    fmt.Println(data)
}

Java¶

import java.util.Arrays;

public class LSDByteRadix {
    public static void sort(int[] arr) {
        int n = arr.length;
        if (n <= 1) return;
        int[] buf = new int[n];
        for (int shift = 0; shift < 32; shift += 8) {
            int[] count = new int[257];
            for (int v : arr) count[((v >>> shift) & 0xFF) + 1]++;
            for (int i = 1; i < 257; i++) count[i] += count[i - 1];
            for (int v : arr) {
                int digit = (v >>> shift) & 0xFF;
                buf[count[digit]++] = v;
            }
            int[] tmp = arr; arr = buf; buf = tmp;
        }
        // Note: this code path assumes non-negative ints; for full signed
        // support, see middle.md for a sign-bit fix.
    }

    public static void main(String[] args) {
        int[] data = {170, 45, 75, 90, 802, 24, 2, 66, 1 << 20};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}

Python¶

def lsd_byte_radix(arr):
    """Sort uint32-like ints using 4 passes of base-256 radix."""
    n = len(arr)
    if n <= 1:
        return
    buf = [0] * n
    for shift in (0, 8, 16, 24):
        count = [0] * 257
        for v in arr:
            count[((v >> shift) & 0xFF) + 1] += 1
        for i in range(1, 257):
            count[i] += count[i - 1]
        for v in arr:
            digit = (v >> shift) & 0xFF
            buf[count[digit]] = v
            count[digit] += 1
        arr, buf = buf, arr
    # If we swapped an odd number of times, copy back. With 4 passes it ends in place.

if __name__ == "__main__":
    data = [170, 45, 75, 90, 802, 24, 2, 66, 1 << 20]
    lsd_byte_radix(data)
    print(data)

What it does: Replaces the 10-bucket decimal version with 256 buckets (one byte). For 32-bit ints, this is exactly 4 passes — fast and cache-friendly.

Example 3: MSD Radix Sort for Strings¶

Python¶

def msd_radix_sort(strings):
    """Sort fixed/variable-length strings using MSD radix recursion."""
    if len(strings) <= 1:
        return strings[:]
    output = [None] * len(strings)
    _msd(strings, 0, len(strings), 0, output)
    return strings

def _msd(arr, lo, hi, d, aux):
    if hi - lo <= 1:
        return
    R = 256  # ASCII range
    count = [0] * (R + 2)
    for i in range(lo, hi):
        c = ord(arr[i][d]) + 1 if d < len(arr[i]) else 0
        count[c + 1] += 1
    for r in range(R + 1):
        count[r + 1] += count[r]
    for i in range(lo, hi):
        c = ord(arr[i][d]) + 1 if d < len(arr[i]) else 0
        aux[lo + count[c]] = arr[i]
        count[c] += 1
    for i in range(lo, hi):
        arr[i] = aux[i]
    # recurse on each non-empty subrange (skip bucket 0 = "end of string")
    for r in range(R):
        start = lo + count[r]
        end = lo + count[r + 1]
        if end - start > 1:
            _msd(arr, start, end, d + 1, aux)

if __name__ == "__main__":
    words = ["she", "sells", "seashells", "by", "the", "seashore"]
    msd_radix_sort(words)
    print(words)

What it does: MSD recursively partitions strings by the d-th character. Each recursion shrinks the active range and advances the digit position. Bucket 0 is reserved for "string shorter than d" (so shorter strings come before their prefixes).

Coding Patterns¶

Pattern 1: Stable Inner Sort by Key Function¶

Intent: Treat each radix pass as "stably sort by key(x)." Reusable for any radix variant.

Go¶

// stableSortByKey: rearrange arr stably so that key(x) values are non-decreasing.
// key must return a value in [0, k).
func stableSortByKey(arr []int, key func(int) int, k int) {
    n := len(arr)
    out := make([]int, n)
    count := make([]int, k)
    for _, v := range arr { count[key(v)]++ }
    for i := 1; i < k; i++ { count[i] += count[i-1] }
    for i := n - 1; i >= 0; i-- {
        kx := key(arr[i])
        count[kx]--
        out[count[kx]] = arr[i]
    }
    copy(arr, out)
}

Java¶

public static void stableSortByKey(int[] arr, java.util.function.IntUnaryOperator key, int k) {
    int n = arr.length;
    int[] out = new int[n];
    int[] count = new int[k];
    for (int v : arr) count[key.applyAsInt(v)]++;
    for (int i = 1; i < k; i++) count[i] += count[i - 1];
    for (int i = n - 1; i >= 0; i--) {
        int kx = key.applyAsInt(arr[i]);
        out[--count[kx]] = arr[i];
    }
    System.arraycopy(out, 0, arr, 0, n);
}

Python¶

def stable_sort_by_key(arr, key, k):
    """Stable bucket-sort by key in [0, k)."""
    n = len(arr)
    out = [0] * n
    count = [0] * k
    for v in arr:
        count[key(v)] += 1
    for i in range(1, k):
        count[i] += count[i - 1]
    for i in range(n - 1, -1, -1):
        kx = key(arr[i])
        count[kx] -= 1
        out[count[kx]] = arr[i]
    arr[:] = out

graph TD A[Input array] --> B[Pass 1: key = ones digit] B --> C[Pass 2: key = tens digit] C --> D[Pass 3: key = hundreds digit] D --> E[Sorted array]

Pattern 2: Choose Radix from Key Width¶

Intent: Compute the digit count once.

def digit_count(max_val, radix):
    if max_val == 0:
        return 1
    d = 0
    while max_val > 0:
        d += 1
        max_val //= radix
    return d

For max_val = 4_294_967_295 and radix = 256: d = 4.

Pattern 3: Double Buffer Swap¶

Intent: Avoid copy(arr, out) on every pass — keep two buffers and swap references.

def lsd_double_buffer(arr, passes, key_fn, k):
    a, b = arr, [0] * len(arr)
    for p in range(passes):
        # stable sort from a → b using key_fn(p, x)
        # then swap a and b
        a, b = b, a
    # if `passes` is odd, copy b back to original

Error Handling¶

Error	Cause	Fix
Wrong order with negative numbers	Two's complement makes `-1` have the largest unsigned bit pattern	Flip the sign bit before sorting, flip back after
Empty array crash	`max(arr)` undefined on empty input	Add `if len(arr) == 0: return` early
Floats sorted wrong	IEEE-754 sign bit makes negatives look "larger" than positives byte-wise	Reinterpret bits: for negatives flip all bits; for non-negatives flip sign bit. Reverse after sort
String sort puts longer strings before shorter ones	Forgot the "end of string" sentinel bucket	Reserve bucket 0 for "no more characters"
Counting array index out of bounds	Digit extraction returned `radix` instead of `radix - 1`	Use `((v >> shift) & 0xFF)` and verify mask matches radix
Pass loses stability	Inner placement loop went left-to-right	Always walk right-to-left when using the "decrement count, place at count" idiom
Used `<` comparison somewhere	Hybrid mistake	LSD should have NO key comparison — only digit extraction

Performance Tips¶

Use byte radix (256) for integer keys — fast bit-shift + mask, count table fits in L1 cache.
Pre-allocate the auxiliary buffer once and reuse across passes. Don't allocate per pass.
Double-buffer ping-pong to skip the final copy(arr, out) per pass. Saves d linear passes worth of memory traffic.
Avoid count = make([]int, k) per pass on the heap — for byte radix, declare a stack-allocated [257]int array.
Skip a pass if all digits are 0 at that position (rare in random data; common when sorting small values).
Use Insertion Sort below a threshold — for sub-arrays of size ≤ 24, Insertion Sort wins. Hybrid MSD sorts always do this.
Process multi-byte digits if cache allows — radix 65536 needs a 256 KB count table; usually slower than 256 due to cache misses.

Best Practices¶

Document the key model. Are inputs non-negative int32? uint64? UTF-8 strings? Floats? Different models need different digit extraction.
Test stability explicitly. Sort pairs (key, original_index) and assert that ties preserve original index order.
Validate digit extraction. Print the first pass's output for a small input — check it's actually sorted by the chosen digit.
Prefer LSD for known-width keys. Simpler, iterative, easier to vectorize.
Prefer MSD for variable-length strings. Naturally handles "string ends here."
Don't write your own from scratch in production. Use language libraries or cub::DeviceRadixSort (GPU). Hand-roll only if you've profiled and need it.
Benchmark vs Pdqsort / TimSort on your actual data. Radix wins for many but not all integer workloads.

Edge Cases & Pitfalls¶

Empty array ([]) → return immediately. Don't try to compute max.
Single element ([42]) → return immediately. Even d=0 is fine.
All identical ([7, 7, 7]) → each pass is a no-op (all in bucket 7 or 0), but you still pay the d × n cost.
All zeros ([0, 0, 0]) → maxVal = 0, the maxVal / exp > 0 guard means zero passes — handle the 0-pass case carefully.
One huge value, rest small ([1, 2, 3, 10^9]) → d is dominated by the max → many wasted passes. Comparison sort wins here.
Negative numbers → standard LSD treats them as huge positives; either flip sign bit or partition negatives/positives first.
Floating-point → reinterpret bits as ints, transform: positives flip sign bit, negatives flip all bits. Reverse on output.
Variable-length strings in LSD → does not work cleanly; use MSD with end-of-string bucket.
Sorting stability across passes → if you copy from buffer to original incorrectly, you lose stability silently.

Common Mistakes¶

Forgetting that each pass must be stable. Using QuickSort or HeapSort as the inner sort destroys correctness.
Walking left-to-right in the placement loop. The "decrement count, place at index" idiom only stays stable when walking right-to-left.
Computing max once at the start but not for each pass. That's correct for d; what you mustn't do is recompute the max in the inner pass loop.
Allocating output and count arrays inside the inner pass. Pre-allocate once, reuse — saves significant time.
Using % 10 for a base-256 implementation. The digit extraction must match the radix.
Treating signed ints as unsigned. -1 as uint32 is 0xFFFFFFFF, the largest unsigned value, so negatives end up at the end of the sort.
Using LSD on variable-length strings without padding. Pad with zeros, or use MSD.
Choosing k = n to get "one pass." Then the count array is O(n) and the count-array initialization itself is O(n) — no real win.
Forgetting the final pass for highest digit. Loop condition maxVal/exp > 0 is correct; maxVal/exp >= 0 would loop forever.

Cheat Sheet¶

LSD Radix Sort — Quick Reference

ALGORITHM:
    sort(arr):
        find max value
        for exp = 1, 10, 100, ... while max/exp > 0:
            stable_counting_sort_by_digit(arr, exp)
        # arr is now sorted

    stable_counting_sort_by_digit(arr, exp):
        count[0..k-1] = 0
        for v in arr: count[(v/exp) % k]++
        for i in 1..k-1: count[i] += count[i-1]
        for i from n-1 down to 0:    # right-to-left for stability
            d = (arr[i]/exp) % k
            count[d]--
            output[count[d]] = arr[i]
        arr := output

COMPLEXITY:
    Time:   O(d · (n + k))   — effectively O(n) for fixed-width keys
    Space:  O(n + k)
    Stable: YES (if inner sort is stable)
    In-place: NO

CHOOSING RADIX:
    base 10  → 10 buckets, slow, demonstrative only
    base 256 → 4 passes for u32, 8 for u64, cache-friendly, default for ints

WHEN TO USE:
    - Fixed-width integers (u32, u64)
    - IP addresses, dates
    - Fixed-length strings (with MSD for variable)
    - GPU sort, vectorized sort
    - You need stability AND O(n)

NEVER USE FOR:
    - Custom comparator on objects
    - Very small arrays (n < 100)
    - Variable-width data without preprocessing

Visual Animation¶

See animation.html for an interactive visual animation of LSD Radix Sort.

The animation demonstrates: - Each pass highlights the current digit position being sorted - Buckets (0-9 in base 10) shown below the array with elements falling in - Stable collection from buckets back into the array - Color-coded states: scanning (blue), bucketing (yellow), placing (green) - Live counters: pass number, current digit, total work - Speed control + single-step mode for studying stability

Summary¶

Radix Sort is the non-comparison sort of the sorting world. It sidesteps the Ω(n log n) lower bound by treating keys as sequences of digits and using a stable inner sort (Counting Sort) on each digit position. The two flavors — LSD (right-to-left, iterative) and MSD (left-to-right, recursive) — handle different input shapes: LSD for fixed-width integers, MSD for variable-length strings.

The time complexity is O(d · (n + k)), where d is the digit count and k is the radix. With byte radix (k = 256), 32-bit integers sort in 4 passes and 64-bit in 8 passes — effectively linear time. The cost: O(n + k) auxiliary memory, not in-place, and a careful implementation needed to handle negatives, floats, and variable-length keys.

Three takeaways: 1. The inner sort MUST be stable — Counting Sort is the canonical choice. Stability is what makes the LSD chain produce a correctly sorted output. 2. Byte radix (256) is almost always the right choice for integer keys — small enough to keep count tables in cache, large enough that 4-8 passes finish the job. 3. It's not a free lunch. Radix Sort needs fixed-width keys, extra memory, and careful handling for negatives and floats. For arbitrary comparable objects, stick with Merge/Tim Sort.

Next, study 07-counting-sort/ to understand the inner subroutine in depth. Then 09-bucket-sort/ to contrast the related (but distinct) bucket approach, and 11-tim-sort/ to see the dominant general-purpose comparison sort that radix competes with.

Radix Sort — Junior Level¶

Table of Contents¶

Introduction¶

Prerequisites¶

Glossary¶

Core Concepts¶

Concept 1: Sort by One Digit at a Time¶

Concept 2: Why Stability Is Non-Negotiable¶

Concept 3: Counting Sort Is the Inner Engine¶

Concept 4: Choosing the Radix Is a Trade-off¶

Concept 5: LSD vs MSD — Two Directions¶

Concept 6: Space — Where Does the Memory Go?¶

Concept 7: When Radix Sort Is Faster Than Comparison Sorts¶

Big-O Summary¶

Real-World Analogies¶

Pros & Cons¶

Step-by-Step Walkthrough¶

Code Examples¶

Example 1: LSD Radix Sort for Non-Negative Integers (Base 10)¶

Go¶

Java¶

Python¶

Example 2: LSD Radix Sort with Byte Radix (Base 256)¶

Go¶

Java¶

Python¶

Example 3: MSD Radix Sort for Strings¶

Python¶

Coding Patterns¶

Pattern 1: Stable Inner Sort by Key Function¶

Go¶

Java¶

Python¶

Pattern 2: Choose Radix from Key Width¶

Pattern 3: Double Buffer Swap¶

Error Handling¶

Performance Tips¶

Best Practices¶

Edge Cases & Pitfalls¶

Common Mistakes¶

Cheat Sheet¶

Visual Animation¶

Summary¶

Further Reading¶