Garner's Algorithm (Mixed-Radix CRT Reconstruction) — Middle Level¶
Focus: The digit-by-digit solve with prefix-product inverses, recovering
xdirectly or onlyx mod M, doing arithmetic in a Residue Number System (RNS) and reconstructing at the end, and a side-by-side comparison with the classic CRT summation formula.
Table of Contents¶
- Introduction
- Deeper Concepts
- Comparison with Alternatives
- Advanced Patterns
- RNS Arithmetic
- Recovering x mod Another Modulus
- Code Examples
- Error Handling
- Performance Analysis
- Best Practices
- Visual Animation
- Summary
Introduction¶
At junior level the message was: Garner rebuilds x from its residues by solving mixed-radix digits one at a time. At middle level you start asking the engineering questions that decide whether your reconstruction is correct, fast, and memory-safe:
- What exactly is the inverse of the prefix product, and why is precomputing it the key optimization?
- How do I reconstruct only
x mod Mfor some target modulusM(e.g.10^9 + 7) without ever building a big integer? - What is a Residue Number System, and how does Garner sit at the boundary where RNS arithmetic ends and a real number must be recovered?
- How does Garner's mixed-radix approach compare, term by term, with the classic CRT formula of
05-crt— same answer, different cost profile? - Why is the digit solve inherently sequential, and does that matter for parallelism?
These separate "I can recite the mixed-radix formula" from "I can drop Garner into a multi-prime NTT pipeline (13-ntt) and reason about its cost."
Deeper Concepts¶
The mixed-radix solve, restated precisely¶
We seek digits a_0, a_1, …, a_{k-1} (0-indexed now) with 0 ≤ a_i < p_i such that
W_i is the prefix product: the weight of digit a_i. Reducing x mod p_i kills every term with index > i (each carries the factor p_i), and kills the future terms entirely, so
Solving for a_i:
The term (W_i)^{-1} is the modular inverse of the prefix product W_i modulo p_i. Because the p_j are pairwise coprime, W_i = ∏_{j<i} p_j is coprime to p_i, so this inverse always exists.
Two ways to evaluate the partial sum¶
The bracket Σ_{j<i} a_j·W_j (mod p_i) can be computed two ways:
- Re-derive per row (
O(i)work for digiti,O(k²)total): recompute the prefix product modp_iand the running partial from scratch for eachi. Simple, no shared state. - Coefficient form (also
O(k²)): maintain, for each modulusp_i, a running combination; this is the "Garner with precomputed inverse table" variant used in NTT libraries.
Both are O(k²); the second amortizes the inverses when the primes are fixed across many reconstructions.
Why prefix-product inverses are the crux¶
The single multiplication by (W_i)^{-1} mod p_i is what "divides out" the accumulated weight so a_i lands in [0, p_i). If you compute these inverses naively inside the loop, you pay O(log p_i) per digit (extended Euclid). If the primes are fixed (the NTT case), precompute the table invW[i] = (p_0·…·p_{i-1})^{-1} mod p_i once, and each reconstruction is pure multiply-add mod p_i.
Uniqueness ⇒ correctness¶
Because the mixed-radix representation with 0 ≤ a_i < p_i is unique (proof in professional.md), solving the digits greedily — each forced by the congruence mod p_i — yields the one true answer. There is no backtracking and no ambiguity. This uniqueness is the same fact that makes base-10 digits unique; Garner just generalizes the base.
Comparison with Alternatives¶
Garner (mixed-radix) vs the classic CRT formula¶
The classic CRT formula (sibling 05-crt) reconstructs x as a single weighted sum:
Garner instead builds the mixed-radix digits and sums Σ a_i W_i. Same x, different arithmetic:
| Aspect | Classic CRT formula | Garner (mixed-radix) |
|---|---|---|
| Big numbers in intermediate steps | Yes — each M_i = P/p_i is nearly as big as P. | No — digit solve is all mod single primes; big numbers only in final assembly. |
| Modular inverses needed | k inverses, each (M_i)^{-1} mod p_i. | k prefix-product inverses, each (W_i)^{-1} mod p_i. |
Final reduction mod P | Required (Σ r_i M_i y_i can exceed P). | Not required — the mixed-radix sum is already in [0, P). |
| Incremental (add a prime) | Recompute P, all M_i, all y_i. | Append one digit a_k; earlier digits unchanged. |
Reconstructing only x mod M | Must reduce big M_i mod M. | Reduce the small mixed-radix sum mod M. |
| Time | O(k²) (plus big-int costs for M_i). | O(k²), all small until final assembly. |
The practical verdict: Garner is preferred whenever you want the actual big integer or want to avoid forming P/p_i. The classic formula is conceptually simpler and fine when everything is reduced mod a small P that fits in a machine word.
Garner vs repeated two-modulus CRT merge¶
You can also merge congruences two at a time: combine (r_0, p_0) and (r_1, p_1) into one congruence mod p_0 p_1, then merge that with (r_2, p_2), etc. This is algebraically equivalent to Garner, but the intermediate modulus p_0·p_1·… grows and you are back to big-integer multiplies at every merge. Garner's mixed-radix bookkeeping keeps each digit-solve small; only the final assembly grows.
Advanced Patterns¶
Pattern: Garner reconstructing into a big integer¶
The general-purpose form: solve digits, then assemble with big-integer arithmetic. (Shown in junior.md; here we add the precomputed-inverse variant.)
Pattern: Garner with precomputed prefix-product inverses (fixed primes)¶
When the primes are fixed (NTT, RNS), precompute invW[i] = (p_0·…·p_{i-1})^{-1} mod p_i and reuse across reconstructions.
Go¶
package main
import (
"fmt"
"math/big"
)
func modInv(a, m int64) int64 {
A := big.NewInt(((a % m) + m) % m)
inv := new(big.Int).ModInverse(A, big.NewInt(m))
return inv.Int64()
}
type GarnerCtx struct {
p []int64
invW []int64 // invW[i] = (p0*...*p_{i-1})^{-1} mod p[i]
}
func newGarner(p []int64) *GarnerCtx {
k := len(p)
invW := make([]int64, k)
for i := 0; i < k; i++ {
var prefix int64 = 1
for j := 0; j < i; j++ {
prefix = (prefix * p[j]) % p[i]
}
invW[i] = modInv(prefix, p[i]) // (W_i)^{-1} mod p[i]
}
return &GarnerCtx{p: p, invW: invW}
}
// digits returns the mixed-radix digits a[0..k-1].
func (g *GarnerCtx) digits(r []int64) []int64 {
k := len(g.p)
a := make([]int64, k)
for i := 0; i < k; i++ {
var partial, prefix int64 = 0, 1
for j := 0; j < i; j++ {
partial = (partial + a[j]*prefix) % g.p[i]
prefix = (prefix * g.p[j]) % g.p[i]
}
diff := ((r[i]-partial)%g.p[i] + g.p[i]) % g.p[i]
a[i] = (diff * g.invW[i]) % g.p[i]
}
return a
}
func (g *GarnerCtx) reconstruct(r []int64) *big.Int {
a := g.digits(r)
x := big.NewInt(0)
w := big.NewInt(1)
for i := range g.p {
x.Add(x, new(big.Int).Mul(big.NewInt(a[i]), w))
w.Mul(w, big.NewInt(g.p[i]))
}
return x
}
func main() {
g := newGarner([]int64{3, 5, 7})
fmt.Println(g.reconstruct([]int64{2, 3, 2})) // 23
}
Java¶
import java.math.BigInteger;
public class GarnerCtx {
final long[] p;
final long[] invW;
GarnerCtx(long[] p) {
this.p = p;
int k = p.length;
invW = new long[k];
for (int i = 0; i < k; i++) {
long prefix = 1;
for (int j = 0; j < i; j++) prefix = (prefix * p[j]) % p[i];
invW[i] = BigInteger.valueOf(((prefix % p[i]) + p[i]) % p[i])
.modInverse(BigInteger.valueOf(p[i])).longValue();
}
}
long[] digits(long[] r) {
int k = p.length;
long[] a = new long[k];
for (int i = 0; i < k; i++) {
long partial = 0, prefix = 1;
for (int j = 0; j < i; j++) {
partial = (partial + a[j] * prefix) % p[i];
prefix = (prefix * p[j]) % p[i];
}
long diff = ((r[i] - partial) % p[i] + p[i]) % p[i];
a[i] = (diff * invW[i]) % p[i];
}
return a;
}
BigInteger reconstruct(long[] r) {
long[] a = digits(r);
BigInteger x = BigInteger.ZERO, w = BigInteger.ONE;
for (int i = 0; i < p.length; i++) {
x = x.add(BigInteger.valueOf(a[i]).multiply(w));
w = w.multiply(BigInteger.valueOf(p[i]));
}
return x;
}
public static void main(String[] args) {
GarnerCtx g = new GarnerCtx(new long[]{3, 5, 7});
System.out.println(g.reconstruct(new long[]{2, 3, 2})); // 23
}
}
Python¶
class GarnerCtx:
def __init__(self, p):
self.p = p
self.invW = []
for i in range(len(p)):
prefix = 1
for j in range(i):
prefix = (prefix * p[j]) % p[i]
self.invW.append(pow(prefix % p[i], -1, p[i]))
def digits(self, r):
p, invW = self.p, self.invW
a = [0] * len(p)
for i in range(len(p)):
partial, prefix = 0, 1
for j in range(i):
partial = (partial + a[j] * prefix) % p[i]
prefix = (prefix * p[j]) % p[i]
diff = (r[i] - partial) % p[i]
a[i] = (diff * invW[i]) % p[i]
return a
def reconstruct(self, r):
a = self.digits(r)
x, w = 0, 1
for i in range(len(self.p)):
x += a[i] * w
w *= self.p[i]
return x
if __name__ == "__main__":
g = GarnerCtx([3, 5, 7])
print(g.reconstruct([2, 3, 2])) # 23
Pattern: Three-prime NTT combine¶
In multi-prime NTT (13-ntt) you compute a convolution mod three NTT-friendly primes q_0, q_1, q_2, getting residues (c_0, c_1, c_2) for each output coefficient. Garner combines them into the true coefficient (or that coefficient mod the problem's modulus M). Because the primes are fixed, the prefix-product inverses are precomputed once and applied to every coefficient — turning combine into a constant number of multiplies per coefficient.
RNS Arithmetic¶
A Residue Number System (RNS) represents an integer x not by its digits but by its residue tuple
The magic: addition, subtraction, and multiplication are componentwise and carry-free:
(x + y) mod p_i = ( (x mod p_i) + (y mod p_i) ) mod p_i,
(x · y) mod p_i = ( (x mod p_i) · (y mod p_i) ) mod p_i.
So you can do a long chain of big-number arithmetic entirely in residue space — each prime independent, embarrassingly parallel, no carries propagating across machine words. Garner is how you leave RNS: when you finally need the actual integer (or its value mod M), you Garner-reconstruct from the residue tuple.
| RNS operation | Cost | Notes |
|---|---|---|
Convert x → residues | O(k · cost of one mod) | Or compute residues directly. |
| Add / multiply two RNS numbers | O(k) | Componentwise, parallelizable. |
| Compare / detect overflow | hard in RNS | RNS has no easy magnitude comparison — a known limitation. |
Convert residues → x (reconstruct) | O(k²) Garner | The exit door from residue space. |
RNS shines for bulk multiply-heavy workloads (cryptography, NTT, big-integer multiplication) where you do many cheap parallel operations and reconstruct rarely. Its weaknesses — comparison, division, sign — are exactly the operations that need the reconstructed magnitude, which is why Garner is the indispensable companion.
Recovering x mod Another Modulus¶
Often you do not want the full big integer x; you want x mod M where M is the problem's answer modulus (say 10^9 + 7). Two clean approaches:
Approach 1: Assemble the mixed-radix sum mod M¶
Solve the digits a_i (all mod p_i, as usual), then accumulate the final sum mod M instead of with big integers:
maintaining the prefix-product weight W_i mod M incrementally. No big integers at all — every value stays a machine word. This is the standard trick for multi-prime NTT where the final coefficients are reported mod M.
Approach 2: Reconstruct big x, then reduce¶
If a big-integer library is available, build x exactly and take x mod M at the end. Simpler to read, but allocates big integers.
Python (Approach 1, no big integers)¶
def garner_mod_M(r, p, M):
k = len(p)
a = [0] * k
for i in range(k):
partial, prefix = 0, 1
for j in range(i):
partial = (partial + a[j] * prefix) % p[i]
prefix = (prefix * p[j]) % p[i]
diff = (r[i] - partial) % p[i]
a[i] = (diff * pow(prefix, -1, p[i])) % p[i]
# assemble mod M only
x, w = 0, 1
for i in range(k):
x = (x + a[i] * w) % M
w = (w * p[i]) % M
return x
if __name__ == "__main__":
print(garner_mod_M([2, 3, 2], [3, 5, 7], 1_000_000_007)) # 23
Go (Approach 1)¶
package main
import (
"fmt"
"math/big"
)
func modInv(a, m int64) int64 {
return new(big.Int).ModInverse(big.NewInt(((a%m)+m)%m), big.NewInt(m)).Int64()
}
func garnerModM(r, p []int64, M int64) int64 {
k := len(p)
a := make([]int64, k)
for i := 0; i < k; i++ {
var partial, prefix int64 = 0, 1
for j := 0; j < i; j++ {
partial = (partial + a[j]*prefix) % p[i]
prefix = (prefix * p[j]) % p[i]
}
diff := ((r[i]-partial)%p[i] + p[i]) % p[i]
a[i] = (diff * modInv(prefix, p[i])) % p[i]
}
var x, w int64 = 0, 1
for i := 0; i < k; i++ {
x = (x + a[i]*w) % M
w = (w * p[i]) % M
}
return x
}
func main() {
fmt.Println(garnerModM([]int64{2, 3, 2}, []int64{3, 5, 7}, 1_000_000_007)) // 23
}
Java (Approach 1)¶
import java.math.BigInteger;
public class GarnerModM {
static long modInv(long a, long m) {
return BigInteger.valueOf(((a % m) + m) % m)
.modInverse(BigInteger.valueOf(m)).longValue();
}
static long garnerModM(long[] r, long[] p, long M) {
int k = p.length;
long[] a = new long[k];
for (int i = 0; i < k; i++) {
long partial = 0, prefix = 1;
for (int j = 0; j < i; j++) {
partial = (partial + a[j] * prefix) % p[i];
prefix = (prefix * p[j]) % p[i];
}
long diff = ((r[i] - partial) % p[i] + p[i]) % p[i];
a[i] = (diff * modInv(prefix, p[i])) % p[i];
}
long x = 0, w = 1;
for (int i = 0; i < k; i++) {
x = (x + a[i] * w) % M;
w = (w * p[i]) % M;
}
return x;
}
public static void main(String[] args) {
System.out.println(garnerModM(new long[]{2, 3, 2}, new long[]{3, 5, 7}, 1_000_000_007L)); // 23
}
}
Code Examples¶
Reusable Garner that returns digits and either form¶
This factors the digit solve out so the same engine drives big-int assembly, mod-M assembly, or just inspecting the mixed-radix digits.
Python¶
def garner_digits(r, p):
"""Return mixed-radix digits a[i] (0 <= a[i] < p[i])."""
k = len(p)
a = [0] * k
for i in range(k):
partial, prefix = 0, 1
for j in range(i):
partial = (partial + a[j] * prefix) % p[i]
prefix = (prefix * p[j]) % p[i]
diff = (r[i] - partial) % p[i]
a[i] = (diff * pow(prefix, -1, p[i])) % p[i]
return a
def assemble_bigint(a, p):
x, w = 0, 1
for ai, pi in zip(a, p):
x += ai * w
w *= pi
return x
def assemble_mod(a, p, M):
x, w = 0, 1
for ai, pi in zip(a, p):
x = (x + ai * w) % M
w = (w * pi) % M
return x
if __name__ == "__main__":
r, p = [2, 3, 2], [3, 5, 7]
a = garner_digits(r, p)
print("digits:", a) # [2, 2, 1]
print("bigint:", assemble_bigint(a, p)) # 23
print("mod 100:", assemble_mod(a, p, 100)) # 23
Error Handling¶
| Scenario | What goes wrong | Correct approach |
|---|---|---|
| Moduli not pairwise coprime | Prefix-product inverse fails to exist. | Pre-check all gcd(p_i, p_j) = 1; otherwise use a generalized merge. |
| Un-reduced residues | r_i ≥ p_i poisons digit a_i. | Reduce r_i ← r_i mod p_i up front. |
| Negative partial | (r_i − partial) negative before mod. | ((diff % p_i) + p_i) % p_i. |
Overflow in a_j·prefix | Product exceeds 64 bits for big primes. | 128-bit multiply or big.Int/BigInteger. |
| Wrong weight in assembly | Used W_i mod M inconsistent with digit index. | Maintain w incrementally: w *= p_i after using it. |
| Reconstructing when you only needed residues | Wasted big-int work. | Skip reconstruction; stay in RNS until truly needed. |
Performance Analysis¶
k (primes) | Digit solve O(k²) | Inverses (precomputed once) | Notes |
|---|---|---|---|
| 2 | ~4 mults | 2 | Trivial; even a two-modulus merge suffices. |
| 3 | ~9 mults | 3 | The common NTT case. |
| 8 | ~64 mults | 8 | Still microseconds per reconstruction. |
| 32 | ~1024 mults | 32 | Big-integer assembly now dominates. |
| 100 | ~10⁴ mults | 100 | Consider whether you really need 100 primes. |
The digit solve is O(k²) modular multiplies on single-prime-sized numbers. If primes are fixed and inverses are precomputed, the per-reconstruction cost is pure multiply-add — which is why NTT libraries Garner-combine millions of coefficients cheaply. The big-integer assembly is O(k) big-int operations on numbers up to ~k·log p bits; for large k it can dominate, which is the reason to prefer the mod-M assembly whenever the full integer is not required.
import time
def benchmark(k):
import random
# k small distinct primes for the demo
primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31][:k]
r = [random.randrange(p) for p in primes]
start = time.perf_counter()
_ = garner_digits(r, primes)
return time.perf_counter() - start
The single biggest constant-factor win when primes are fixed is precomputing the prefix-product inverses so the inner loop is multiply-add only.
Best Practices¶
- Reduce residues first:
r_i ← r_i mod p_ibefore any digit solve. - Precompute prefix-product inverses when the primes are fixed (NTT/RNS); the per-reconstruction loop then has no inverse calls.
- Choose the assembly to match the need: big integer only if you truly need the exact value; otherwise assemble mod
Mand stay in machine words. - Keep
modInversea tested helper: most subtle bugs live in the inverse or the sign normalization. - Verify pairwise coprimality once, loudly; the failure is silent otherwise.
- Test against brute-force CRT on random small inputs, and against the classic CRT formula for a few cases to cross-check.
Visual Animation¶
See
animation.htmlfor an interactive view.The middle-level animation highlights: - Editable residues
r_iand primesp_i. - Each digita_icomputed in turn: subtract the partial reconstruction, multiply by(W_i)^{-1} mod p_i. - The running reconstructed valuea_0 + a_1 p_0 + a_2 p_0 p_1 + …growing step by step, alongside the prefix-product weights.
Summary¶
Garner's mixed-radix reconstruction solves the digits a_i sequentially, each from one modular inverse of a prefix product taken mod p_i, so every digit-solve stays single-prime sized. You can then assemble the full big integer x = Σ a_i W_i or, more cheaply, assemble only x mod M with no big integers at all — the standard move when combining multi-prime NTT results (13-ntt). Compared with the classic CRT summation formula (05-crt), Garner never forms the giant P/p_i terms, is naturally incremental, and needs no final reduction mod P. It is also the exit door from a Residue Number System: do bulk carry-free componentwise arithmetic in residue space, then Garner-reconstruct the magnitude exactly when you need it. The whole reconstruction is O(k²), dominated by the digit solve (small numbers) plus the optional big-integer assembly.