Monte Carlo vs Las Vegas Randomized Algorithms — Practice Tasks¶
All tasks must be solved in Go, Java, and Python. Always inject a seeded RNG (do not call the global default) so every run is reproducible — fix the seed in tests, log it in anything resembling production. Verify Monte Carlo answers against a deterministic oracle on small inputs, and verify Las Vegas answers are always correct (the count of attempts may vary, the answer may not). Recurring anchors: Monte Carlo error scales like
1/√N; one-sided amplification gives errorε^k; a retry loop with per-try successptakes expected1/pattempts and exceedstattempts with probability(1-p)^t.
Beginner Tasks (5)¶
Task 1 — Monte Carlo π with a seeded RNG¶
Problem. Implement estimatePi(rng, n) that throws n darts into the unit square and returns 4 × inside/n. Print the estimate and absolute error vs π for n ∈ {1e3, 1e5, 1e6}.
Constraints. Inject the RNG (Go *rand.Rand, Java Random, Python random.Random); same seed → same output.
Hint. A dart (x, y) with x, y ∈ [0,1) is inside the quarter-circle iff x² + y² ≤ 1.
Evaluation criteria. - Error roughly halves when n quadruples (the 1/√N law). - Re-running with the same seed reproduces identical numbers. - Exactly n iterations — fixed time (Monte Carlo).
Task 2 — Las Vegas: roll until a target¶
Problem. Implement rollUntil(rng, target) that rolls a fair die [1,6] until it shows target, returning the number of rolls. Run it 100000 times and print the average (should approach 6).
Constraints. 1 ≤ target ≤ 6. The returned run always ends on target (correctness is not probabilistic).
Hint. Each roll succeeds with p = 1/6; expected rolls = 1/p = 6.
Evaluation criteria. - Average over many trials is within ~1% of 6. - The function returns only when the die equals target. - Loop terminates with probability 1.
Task 3 — Classify the paradigm¶
Problem. Write classify(name) returning "Monte Carlo" or "Las Vegas" for: "pi-darts", "roll-until-six", "randomized-quickselect", "miller-rabin", "freivalds", "treap-insert". Print each with a one-line justification.
Constraints. Pure logic / lookup; no RNG needed.
Hint. Ask "is the running time fixed or the correctness fixed?" Fixed time + maybe-wrong = Monte Carlo; always-correct + random time = Las Vegas.
Evaluation criteria. - pi-darts, miller-rabin, freivalds → Monte Carlo. - roll-until-six, randomized-quickselect, treap-insert → Las Vegas. - Each justification names which guarantee is fixed.
Task 4 — Empirical 1/√N error curve¶
Problem. For N ∈ {1e2, 1e3, 1e4, 1e5, 1e6}, estimate π R = 30 times each, and print N alongside the mean absolute error. Confirm the error drops by ~√10 ≈ 3.16× per 10× increase in N.
Constraints. Inject the RNG; average over the R repeats to smooth the wobble.
Hint. Mean absolute error ≈ c/√N; plotting log-error vs log-N gives slope ≈ −1/2.
Evaluation criteria. - Error decreases monotonically (on average) as N grows. - Ratio of consecutive errors ≈ 3.16. - Output is a clean table of N vs mean error.
Task 5 — One-sided amplification counter¶
Problem. Implement roundsForError(epsSingle, target) returning the smallest k with epsSingle^k ≤ target. Print k for (epsSingle, target) pairs (0.25, 1e-6), (0.5, 1e-9), (0.25, 2^-128).
Constraints. 0 < epsSingle < 1, 0 < target < 1.
Hint. k = ceil(log(target) / log(epsSingle)).
Evaluation criteria. - (0.25, 1e-6) → 10, (0.5, 1e-9) → 30, (0.25, 2^-128) → 64. - Result is an integer and actually satisfies epsSingle^k ≤ target. - Demonstrates amplification is logarithmic in the target.
Intermediate Tasks (5)¶
Task 6 — Las Vegas → Monte Carlo by truncation¶
Problem. Given a Las Vegas routine work(rng) whose cost (number of internal steps) is random, implement truncated(rng, deadline) that runs work for at most deadline steps; if it finishes, return its correct answer, else return a fallback flagged as a guess. Empirically measure the error rate (fraction of guesses) and compare to the Markov bound E[T]/deadline.
Constraints. Model work so its step count is geometric (success prob p), giving E[T] = 1/p.
Hint. Set deadline = E[T]/ε for a target error ε; the measured guess rate should be ≤ ε.
Evaluation criteria. - Measured guess rate ≤ E[T]/deadline. - Larger deadline → fewer guesses (lower error). - The non-guessed answers are always correct.
Task 7 — Monte Carlo → Las Vegas by verification¶
Problem. Implement lasVegasFromMC(rng, mc, verify) that calls a Monte Carlo producer mc(rng) and a checker verify(ans), retrying until verify passes; return (answer, attempts). Use a toy mc that returns a correct value with probability p and a wrong value otherwise. Verify the answer is always correct and the average attempts ≈ 1/p.
Constraints. verify must be cheap and deterministic.
Hint. This is the MC→LV conversion; it works only because a verifier exists.
Evaluation criteria. - Returned answer always passes verify. - Average attempts ≈ 1/p. - Note (in a comment) why this fails without a verifier (e.g. Miller-Rabin "prime").
Task 8 — Two-sided majority amplification¶
Problem. Given a noisy boolean oracle oracle(rng) that returns the true answer with probability 1/2 + δ, implement majority(rng, oracle, k) that returns the majority of k calls. Empirically measure the error rate vs k for δ = 1/6 and confirm it decays roughly like e^{-2δ²k}.
Constraints. k odd to avoid ties; δ > 0.
Hint. Compare with the one-sided case: two-sided needs the majority and a gap δ; the error decays exponentially but slower than ε^k.
Evaluation criteria. - Error decreases as k grows. - Observed error ≲ e^{-2δ²k} (within sampling noise). - A short note contrasts this with one-sided ε^k.
Task 9 — Randomized quickselect (Las Vegas)¶
Problem. Implement quickselect(rng, a, k) returning the k-th smallest element (0-indexed) using a random pivot. Validate against a sort-based oracle on 1000 random arrays. Track and print the average number of comparisons; confirm it is O(n).
Constraints. Always correct; expected O(n) time. Do not mutate the caller's array unexpectedly (copy or document).
Hint. Random pivot → "good" split with prob ≥ 1/2 → expected linear time. Compare < pivot, > pivot, == pivot buckets.
Evaluation criteria. - Matches the sort-based oracle on every test. - Average comparisons scale linearly with n (roughly ~3.4n), not n log n. - Works for k = 0 (min), k = n-1 (max), and the median.
Task 10 — Las Vegas retry with a cap and fallback¶
Problem. Implement cappedRetry(rng, attempt, fallback, cap) that retries a random attempt up to cap times, then returns a deterministic fallback. Compute the cap from a tail target: cap = ceil(log(δ)/log(1-p)). Verify empirically that the fallback fires at most a fraction δ of the time.
Constraints. 0 < p < 1, 0 < δ < 1. The returned value is always valid.
Hint. P[> cap attempts] = (1-p)^cap ≤ δ. This is the SLA guardrail from senior.md.
Evaluation criteria. - Fallback rate ≤ δ empirically. - The returned value is always valid (fast path or fallback). - cap is derived from (p, δ), not hard-coded.
Advanced Tasks (5)¶
Task 11 — Freivalds' matrix-product check (one-sided MC, amplified)¶
Problem. Implement freivalds(rng, A, B, C, k) that returns true (probably A·B = C) or false (certainly A·B ≠ C) using k random 0/1 vectors, each round O(n²). Test on correct and corrupted C matrices; confirm a corrupted C is caught with high probability.
Constraints. Draw a fresh random vector each round (independence). n×n integer matrices.
Hint. Check A·(B·r) = C·r; a single mismatch proves inequality (a certificate). Error after k rounds ≤ 2^{-k}.
Evaluation criteria. - Correct C returns true for any k. - Corrupted C returns false with probability ≥ 1 - 2^{-k}. - Each round is O(n²) (no O(n³) product computed).
Task 12 — Monte Carlo integration with a confidence interval¶
Problem. Estimate ∫₀¹ f(x) dx for a given f by sampling N points; also compute the sample standard error s/√N and report a 95% confidence interval (estimate ± 1.96·s/√N). Test on f(x) = x² (true value 1/3) and confirm the true value lies in the interval ~95% of the time over many runs.
Constraints. Use the running mean/variance (Welford) or a two-pass formula.
Hint. This is two-sided Monte Carlo with a quantified error; the CI width shrinks like 1/√N.
Evaluation criteria. - Estimate converges to 1/3 for f(x)=x². - Coverage of the 95% CI is ≈ 95% over ≥ 200 independent runs. - CI width halves when N quadruples.
Task 13 — Reproducibility and replay harness¶
Problem. Build a harness that (a) runs a randomized routine with a seed drawn from entropy, (b) logs {seed, outcome, attempts}, and (c) given a logged seed, replays the exact run and reproduces the identical outcome. Demonstrate replay of a "rare" event (e.g. a Las Vegas run that took ≥ 10 attempts).
Constraints. All randomness must flow through the one seeded RNG; no hidden calls to the global default.
Hint. Replay = re-seed with the logged seed and re-run the deterministic step generator. If anything draws from the global RNG, replay breaks.
Evaluation criteria. - Replaying a logged seed reproduces the outcome bit-for-bit (within one language). - The harness detects/forbids use of the global RNG. - A note explains why the same seed gives different streams across Go/Java/Python.
Task 14 — Distributed randomness: correlation vs agreement¶
Problem. Simulate R replicas each routing M requests. Implement two strategies for picking a backend out of B: (a) per-replica RNG seeded from a shared constant (the bug), and (b) per-replica RNG seeded from entropy (the fix). Plot/print the load distribution across backends. Then show a third case: deterministic hashing of the key when all replicas must agree on the same backend per key.
Constraints. R, B ≥ 4, M ≥ 1000.
Hint. Shared seed → every replica picks the same sequence → hot backend. Entropy seeds → even load. Hashing → consistent per-key routing (not random).
Evaluation criteria. - The shared-seed case produces a visibly skewed (hot-backend) load. - The entropy-seed case produces near-uniform load. - The hashing case routes each key to the same backend across all replicas (agreement, not divergence).
Task 15 — Paradigm-selection advisor¶
Problem. Implement recommend(deadline_hard, error_tolerable, has_cheap_checker, adversarial) returning one of: "MONTE_CARLO", "LAS_VEGAS", "TRUNCATED_LAS_VEGAS_HYBRID", plus the RNG choice ("fast" or "CSPRNG") and a one-line justification. Encode the decision tree from senior.md.
Constraints / cases. - hard deadline + error tolerable → MONTE_CARLO. - no hard deadline + cheap checker → LAS_VEGAS. - hard deadline + error NOT tolerable + cheap checker → TRUNCATED_LAS_VEGAS_HYBRID. - adversarial = true → RNG must be CSPRNG regardless.
Hint. Name the hard constraint first; when both deadline and correctness are hard, the hybrid (truncated Las Vegas + deterministic fallback) is the answer.
Evaluation criteria. - Correctly classifies all four canonical cases. - The adversarial flag always forces CSPRNG. - Justification references the right guarantee (bounded time vs always-correct vs hybrid) and, where relevant, the SLA derivation (k, retry cap).
Benchmark Task¶
Task B — Monte Carlo work vs Las Vegas tail¶
Problem. Two micro-benchmarks across Go/Java/Python on a seeded RNG:
- (a) Monte Carlo cost vs accuracy: measure wall-clock time and absolute π error for
N ∈ {1e4, 1e5, 1e6, 1e7}. Show that time grows linearly withNwhile error falls like1/√N(so accuracy is expensive). - (b) Las Vegas attempt distribution: for a retry loop with
p ∈ {1/2, 1/6, 1/20}, run1e5trials and report mean attempts (≈1/p), p99 attempts, and max attempts. Show the tailP[>t] = (1-p)^tmatches the empirical fraction.
Starter — Python.
import math
import random
import time
def estimate_pi(rng, n):
inside = 0
for _ in range(n):
x, y = rng.random(), rng.random()
if x * x + y * y <= 1.0:
inside += 1
return 4.0 * inside / n
def retry_attempts(rng, p):
a = 0
while True:
a += 1
if rng.random() < p:
return a
def main():
rng = random.Random(99)
print("(a) Monte Carlo: N time_ms abs_error")
for n in (10_000, 100_000, 1_000_000):
t0 = time.perf_counter()
est = estimate_pi(rng, n)
ms = (time.perf_counter() - t0) * 1000
print(f" {n:<9} {ms:8.2f} {abs(est - math.pi):.5f}")
print("(b) Las Vegas: p mean p99 max")
for p in (0.5, 1 / 6, 1 / 20):
xs = sorted(retry_attempts(rng, p) for _ in range(100_000))
mean = sum(xs) / len(xs)
p99 = xs[int(0.99 * len(xs))]
print(f" {p:.3f} {mean:6.3f} {p99:4d} {xs[-1]:4d} (1/p={1/p:.2f})")
if __name__ == "__main__":
main()
Evaluation criteria. - Same seed → same numbers within each language. - (a) Monte Carlo time ∝ N; error ∝ 1/√N (10× more work ≈ 3.16× less error). - (b) Mean attempts ≈ 1/p; the empirical P[>t] matches (1-p)^t; the max grows as p shrinks (heavier tail). - A short writeup: Monte Carlo "pays time for accuracy," Las Vegas "pays a variable tail for guaranteed correctness."
General Guidance for All Tasks¶
- Inject and seed the RNG. Never reach for the global default from reusable code; pass an RNG (or seed) in. Fix the seed in tests, log it in anything production-like.
- Validate Monte Carlo against an oracle on small inputs, then trust it on large ones; validate Las Vegas correctness always (only the attempt count may vary).
- Match the combination rule to the error model. One-sided → trust any certificate (
ε^k). Two-sided → majority vote (Chernoff). Mixing them up raises error. - Size the work to the requirement. Derive Monte Carlo rounds
kfrom the target error, and Las Vegas retry capstfrom the tail targetδ; over-provisioning randomness wastes work. - Cap Las Vegas tails with a deterministic fallback when there is a latency budget.
- Use a CSPRNG on adversarial paths; a predictable PRNG lets an attacker force the worst case.
- Distinguish "random" from "agreed." Independent randomness to diverge (spread load); deterministic hashing to agree (route a key consistently).
Each task must be implemented and tested in Go, Java, and Python, producing consistent results on the same seed within each language.