Count-Min Sketch — Interview Preparation¶

The Count-Min Sketch (CMS) is a favourite interview topic because it rewards one crisp insight — "keep a d × w grid; add 1 to one cell per row on update; report the minimum of an item's d cells on query; every cell is ≥ the true count, so the answer overcounts only" — and then tests whether you can (a) size the grid from eps/delta (w = ceil(e/eps), d = ceil(ln(1/delta))), (b) prove the â ≤ f + eps·N bound via Markov-on-a-row × independence-across-rows, (c) explain the conservative update and heavy-hitters-with-a-heap extensions, and (d) place CMS among its relatives — it overcounts where Misra-Gries undercounts, where Space-Saving overcounts-but-ranks, and where a Bloom filter only tests membership. This file is a tiered question bank (45 questions) plus a runnable coding challenge in Go, Java, and Python.

Quick-Reference Cheat Sheet¶

Question	Answer
What does it answer?	Per-key frequency `≈ f(x)`, for any key you name, in `O(d)` time, `O(d·w)` space.
Update rule?	For each row `i`: `C[i][h_i(x) % w] += c` (one cell per row).
Query rule?	`â(x) = min over rows i of C[i][h_i(x) % w]`.
Error direction?	Overcount only: `f(x) ≤ â(x) ≤ f(x) + eps·N`.
Why min, not avg?	Each cell `≥ f(x)`; the min is the least-polluted cell and still a valid upper bound.
Size width `w`?	`ceil(e/eps)` — controls error magnitude.
Size depth `d`?	`ceil(ln(1/delta))` — controls failure probability.
Proof in one line?	Markov per row (`E[overflow] ≤ N/w ≤ eps·N/e` → `1/e`) × `d` independent rows (`(1/e)^d ≤ delta`).
Conservative update?	Raise each cell only to `estimate + c` (`max`); lower bias, breaks mergeability.
Find heavy hitters?	Pair CMS with a min-heap of top-`k` candidates.
Mergeable?	Yes — element-wise add of same-shaped sketches.
vs Misra-Gries?	MG is deterministic and undercounts; CMS is randomized and overcounts.
vs Bloom filter?	Bloom = membership via bits + AND; CMS = frequency via counters + MIN.
vs HyperLogLog?	HLL counts distinct items (cardinality); CMS counts occurrences (frequency).

The update/query core (memorize this):

add(x, c):       for i in 0..d-1: C[i][h_i(x) % w] += c
estimate(x):     return min over i of C[i][h_i(x) % w]
# space O(d*w); update/query O(d); estimate >= true count (overcount only)

Junior Questions¶

#	Question	Expected Answer Focus
1	What is a Count-Min Sketch and when would you use it?	A `d×w` grid of counters + `d` hashes that estimates per-item frequencies in fixed memory; use for stream frequency estimation when an exact hash map won't fit.
2	How do you record (update) an item?	Hash with each of the `d` hash functions; add 1 (or weight `c`) to that one cell in each row — `d` cells total.
3	How do you query an item's count?	Read the `d` cells it maps to and return the minimum.
4	Why is the estimate never too low?	Every occurrence increments all `d` of the item's cells, so each cell ≥ the true count; the min of values each ≥ true is still ≥ true.
5	Why does the estimate overcount sometimes?	Different items collide into the same cell, so a cell may include other items' counts.
6	Why take the minimum instead of the average?	The min is the least-polluted cell and is still a valid upper bound; averaging breaks the overcount-only guarantee.
7	What does each row use that's different?	Its own independent hash function, so collisions in one row are unlikely to repeat in another.
8	What's the space cost?	`O(d·w)` — fixed, independent of stream length or number of distinct items.
9	What's the time per update and per query?	`O(d)` — one hash + access per row; `d` is small (4–8).
10	Can a never-inserted item report a nonzero count?	Yes — collisions from other items can leave nonzero values; the min keeps it small.
11	Does stream order affect the result?	No — updates are additions (commutative), so the final table and all estimates are order-independent.
12	What two parameters control accuracy and how?	`w` (width) controls error size; `d` (depth) controls the probability the bound fails.

Middle Questions¶

#	Question	Expected Answer Focus
1	Given `eps` and `delta`, how big is the grid?	`w = ceil(e/eps)`, `d = ceil(ln(1/delta))`; guarantee `â ≤ f + eps·N` w.p. `1-delta`.
2	Why does width scale as `1/eps` but depth as `ln(1/delta)`?	Error magnitude `~N/w` is linear in `1/w`; failure probability `(1/e)^d` shrinks exponentially in `d`.
3	What is the conservative (minimal-increment) update?	Read estimate `m`, then set each cell to `max(cell, m+c)` instead of blindly `+c`; reduces bias on skewed data.
4	Why does conservative update stay correct?	It only ever raises cells, and `m+c ≥ f_new`, so every cell stays ≥ true count → still no undercount.
5	What does conservative update cost you?	Mergeability (the `max` isn't additive) and deletions; plus one `estimate()` read per update.
6	How do you find heavy hitters with CMS?	Pair it with a min-heap of top-`k` candidates: on each update, query the estimate and update/replace the heap's smallest.
7	CMS vs Misra-Gries — key difference?	CMS is randomized and overcounts (point queries for any key); MG is deterministic, undercounts, and enumerates heavy hitters in `O(k)`.
8	CMS vs Bloom filter?	Bloom answers membership (bits + AND, false positives); CMS answers frequency (counters + MIN, overcount).
9	CMS vs Space-Saving?	Both overcount; Space-Saving is deterministic, `O(k)`, and ranks the top-`k`, but only tracks `k` keys; CMS answers any key.
10	What is the Count-Min-Mean variant?	Subtract each cell's estimated noise `(N-cell)/(w-1)` and take the median of residuals; lower bias, loses the strict upper bound.
11	When is CMS a poor choice?	When you need exact counts, when undercounting is the safer error, or when you must list heavy hitters without a heap.
12	Why is CMS great for skewed (Zipfian) data?	A few elephants give near-exact estimates; conservative update keeps them from inflating mice.
13	How does the absolute error grow with the stream?	As `eps·N` — it grows with total volume `N`; that's why long streams need windowing/decay.
14	How do you get `d` hashes cheaply?	Double hashing: `h_i(x) = a + i·b` from two base hashes, with `b` not a multiple of `w`.

Senior Questions¶

#	Question	Expected Answer Focus
1	What makes CMS ideal for distributed stream monitoring?	Mergeability: identically-shaped sketches add element-wise; agents count locally, an aggregator sums to reconstruct global counts.
2	What must all agents agree on to merge?	Same `d`, `w`, and hash seeds; epoch-fence config rollouts; mismatched shapes make merges meaningless.
3	Why can't you merge conservative-update sketches?	The `max`-based update isn't additive — `max(A,B)` doesn't reconstruct the union; use additive updates at the edge.
4	How do you keep error bounded on an infinite stream?	Windowing (per-bucket sketches you merge/drop) or exponential decay (multiply cells by `γ<1`) so `N` stays bounded.
5	How do you handle a single elephant flow dominating the tail?	Split: exact top-`k` heap for elephants + CMS for the tail; and/or conservative update.
6	How do range queries work on CMS?	A dyadic hierarchy of `log U` sketches; decompose `[lo,hi]` into `O(log U)` dyadic intervals and sum their min-estimates.
7	What's the error of a range query?	About `eps·N·log U`, since error accumulates across the `O(log U)` summed cells.
8	How would you observe CMS health in production?	Track `N`, `fill_ratio` (nonzero cells), merge lag, top-key share, and a sampled shadow exact counter to verify the bound.
9	CMS vs HyperLogLog — when each?	CMS for per-item frequency; HLL for distinct-count (cardinality); often run both on the same stream.
10	How is CMS used in a query optimizer?	Continuously refreshed per-value frequency stats feed selectivity estimates for join ordering and index decisions.
11	Where is CMS used in production systems?	Redis `CMS.*`, network telemetry, Apache Druid/Spark approximate aggregations, hot-key detection.
12	Adversarial collisions — what's the risk and fix?	An attacker who learns the seeds can craft colliding keys to blow up estimates; salt/secret per-deployment seeds.

Professional Questions¶

#	Question	Expected Answer Focus
1	Prove `â(x) ≥ f(x)` always.	Each cell = `f(x)` + nonnegative collision noise ≥ `f(x)`; min of such values ≥ `f(x)`.
2	Prove `â(x) ≤ f(x)+eps·N` w.p. `1-delta`.	Per-row overflow `E[X_i] ≤ N/w ≤ eps·N/e`; Markov → `Pr[X_i ≥ eps·N] ≤ 1/e`; independence → `(1/e)^d ≤ delta` for `d=ln(1/delta)`.
3	Why exactly the constant `e` in `w = e/eps`?	It makes the one-row Markov bound come out to exactly `1/e`, minimizing width for that per-row failure.
4	What independence do the hashes need?	Pairwise (2-universal) independence: `Pr[h_i(x)=h_i(y)] = 1/w` — full independence is not required.
5	Is CMS space-optimal?	Yes up to lower-order factors: `Ω(1/eps)` from an INDEXING reduction, `Ω(log(1/delta))` from repetition — matches `Θ((1/eps)log(1/delta))`.
6	Prove conservative update never undercounts.	Before update each cell ≥ `f_old`; new value `max(cell, m+c)` with `m ≥ f_old` ⇒ `m+c ≥ f_new` ⇒ cell ≥ `f_new`.
7	Prove conservative ≤ plain pointwise.	Induction: conservative raises to `max(cell, m+c)` ≤ the plain cell which received full `+c` on a value at least as large.
8	What error norm does CMS guarantee, and why does it matter?	L₁ (`eps·‖f‖₁`); same slack for every key, good for heavy hitters, poor for ranking the tail.
9	How does Count Sketch differ in its guarantee?	Signed hashes + median give an unbiased estimator with L₂ error (`eps·‖f‖₂`, tighter on skew) and support deletions, at `1/eps²` space.
10	Why is plain CMS biased even at the min?	`E[min_i X_i] > 0` — every cell carries some nonnegative noise; CMM de-biases by subtracting estimated noise.
11	Can CMS handle deletions?	Not plain CMS (subtraction breaks the min/no-undercount); use Count Sketch (signed) or a turnstile-safe variant.
12	Summarize the guarantee directions across summaries.	CMS & Space-Saving overcount; Misra-Gries undercounts; Bloom false-positives; Count Sketch is two-sided/unbiased.

Coding Challenge (3 Languages)¶

Challenge: Top-K Frequent Words via Count-Min Sketch + Heap¶

Given a stream of words and an integer k, return the k most frequent words using a Count-Min Sketch (sized from eps/delta) paired with a min-heap — without storing every distinct word. Your solution must: 1. Size the grid as w = ceil(e/eps), d = ceil(ln(1/delta)). 2. Update one cell per row; estimate via the min across rows. 3. Maintain a min-heap of the top-k candidates by estimated count. 4. Return the top-k words sorted by estimated frequency (descending).

Test: stream ["a","b","a","c","a","b","a","d","b","a"], k=2 → top-2 should be a (≈5) then b (≈3).

Go¶

package main

import (
    "container/heap"
    "fmt"
    "hash/fnv"
    "math"
    "sort"
)

type CMS struct {
    d, w int
    t    [][]uint64
}

func NewCMS(eps, delta float64) *CMS {
    w := int(math.Ceil(math.E / eps))
    d := int(math.Ceil(math.Log(1 / delta)))
    t := make([][]uint64, d)
    for i := range t {
        t[i] = make([]uint64, w)
    }
    return &CMS{d, w, t}
}

func (c *CMS) cols(s string) []int {
    a := fnv.New64a()
    a.Write([]byte(s))
    x := a.Sum64()
    b := fnv.New64()
    b.Write([]byte(s))
    y := b.Sum64() | 1
    cs := make([]int, c.d)
    for i := range cs {
        cs[i] = int((x + uint64(i)*y) % uint64(c.w))
    }
    return cs
}

func (c *CMS) Add(s string) {
    for i, col := range c.cols(s) {
        c.t[i][col]++
    }
}

func (c *CMS) Est(s string) uint64 {
    min := ^uint64(0)
    for i, col := range c.cols(s) {
        if c.t[i][col] < min {
            min = c.t[i][col]
        }
    }
    return min
}

func TopK(stream []string, k int, eps, delta float64) []string {
    cms := NewCMS(eps, delta)
    best := map[string]uint64{} // candidate -> est
    for _, w := range stream {
        cms.Add(w)
        est := cms.Est(w)
        if _, ok := best[w]; ok {
            best[w] = est
        } else if len(best) < k {
            best[w] = est
        } else {
            // find current min candidate
            var mk string
            var mv uint64 = ^uint64(0)
            for kk, vv := range best {
                if vv < mv {
                    mv, mk = vv, kk
                }
            }
            if est > mv {
                delete(best, mk)
                best[w] = est
            }
        }
    }
    keys := make([]string, 0, len(best))
    for kk := range best {
        keys = append(keys, kk)
    }
    sort.Slice(keys, func(i, j int) bool { return best[keys[i]] > best[keys[j]] })
    return keys
}

// (heap import kept to show the production approach; map-based top-k above is O(k) per step)
var _ = heap.Init

func main() {
    stream := []string{"a", "b", "a", "c", "a", "b", "a", "d", "b", "a"}
    fmt.Println(TopK(stream, 2, 0.001, 0.01)) // [a b]
}

Java¶

import java.util.*;
import java.util.zip.CRC32;

public class Solution {
    static int d, w;
    static long[][] t;

    static void init(double eps, double delta) {
        w = (int) Math.ceil(Math.E / eps);
        d = (int) Math.ceil(Math.log(1 / delta));
        t = new long[d][w];
    }

    static int[] cols(String s) {
        long a = Integer.toUnsignedLong(s.hashCode());
        CRC32 c = new CRC32();
        c.update(s.getBytes());
        long b = c.getValue() | 1L;
        int[] cs = new int[d];
        for (int i = 0; i < d; i++) cs[i] = (int) (((a + (long) i * b) % w + w) % w);
        return cs;
    }

    static void add(String s) {
        int[] cs = cols(s);
        for (int i = 0; i < d; i++) t[i][cs[i]]++;
    }

    static long est(String s) {
        int[] cs = cols(s);
        long m = Long.MAX_VALUE;
        for (int i = 0; i < d; i++) m = Math.min(m, t[i][cs[i]]);
        return m;
    }

    static List<String> topK(String[] stream, int k, double eps, double delta) {
        init(eps, delta);
        Map<String, Long> best = new HashMap<>();
        for (String s : stream) {
            add(s);
            long e = est(s);
            if (best.containsKey(s) || best.size() < k) {
                best.put(s, e);
            } else {
                String mk = null; long mv = Long.MAX_VALUE;
                for (Map.Entry<String, Long> en : best.entrySet())
                    if (en.getValue() < mv) { mv = en.getValue(); mk = en.getKey(); }
                if (e > mv) { best.remove(mk); best.put(s, e); }
            }
        }
        List<String> keys = new ArrayList<>(best.keySet());
        keys.sort((x, y) -> Long.compare(best.get(y), best.get(x)));
        return keys;
    }

    public static void main(String[] args) {
        String[] stream = {"a","b","a","c","a","b","a","d","b","a"};
        System.out.println(topK(stream, 2, 0.001, 0.01)); // [a, b]
    }
}

Note: cols(s) is computed once per call and reused across rows; in a hot loop you would cache it rather than recompute per row.

Python¶

import hashlib
import math


class CMS:
    def __init__(self, eps, delta):
        self.w = math.ceil(math.e / eps)
        self.d = math.ceil(math.log(1 / delta))
        self.t = [[0] * self.w for _ in range(self.d)]

    def _cols(self, s):
        h = hashlib.md5(s.encode()).digest()
        a = int.from_bytes(h[:8], "big")
        b = int.from_bytes(h[8:], "big") | 1
        return [(a + i * b) % self.w for i in range(self.d)]

    def add(self, s):
        for i, col in enumerate(self._cols(s)):
            self.t[i][col] += 1

    def est(self, s):
        cols = self._cols(s)
        return min(self.t[i][cols[i]] for i in range(self.d))


def top_k(stream, k, eps=0.001, delta=0.01):
    cms = CMS(eps, delta)
    best = {}  # candidate -> estimate
    for w in stream:
        cms.add(w)
        e = cms.est(w)
        if w in best or len(best) < k:
            best[w] = e
        else:
            mk = min(best, key=best.get)
            if e > best[mk]:
                del best[mk]
                best[w] = e
    return sorted(best, key=best.get, reverse=True)


if __name__ == "__main__":
    stream = ["a", "b", "a", "c", "a", "b", "a", "d", "b", "a"]
    print(top_k(stream, 2))  # ['a', 'b']

Expected output: [a b] / [a, b] / ['a', 'b'] — a (true 5) and b (true 3) are the two heaviest. Validate by also building an exact Counter/map on small inputs and confirming each CMS estimate is ≥ the true count and within eps·N.

Follow-ups an interviewer may ask: - Add conservative update and show estimates drop toward the true counts. - Make the sketch mergeable and merge two partial sketches; confirm the result equals one sketch over the concatenated stream. - Add decay so the top-k reflects a recent window. - Explain why the heap can hold a false positive but never miss an item far above the cutoff.

Worked Whiteboard Example (Talk Through This Out Loud)¶

Interviewers love watching you trace a tiny sketch by hand. Use d = 2, w = 4, and two toy hashes:

h0(x) = (sum of letter positions) mod 4        (a=1, b=2, ..., z=26)
h1(x) = (3*len(x) + first letter position) mod 4

Stream: cat cat dog cat bird dog. Precomputed columns: cat → (0,0), dog → (2,1), bird → (1,2).

After processing all six items the grid is:

row0 (h0): [ 3, 1, 2, 0 ]
row1 (h1): [ 3, 2, 1, 0 ]

Now answer queries out loud, always taking the min:

estimate(cat) = min(row0[0]=3, row1[0]=3) = 3 — true is 3, exact.
estimate(dog) = min(row0[2]=2, row1[1]=2) = 2 — true is 2, exact.
estimate(act) (never inserted; anagram of cat so h0=0, but h1(act)=(9+1)%4=2): min(row0[0]=3, row1[2]=1) = 1 — true is 0, reported 1: a pure collision overcount, but the min pulled it from 3 down to 1.

The point to verbalize: every real occurrence of an item bumps all its cells, so each cell is ≥ true; the min is the least-polluted of them, still ≥ true. That single sentence is the whole correctness story.

Common Interview Pitfalls to Avoid¶

Pitfall	Why it's wrong	What to say instead
"Take the average of the cells."	Average is not a valid upper bound; it breaks the overcount-only guarantee.	"Take the minimum — it's the least-polluted cell and stays `≥` true."
"Use `d` rows with one hash function."	Every row collides identically; depth gives nothing.	"Each row needs an independent hash (or double-hashing `a + i·b`)."
"CMS can list the top keys."	CMS only does point queries; it stores no keys.	"Pair it with a min-heap to discover heavy hitters."
"Subtract to delete an item."	Subtraction breaks the min / no-undercount invariant and can underflow.	"Plain CMS has no delete; use a Count Sketch (signed) for turnstile streams."
"Error is fixed."	Absolute error is `eps·N` and grows with stream volume.	"Error grows with `N`; long streams need windowing or decay."
"Bigger `d` lowers the error a lot."	Depth mainly lowers the failure probability `(1/e)^d`, not the error size.	"Width fights error magnitude; depth fights failure odds."

Second Coding Challenge (3 Languages)¶

Challenge: Conservative-Update Sketch with Error Verification¶

Implement a CMS with both add (plain) and conservativeAdd. Stream a skewed dataset, then for one chosen key print the plain estimate, the conservative estimate, and the true count, asserting both estimates are ≥ true and conservative ≤ plain.

Go¶

package main

import (
    "fmt"
    "hash/fnv"
)

type S struct{ d, w int; t [][]uint64 }

func NewS(d, w int) *S {
    t := make([][]uint64, d)
    for i := range t { t[i] = make([]uint64, w) }
    return &S{d, w, t}
}
func (s *S) cols(k string) []int {
    a := fnv.New64a(); a.Write([]byte(k)); x := a.Sum64()
    b := fnv.New64(); b.Write([]byte(k)); y := b.Sum64() | 1
    c := make([]int, s.d)
    for i := range c { c[i] = int((x + uint64(i)*y) % uint64(s.w)) }
    return c
}
func (s *S) est(c []int) uint64 {
    m := ^uint64(0)
    for i, col := range c { if s.t[i][col] < m { m = s.t[i][col] } }
    return m
}
func (s *S) Add(k string)            { for i, c := range s.cols(k) { s.t[i][c]++ } }
func (s *S) ConsAdd(k string) {
    c := s.cols(k); tgt := s.est(c) + 1
    for i, col := range c { if s.t[i][col] < tgt { s.t[i][col] = tgt } }
}
func (s *S) Est(k string) uint64 { return s.est(s.cols(k)) }

func main() {
    plain, cons := NewS(4, 64), NewS(4, 64)
    stream := []string{}
    for i := 0; i < 50; i++ { stream = append(stream, "hot") }   // elephant
    for i := 0; i < 5; i++ { stream = append(stream, fmt.Sprintf("m%d", i)) } // mice
    stream = append(stream, "mouse")
    truth := map[string]int{}
    for _, k := range stream { plain.Add(k); cons.ConsAdd(k); truth[k]++ }
    fmt.Printf("mouse: true=%d plain=%d cons=%d\n", truth["mouse"], plain.Est("mouse"), cons.Est("mouse"))
    // conservative should be <= plain and both >= true(=1)
}

Java¶

import java.util.*;
import java.util.zip.CRC32;

public class Conservative {
    int d, w; long[][] t;
    Conservative(int d, int w){ this.d=d; this.w=w; this.t=new long[d][w]; }
    int[] cols(String k){
        long a = Integer.toUnsignedLong(k.hashCode());
        CRC32 c = new CRC32(); c.update(k.getBytes());
        long b = c.getValue() | 1L;
        int[] cs = new int[d];
        for(int i=0;i<d;i++) cs[i]=(int)(((a+(long)i*b)%w+w)%w);
        return cs;
    }
    long est(int[] cs){ long m=Long.MAX_VALUE; for(int i=0;i<d;i++) m=Math.min(m,t[i][cs[i]]); return m; }
    void add(String k){ int[] cs=cols(k); for(int i=0;i<d;i++) t[i][cs[i]]++; }
    void consAdd(String k){ int[] cs=cols(k); long tg=est(cs)+1; for(int i=0;i<d;i++) if(t[i][cs[i]]<tg) t[i][cs[i]]=tg; }
    long estimate(String k){ return est(cols(k)); }

    public static void main(String[] a){
        Conservative plain=new Conservative(4,64), cons=new Conservative(4,64);
        List<String> stream=new ArrayList<>();
        for(int i=0;i<50;i++) stream.add("hot");
        for(int i=0;i<5;i++) stream.add("m"+i);
        stream.add("mouse");
        Map<String,Integer> truth=new HashMap<>();
        for(String k:stream){ plain.add(k); cons.consAdd(k); truth.merge(k,1,Integer::sum); }
        System.out.printf("mouse: true=%d plain=%d cons=%d%n",
            truth.get("mouse"), plain.estimate("mouse"), cons.estimate("mouse"));
    }
}

Python¶

import hashlib


class S:
    def __init__(self, d, w):
        self.d, self.w = d, w
        self.t = [[0] * w for _ in range(d)]

    def _cols(self, k):
        h = hashlib.md5(k.encode()).digest()
        a = int.from_bytes(h[:8], "big")
        b = int.from_bytes(h[8:], "big") | 1
        return [(a + i * b) % self.w for i in range(self.d)]

    def _est(self, cols):
        return min(self.t[i][cols[i]] for i in range(self.d))

    def add(self, k):
        for i, c in enumerate(self._cols(k)):
            self.t[i][c] += 1

    def cons_add(self, k):
        cols = self._cols(k)
        tgt = self._est(cols) + 1
        for i, c in enumerate(cols):
            if self.t[i][c] < tgt:
                self.t[i][c] = tgt

    def est(self, k):
        return self._est(self._cols(k))


if __name__ == "__main__":
    plain, cons = S(4, 64), S(4, 64)
    stream = ["hot"] * 50 + [f"m{i}" for i in range(5)] + ["mouse"]
    truth = {}
    for k in stream:
        plain.add(k)
        cons.cons_add(k)
        truth[k] = truth.get(k, 0) + 1
    t = truth["mouse"]
    p, c = plain.est("mouse"), cons.est("mouse")
    print(f"mouse: true={t} plain={p} cons={c}")
    assert c <= p and p >= t and c >= t, "invariants violated"

Expected behavior: mouse has true count 1; both estimates are ≥ 1; the conservative estimate is ≤ the plain estimate (often strictly less when mouse collides with the hot elephant). This is the empirical demonstration of the sandwich true ≤ conservative ≤ plain proved in professional.md.

Final Tips¶

Lead with the min and the overcount-only invariant — it's the one idea everything else hangs on.
Be ready to size the grid on the spot: w = e/eps, d = ln(1/delta), and explain why width fights error size and depth fights failure odds.
Know the proof skeleton: Markov on one row (1/e) × d independent rows ((1/e)^d ≤ delta).
Always name the direction of error when comparing: CMS over, Misra-Gries under, Bloom false-positive, HLL distinct-count.
Mention conservative update and CMS+heap as the two upgrades that turn the toy into a production tool.

Count-Min Sketch — Interview Preparation¶

Quick-Reference Cheat Sheet¶

Junior Questions¶

Middle Questions¶

Senior Questions¶

Professional Questions¶

Coding Challenge (3 Languages)¶

Challenge: Top-K Frequent Words via Count-Min Sketch + Heap¶

Go¶

Java¶

Python¶

Worked Whiteboard Example (Talk Through This Out Loud)¶

Common Interview Pitfalls to Avoid¶

Second Coding Challenge (3 Languages)¶

Challenge: Conservative-Update Sketch with Error Verification¶

Go¶

Java¶

Python¶

Final Tips¶

Next step: tasks.md — graded practice tasks (beginner, intermediate, advanced) in Go, Java, and Python.¶