Approximation and Hardness — Interview Questions¶
Table of Contents¶
- Conceptual Questions
- Applied: Proving Approximation Ratios
- Gotcha / Trick Questions
- Rapid-Fire Q&A
- Common Mistakes
- Tips & Summary
Conceptual Questions¶
Q1: Define an α-approximation algorithm. Mind the min/max direction. (Medium — the #1 slip)¶
Answer: An α-approximation algorithm runs in polynomial time and returns a solution provably within a factor α of the optimum OPT, on every input. The inequality direction flips with the objective:
- Minimization (e.g., Vertex Cover, Set Cover, TSP):
ALG ≤ α · OPT, withα ≥ 1. Smallerαis better;α = 1is exact. - Maximization (e.g., MAX-CUT, MAX-3SAT):
ALG ≥ α · OPT, withα ≤ 1. Largerαis better;α = 1is exact.
The classic slip is writing ALG ≤ α · OPT for a maximization problem — that bound is trivially true and meaningless. Always ask "which side am I bounding?" The guarantee is worst-case: it holds for all inputs, not on average. (Some authors normalize maximization ratios to ≥ 1 too — state your convention.)
Q2: What are PTAS and FPTAS, and how do they differ? (Medium)¶
Answer: Both are families of algorithms parameterized by an accuracy ε > 0, producing a (1 + ε)-approximation (min) or (1 − ε)-approximation (max).
- PTAS (Polynomial-Time Approximation Scheme): for each fixed
ε, runs in time polynomial inn. The catch —εmay appear in the exponent, e.g.O(n^{1/ε}). So accuracy gets arbitrarily good but the cost explodes asε → 0. - FPTAS (Fully Polynomial-Time Approximation Scheme): runs in time polynomial in both
nand1/ε, e.g.O(n³/ε). This is the strongest practical guarantee short of exact poly time.
Difference in one line: PTAS allows 1/ε in the exponent; FPTAS forbids it (it must appear polynomially). Knapsack has an FPTAS; many geometric problems have only a PTAS.
Q3: Order the approximation classes FPTAS, PTAS, APX. (Medium)¶
Answer: For minimization problems with bounded objective,
- FPTAS: approximable to any
1 + εwith cost poly innand1/ε. - PTAS: approximable to any
1 + ε, cost poly innper fixedε. - APX: approximable to some constant factor
α > 1, but not necessarily to every1 + ε.
The inclusions are strict unless P = NP. Vertex Cover ∈ APX (2-approx) but ∉ PTAS (it's APX-hard). Knapsack ∈ FPTAS. Set Cover ∉ APX (its best ratio is ln n, which grows with n). See ./middle.md for the full hierarchy.
Q4: What does "APX-hard" / "no PTAS unless P=NP" mean? (Hard)¶
Answer: A problem is APX-hard if every APX problem reduces to it via an approximation-preserving reduction (a PTAS reduction / L-reduction). The key consequence: an APX-hard problem has no PTAS unless P = NP. Intuitively, there is a constant threshold below which approximation becomes NP-hard — you can get within some fixed α, but driving the ratio to 1 + ε for arbitrarily small ε is as hard as solving the problem exactly.
So "Vertex Cover is APX-hard" means: a 2-approximation exists, but no scheme can do 1.0001-approximation in poly time (unless P = NP). APX-hardness is the approximation analogue of NP-hardness — it pins a hardness of approximation floor. See ../07-reductions-and-np-completeness/interview.md for the reduction machinery it builds on.
Q5: How do you prove a LOWER bound on approximability? Sketch the gap-reduction idea. (Hard)¶
Answer: You prove inapproximability with a gap-producing (gap) reduction. The idea: reduce an NP-hard decision problem to your optimization problem so that
- "yes" instances map to instances with
OPT ≥ c, and - "no" instances map to instances with
OPT ≤ s, withs < c.
If a poly-time α-approximation with α < c/s existed, it could distinguish the two cases — its output would fall in different ranges — thereby deciding the original NP-hard problem in poly time. Contradiction (unless P = NP). The ratio c/s is the gap, and it becomes the inapproximability factor.
The deep engine that manufactures such gaps from scratch is the PCP theorem (next question). Without PCP, most constant-factor lower bounds are simply unknown.
Q6: State the PCP theorem and why it matters for approximation. (Hard)¶
Answer: The PCP theorem (Arora–Safra, Arora–Lund–Motwani–Sudan–Szegedy, 1992) states:
NP = PCP(O(log n), O(1))— every NP problem has a proof that a verifier can check by reading only a constant number of randomly chosen bits, usingO(log n)random bits, accepting correct proofs always and incorrect proofs with probability< 1/2.
Why it matters: PCP is equivalent to the existence of a constant gap for MAX-3SAT — it is NP-hard to distinguish formulas where all clauses are satisfiable from those where only a (1 − δ) fraction is. This single gap, propagated through approximation-preserving reductions, yields hardness-of-approximation results across the board (Set Cover ln n, MAX-3SAT 7/8, etc.). PCP is to inapproximability what Cook–Levin is to NP-completeness: the root that bootstraps every gap. See ./senior.md.
Applied: Proving Approximation Ratios¶
Q7: Prove the Vertex Cover 2-approximation. (Medium — classic whiteboard)¶
Answer: Algorithm: find any maximal matching M (greedily pick edges with no shared endpoint until none remain); output all endpoints of M.
It's a valid cover: if some edge e were uncovered, neither endpoint is in M's endpoints, so e shares no vertex with any matched edge — we could add e to M, contradicting maximality. So every edge is covered.
The ratio is 2: the |M| matching edges are vertex-disjoint, and any vertex cover must include at least one endpoint of each — so OPT ≥ |M|. Our cover uses exactly 2|M| vertices. Therefore
It's a clean 2-approximation, O(E) time. Note the proof never computes OPT — it lower-bounds it via the matching. That lower-bounding move is the heart of nearly every approximation proof.
Q8: Why is greedy Set Cover an H_n ≈ ln n-approximation? (Hard)¶
Answer: Greedy: repeatedly pick the set covering the most still-uncovered elements, until all n are covered.
Charging argument: when an element is first covered by a chosen set covering k new elements, charge it cost 1/k. The total cost equals the number of sets chosen. Consider the elements in the order greedy covers them. At the step where r elements remain uncovered, the optimal solution covers all r with OPT sets, so some available set covers ≥ r/OPT of them — greedy picks one at least that good, charging each newly covered element ≤ OPT/r. Summing over all n elements in reverse coverage order gives
So greedy is an H_n-approximation. Remarkably, this is essentially optimal: it is NP-hard to approximate Set Cover within (1 − o(1)) ln n (Dinur–Steurer 2014, via PCP), so the simple greedy is the best you can hope for asymptotically.
Q9: Give the metric-TSP MST-doubling 2-approximation. (Medium)¶
Answer: Assume the metric TSP (edge weights satisfy the triangle inequality). Algorithm:
- Build a minimum spanning tree
Tof the complete graph. - Double every edge of
Tto get an Eulerian multigraph; take an Euler tour. - Shortcut repeated vertices (skip already-visited ones) to get a Hamiltonian tour. Shortcutting only shrinks length by the triangle inequality.
Ratio: any tour minus one edge is a spanning tree, so MST ≤ OPT. Doubling gives 2·MST, and shortcutting can't increase it:
A 2-approximation. Christofides improves this to 3/2 by adding a minimum-weight perfect matching on the odd-degree vertices of T (instead of doubling all edges), producing a cheaper Eulerian graph. Both require the triangle inequality — see Q13.
Q10: Sketch the Knapsack FPTAS. (Hard)¶
Answer: 0/1 Knapsack has an exact DP that is O(n · V) where V is the total value — pseudo-polynomial, since V can be exponential in the input bit-length. The FPTAS trades a little accuracy for genuine poly time by scaling down the values:
- Let
v_maxbe the largest item value. Choose scaling factorK = ε · v_max / n. - Round each value down:
v'_i = ⌊v_i / K⌋. - Run the exact value-based DP on the small
v'_i. The DP now runs inO(n² / ε)because the scaled total value isO(n / ε).
Why it's (1 − ε): each item loses at most K in value to rounding; the optimal solution has ≤ n items, so the chosen set loses at most n·K = ε·v_max ≤ ε·OPT. Hence ALG ≥ (1 − ε)·OPT, in time poly in n and 1/ε — an FPTAS. (Knapsack is only weakly NP-hard, which is exactly why an FPTAS is possible; strongly NP-hard problems admit no FPTAS unless P = NP.)
Gotcha / Trick Questions¶
Q11: "Is a 2-approximation always within 2× in practice?" (Medium)¶
Answer: The factor 2 is a worst-case guarantee, not the typical result. On most real inputs a "2-approximation" lands far closer to OPT — often within a few percent — because the worst case requires an adversarial structure that rarely occurs. The bound is a promise ("never worse than 2×"), not a prediction ("about 2× on average"). Conversely, the bound is tight: there exist instances forcing the algorithm to nearly 2·OPT (e.g., Vertex Cover on a perfect matching of disjoint edges, where greedy-matching takes all 2|M| vertices but OPT = |M|). Worst-case ratio and empirical performance are different axes — say so.
Q12: "Does an approximation algorithm ever return the exact optimum?" (Easy)¶
Answer: Yes — frequently — but with no guarantee. Nothing stops a 2-approximation from outputting OPT on a given input; the ratio only caps how bad it can get. What it cannot do is certify optimality: you usually can't tell, in poly time, whether a returned solution happens to be optimal (that would often be NP-hard itself). So "approximation" bounds the error ceiling; it never forbids being exactly right. The two notions — being optimal vs knowing you are — are distinct.
Q13: Can we 2-approximate general (non-metric) TSP? (Medium)¶
Answer: No — general TSP has no constant-factor approximation unless P = NP. A poly-time α-approximation for arbitrary edge weights could be used to decide Hamiltonian Cycle: build a graph with weight 1 on existing edges and a huge weight M ≫ α·n on non-edges; a Hamiltonian cycle gives a tour of cost n, while any non-Hamiltonian tour costs ≥ M. An α-approximation would separate the cases, deciding an NP-complete problem. The triangle inequality is precisely what rescues approximability — that's why MST-doubling and Christofides require the metric assumption.
Q14: "Why can't every NP-hard problem be approximated well?" (Hard)¶
Answer: Because approximating some problems is itself NP-hard — proven via PCP / gap reductions. PCP creates a constant gap (e.g., distinguishing fully-satisfiable MAX-3SAT from 7/8-satisfiable is NP-hard), and approximation-preserving reductions spread that gap to other problems. The result is a spectrum of approximability:
- FPTAS (Knapsack) — approximable arbitrarily well.
- APX (Vertex Cover, metric TSP) — constant factor, no PTAS.
- Logarithmic (Set Cover) — best ratio grows as
ln n. - No constant factor / essentially inapproximable (general TSP, MAX-Clique within
n^{1−ε}).
"NP-hard to solve" and "NP-hard to approximate" are separate facts; PCP is what lets us prove the latter.
Q15: MAX-CUT's 0.878 — where does it come from and why might it be optimal? (Hard)¶
Answer: The Goemans–Williamson algorithm solves a semidefinite-programming (SDP) relaxation of MAX-CUT, then rounds by picking a random hyperplane through the origin to split the unit vectors into two sides. Analyzing the expected number of cut edges yields the constant
so E[ALG] ≥ 0.878 · OPT — a 0.878-approximation (maximization, so ≥). Why might it be optimal? Under the Unique Games Conjecture (UGC), it is NP-hard to approximate MAX-CUT within any factor better than α_GW (Khot–Kindler–Mossel–O'Donnell). So if UGC holds, GW is the best possible poly-time algorithm — the SDP rounding exactly matches the hardness threshold. UGC similarly implies Vertex Cover is hard to approximate within 2 − ε, matching the trivial 2-approximation.
Rapid-Fire Q&A¶
| # | Question | Answer |
|---|---|---|
| 1 | α-approx, minimization bound? | ALG ≤ α · OPT, α ≥ 1 |
| 2 | α-approx, maximization bound? | ALG ≥ α · OPT, α ≤ 1 |
| 3 | PTAS vs FPTAS difference? | FPTAS is poly in 1/ε too; PTAS may have 1/ε in the exponent |
| 4 | Class containments? | FPTAS ⊆ PTAS ⊆ APX (strict unless P = NP) |
| 5 | Vertex Cover ratio (matching)? | 2 (all endpoints of a maximal matching) |
| 6 | Why OPT ≥ |M| for VC? | Each matching edge needs a distinct cover vertex |
| 7 | Greedy Set Cover ratio? | H_n ≈ ln n |
| 8 | Metric TSP MST-doubling ratio? | 2 |
| 9 | Christofides ratio? | 3/2 |
| 10 | Knapsack approximation class? | FPTAS, O(n²/ε) via value scaling |
| 11 | MAX-3SAT inapproximability? | NP-hard beyond 7/8 (PCP) |
| 12 | Set Cover lower bound? | (1 − o(1)) ln n, NP-hard |
| 13 | MAX-CUT GW ratio? | ≈ 0.878 (SDP + hyperplane rounding) |
| 14 | UGC ⇒ Vertex Cover hardness? | No 2 − ε approximation |
| 15 | What does PCP give us? | A constant gap ⇒ hardness of approximation |
| 16 | General (non-metric) TSP? | No constant-factor approx unless P = NP |
| 17 | APX-hard means? | No PTAS unless P = NP |
Common Mistakes¶
- Flipping the min/max inequality. Minimization bounds above (
ALG ≤ α·OPT); maximization bounds below (ALG ≥ α·OPT). Writing≤for a max problem gives a vacuous bound. - Treating the ratio as typical, not worst-case. A 2-approximation guarantees
≤ 2·OPTalways; it usually does far better in practice. The bound is a ceiling, not an average. - Confusing PTAS and FPTAS. Only FPTAS is polynomial in
1/ε. Knapsack has an FPTAS; many geometric problems have only a PTAS. - Applying MST-doubling / Christofides to non-metric TSP. Both require the triangle inequality; general TSP has no constant-factor approximation.
- Saying "approximation never finds the optimum." It often does — it just can't guarantee or certify it.
- Forgetting to lower-bound
OPT. Every ratio proof hinges on a poly-computable bound onOPT(matching size, MST weight) — you never computeOPTdirectly. - Assuming all NP-hard problems are equally (in)approximable. They span FPTAS → APX →
ln n→ no-constant; PCP/UGC pin down each floor.
Tips & Summary¶
- Fix the direction first. Before any approximation argument, write down whether it's min (
ALG ≤ α·OPT) or max (ALG ≥ α·OPT). Half of all approximation slips are a flipped inequality. - Memorize the canonical ratios: Vertex Cover 2 (maximal matching), Set Cover
ln n(greedy), metric TSP 2 (MST-doubling) / 3/2 (Christofides), Knapsack FPTAS, MAX-CUT 0.878 (GW), MAX-3SAT 7/8. These are the constants interviewers expect on sight. - Lead every ratio proof by lower-bounding
OPTwith something you can compute — a matching, an MST, an LP/SDP relaxation. That single move is the skeleton of the proof. - Name the hardness engine: PCP manufactures the constant gap; gap/PTAS reductions spread it; UGC sharpens specific thresholds (MAX-CUT 0.878, Vertex Cover
2 − ε). Mentioning PCP and UGC signals depth. - Separate "worst-case ratio" from "empirical performance" explicitly — and "being optimal" from "certifying optimality." Both distinctions catch out weaker candidates.
- Tie approximability to NP-hardness type: weakly NP-hard (Knapsack) → FPTAS possible; strongly NP-hard → no FPTAS. It connects this topic back to ../07-reductions-and-np-completeness/interview.md.
Related: Reductions & NP-Completeness — Interview · Complexity Classes P and NP — Interview · Approximation & Hardness — Middle · Approximation & Hardness — Senior
In this topic
- interview
- tasks