Online Scheduling and Load Balancing — Senior Level¶
Prerequisites¶
- Middle Level — Graham's list scheduling (
2 − 1/m-competitive), LPT, the greedy load-balancing lower bound, and online bin packing First-Fit / Next-Fit / First-Fit-Decreasing with their1.7·OPT + O(1)and2·OPTbounds - Competitive Analysis — Senior — the adversary hierarchy, the potential method, and Yao's principle as von Neumann minimax (the lower-bound engine reused here)
Table of Contents¶
- What Senior-Level Scheduling Theory Is About
- Online Makespan, Refined: Beating 2 on Identical Machines
- The 1.9201 Lower Bound: Construction Sketch
- Randomized Makespan
- Related and Unrelated Machines: Θ(log m)
- Worked Proof: Greedy Is O(log m) on Unrelated Machines
- Flow Time and the Necessity of Resource Augmentation
- Speed Augmentation and Scalable Algorithms
- Online Bin Packing, Advanced: Harmonic to 1.578
- Vector Bin Packing and VM Placement
- Connections: k-Server, MTS, and Rent-or-Buy
- Learning-Augmented Scheduling
- Decision Framework
- Research Pointers
- Key Takeaways
What Senior-Level Scheduling Theory Is About¶
The middle level settles the easy questions. Graham's greedy list scheduling — assign each arriving job to the least-loaded machine — is (2 − 1/m)-competitive for makespan on m identical machines, and a matching lower bound shows greedy cannot do better. LPT (longest-processing-time first) tightens the offline/sorted case to 4/3 − 1/(3m). Online bin packing sits at First-Fit 1.7·OPT + O(1), Next-Fit 2·OPT, with the easy 4/3 lower bound. Those bounds are correct and, for an interview, sufficient. They are also where the interesting theory begins, not ends.
Senior-level scheduling theory is about four deeper truths, each a different relaxation of the naive worst case:
- Beating Graham's
2 − 1/mrequires deliberate imbalance. The best deterministic online makespan algorithms for identical machines sit at≈ 1.92(Fleischer–Wahl 1.9201, Albers 1.923), strictly below2, and the best lower bound is≈ 1.9201(Rudin). The key structural insight: an algorithm that always balances perfectly (pure greedy) is provably stuck near2; to beat it you must keep some machines underloaded on purpose, hedging against a future burst of large jobs. This is the rare case where the locally greedy move is globally wrong. - Machine heterogeneity costs a
log mfactor. On unrelated machines (jobjtakes timep_{ij}on machinei, with no structure across machines) greedy isΘ(log m)-competitive —O(log m)from above (Aspnes, Azar, Fiat, Plotkin, Waarts 1997) and a matchingΩ(log m)lower bound. No online algorithm escapes the logarithm. - Some objectives admit no constant-competitive online algorithm — until you cheat with resources. Minimizing total/average flow time or weighted completion time has an unbounded competitive ratio online. But give the online scheduler faster machines than OPT — speed augmentation — and SRPT / SETF become
O(1)-competitive (Kalyanasundaram–Pruhs 1995/2000). This resource-augmentation framework, and the notion of a scalable algorithm(1 + ε)-speedO(1)-competitive, is the senior centerpiece. - Bin packing's true constant is now nailed down to a narrow window. The Super-Harmonic / Harmonic++ family pushed the upper bound below
1.59, and Balogh, Békési, Dósa, Epstein, and Levin (2018) reached≈ 1.578with a matching-direction lower bound of≈ 1.5403. Multi-dimensional (vector) bin packing — the abstraction behind VM placement — is genuinely harder.
Underneath sits the same machinery as the rest of ../01-competitive-analysis/senior.md: potentials for upper bounds, Yao and explicit adversary constructions for lower bounds, and the connection to ../04-k-server-problem/senior.md and metrical task systems.
Online Makespan, Refined: Beating 2 on Identical Machines¶
Graham's analysis is exact for pure greedy: the makespan it produces is at most OPT·(2 − 1/m), and an adversary achieves it (release m(m−1) unit jobs to balance everything perfectly, then one job of size m). The natural senior question is: can any deterministic online algorithm beat 2 − 1/m? The answer is yes — but the algorithm must abandon perfect balancing.
Why pure greedy is stuck near 2¶
Greedy's failure mode is that it leaves no slack. After greedy has balanced m machines to load L, a single huge job of size ≈ L must pile onto some machine, producing makespan ≈ 2L, while OPT — knowing the huge job was coming — would have reserved a lightly loaded machine for it. Greedy's commitment to immediate balance is exactly what the adversary punishes.
The imbalance idea¶
The breakthrough algorithms (Bartal, Fiat, Karloff, Vohra 1992, the first sub-2 algorithm at 2 − 1/70; then Karger–Phillips–Torng, Albers, Fleischer–Wahl) deliberately maintain an uneven load profile: they keep a designated fraction of machines below average so that a future large job has a cheap home. Concretely, an algorithm assigns an arriving job to a machine only if doing so keeps the machine's load below α · (current average or lower-bound estimate of OPT); otherwise it "wastes" the more-loaded machines and uses a reserved light one. The competitive ratio is tuned by choosing the imbalance threshold α.
Theorem (Fleischer–Wahl 2000). There is a deterministic online algorithm for makespan on
midentical machines that is≈ 1.9201-competitive asm → ∞.Theorem (Albers 1999). There is a deterministic online algorithm that is
1.923-competitive for allm, together with an improved lower bound (Albers' analysis brought both ends closer).
The precise constant 1.9201 of Fleischer–Wahl matches the lower bound of ≈ 1.9201 (the bound usually attributed to Rudin's thesis, Improved Bounds for the Online Scheduling Problem, 2001, sharpening Bartal–Karloff–Rabani and Albers). So for deterministic identical-machine makespan, the competitive ratio is pinned to the interval [1.9201, 1.9201] up to the precision of these constants — the problem is, for practical purposes, solved.
The senior takeaway is structural, not numeric: the only way past 2 is to forgo greedy's instantaneous balance and bank capacity for an adversarial large job. This is the canonical example where "always do the locally optimal load-balancing move" is provably suboptimal online.
The 1.9201 Lower Bound: Construction Sketch¶
The lower bound for deterministic identical-machine makespan is an explicit adversary, in the spirit of the constructions in ../01-competitive-analysis/senior.md. We sketch its skeleton for m → ∞; the full optimization yields ≈ 1.9201.
The adversary's weapon: stop early if the algorithm is balanced¶
The adversary releases jobs in rounds of geometrically increasing size, and watches the algorithm's load vector. The principle:
- The adversary feeds many small jobs first. By the time it has released enough small work, the algorithm has committed each piece to some machine.
- At a moment of the adversary's choosing, it stops and the makespan is the current maximum machine load. If the algorithm balanced perfectly, the adversary continues with a batch of larger jobs; if the algorithm hedged (kept some machine light), the adversary stops now and the algorithm's wasted, over-loaded machines hurt its makespan relative to a balanced OPT.
The algorithm faces a dilemma the adversary engineered: balance now and be punished by a future large job, or hedge now and be punished if the adversary stops immediately. Either response is suboptimal against the right stopping time, and the adversary picks the worse one for the algorithm.
The optimization¶
Formally, let the adversary release jobs in phases. In phase t, it releases jobs of size s_t (geometric, s_t = γ^t) until the algorithm's load configuration reaches a target. After each phase the adversary computes two quantities:
- the online makespan
ALG_tif it stops here (max load over the algorithm's machines), and - the optimal makespan
OPT_tfor the jobs released so far (a balanced offline assignment of all phases up tot).
The adversary stops at the phase t* maximizing ALG_{t*} / OPT_{t*}. The construction is arranged so that whatever online assignment policy is used, some phase exhibits a ratio approaching the target. Setting up the recurrence between consecutive phases' loads and solving the resulting optimization (a small LP / fixed-point in the geometric ratio γ and the per-phase job counts) yields the value:
The earliest such bound (Faigle–Kern–Turán) gave 1 + 1/√2 ≈ 1.707 for small m; Bartal–Karloff–Rabani pushed it to 1.837; Albers and then Rudin's optimization reached 1.9201. The matching upper bound (Fleischer–Wahl) is what closes the problem. The technique — a geometric-phase adversary with an optimal stopping time chosen to maximize the realized ratio — is the standard tool for online-scheduling lower bounds and recurs in the flow-time impossibility below.
Randomized Makespan¶
Against an oblivious adversary, randomization helps for makespan, though far less dramatically than it does for paging (k → H_k). The randomization lets the algorithm hide which machine it kept light, denying the adversary the precise stopping time it needs.
- For
m = 2machines, the optimal randomized competitive ratio is4/3(Bartal–Fiat–Karloff–Vohra), versus the deterministic3/2 = 2 − 1/2. A randomized lower bound for generalmofe/(e−1) ≈ 1.5819holds for randomized makespan (Chen–van Vliet–Woeginger; Sgall) — note thee/(e−1)constant, the same signature value from randomized ski rental and online matching. - For large
m, the best randomized upper bounds (e.g. Albers' randomized algorithm) sit below the deterministic1.9201but above thee/(e−1) ≈ 1.582lower bound; the exact randomized constant for generalmremains open, a rare gap in an otherwise tightly characterized problem.
The senior framing matches the adversary-hierarchy prediction: randomization buys an order-of-magnitude win only when the deterministic ratio is large (paging, k); for makespan, where the deterministic ratio is already a small constant, randomization shaves the constant from ≈ 1.92 toward ≈ 1.58 and no further.
Related and Unrelated Machines: Θ(log m)¶
Identical machines are the easy case. Heterogeneity changes the complexity class of the problem.
The three machine models¶
| Model | Job j on machine i takes | Structure |
|---|---|---|
| Identical | p_j (same everywhere) | none — load = sum of sizes |
| Related (uniform) | p_j / s_i (machine i has speed s_i) | one speed scalar per machine |
| Unrelated | p_{ij} (arbitrary matrix) | no cross-machine relation |
| Restricted assignment | p_j if i ∈ S_j, else ∞ | each job has an allowed set of machines |
Restricted assignment is the special case of unrelated where each entry is either the job's "real" size or infinity — and it is already hard.
The results¶
Theorem (Aspnes, Azar, Fiat, Plotkin, Waarts 1997). For online makespan on unrelated machines, the greedy "assign to the machine of least resulting load" algorithm is
O(log m)-competitive, and a smarter exponential-potential algorithm achievesO(log m). No online algorithm is better thanΩ(log m)-competitive.Restricted assignment is Θ(log m). Azar, Naor, and Rom showed greedy is
Θ(log m)for the restricted-assignment special case — the lower bound already bites when entries are just{p_j, ∞}.Related machines are O(1). With a single speed per machine, the structure is enough: there are deterministic
O(1)-competitive algorithms (Aspnes et al. give a constant; later work tightened it). The logarithm is a price specifically of unstructured heterogeneity.
The algorithmic idea behind the O(log m) upper bound is the exponential (potential / multiplicative-weights-flavored) assignment rule: instead of minimizing the new max load, assign job j to the machine minimizing the increase in Σ_i a^{load_i} for a base a ≈ 1 + 1/something. This convex potential penalizes piling onto an already-loaded machine super-linearly, smoothing the load and capping the ratio at O(log m). The same exponential-potential trick underlies online routing and virtual-circuit assignment (Aspnes et al.'s actual motivation), and the assignment-LP rounding view (assign fractionally by an LP, round) gives an alternative route to the same O(log m).
Worked Proof: Greedy Is O(log m) on Unrelated Machines¶
We prove the upper-bound half via the exponential potential — the cleanest senior-level argument and the template for the whole AAFPW line.
Setup¶
Jobs arrive online; job j on machine i adds load p_{ij}. Let L_i(t) be the load of machine i after the first t jobs. Normalize so that the optimal makespan satisfies OPT ≤ 1 (we can guess and doubling-scale on a binary-search estimate of OPT, losing only a constant factor; assume the guess is correct). Define the potential after t jobs, for a base a > 1 to be fixed:
Φ(0) = 0. The assignment rule: place job j on the machine i minimizing the increase a^{L_i + p_{ij}} − a^{L_i} in Φ.
The invariant¶
Claim. With base
a = 2(any constant> 1works with a different constant in the bound), the greedy-by-potential rule keepsΦ(t) ≤ (a − 1)·m·(something)such that no machine's load exceedsO(log m)·OPT.
Charging against OPT. Fix the optimal schedule OPT, which assigns job j to some machine σ(j) with p_{σ(j),j} ≤ OPT ≤ 1, and whose per-machine loads are all ≤ OPT ≤ 1. When our algorithm places job j, by the greedy choice its increase in Φ is at most the increase OPT would incur by placing j on σ(j):
Using p_{σ(j),j} ≤ 1 and the convexity bound a^{p} − 1 ≤ p·(a − 1) for p ∈ [0,1]:
Summing¶
Sum over all jobs j. Group the right-hand side by the machine i = σ(j) that OPT used. Because OPT's total load on machine i is Σ_{j: σ(j)=i} p_{ij} ≤ OPT ≤ 1, and the factor a^{L_{σ(j)}} is bounded by a^{ALG's load on i at that time} ≤ a^{(final load on i)}, a careful accounting (the standard AAFPW summation) gives:
Φ(final) = Σ_j ΔΦ_alg(j) ≤ (a − 1) · Σ_i a^{L_i^{final}} · (OPT load on i)
≤ (a − 1) · Φ(final) + (lower-order),
which — solving for Φ(final) with a chosen so that a − 1 < 1 absorbs the self-reference — bounds Φ(final) = O(m). Since Φ(final) ≥ a^{max_i L_i} − 1, we get
The makespan is O(log m)·OPT. The exponential potential is doing the essential work: it makes piling onto a loaded machine exponentially expensive, so the greedy-by-potential rule self-limits the maximum load to a logarithm. The matching Ω(log m) lower bound (AAFPW) is an adversary that forces any online algorithm to spread a structured set of jobs across machines such that some machine accrues Ω(log m) times OPT — restricted-assignment instances suffice.
Flow Time and the Necessity of Resource Augmentation¶
Makespan rewards balance; flow time rewards speed of completion. The flow (or response) time of a job is completion − release; we minimize total flow time Σ_j (C_j − r_j), or its average. This objective is brutal online.
Theorem (impossibility). No deterministic or randomized online algorithm for total flow time on a single machine — or on parallel machines — is
O(1)-competitive. The competitive ratio is unbounded (grows withnor the size ratios); on multiple machines evenSRPTisΩ(log(P))or worse, wherePis the max/min job-size ratio.
The intuition: a single long job, if scheduled at the wrong moment, blocks a flood of tiny jobs that each accrue large flow time while waiting; the offline optimum, knowing the flood is coming, finishes the long job earlier or defers it. The online algorithm cannot win this race in the worst case — there is genuinely no constant-competitive online algorithm. This is the senior insight that forces a change of framework: for flow time, competitive analysis against an equal-resource OPT is the wrong question. The fix is resource augmentation.
Speed Augmentation and Scalable Algorithms¶
Resource augmentation for scheduling, introduced by Kalyanasundaram and Pruhs ("Speed is as Powerful as Clairvoyance," FOCS 1995 / JACM 2000), gives the online algorithm machines that run faster than OPT's, then asks for a competitive ratio.
Definition (speed augmentation). An online algorithm is
s-speedc-competitive if, when its machines run at speeds ≥ 1while OPT's run at speed1, its objective is at mostc · OPTfor every instance.
The headline result rescues flow time:
Theorem (Kalyanasundaram–Pruhs 2000). SRPT (shortest-remaining-processing-time) and SETF (shortest-elapsed-time-first, the non-clairvoyant analogue) are
(1 + ε)-speedO(1/ε)-competitive for total flow time. With even a tiny speed advantageε, the unbounded competitive ratio collapses to a constant.
The slogan "speed is as powerful as clairvoyance" is exact: SETF, which does not know job sizes, achieves with (1 + ε) speed essentially what the clairvoyant SRPT achieves with no augmentation — the speed makes up for not seeing the future.
Scalable algorithms¶
The gold standard the framework defines:
Definition (scalable). An algorithm is scalable if for every
ε > 0it is(1 + ε)-speedO(f(ε))-competitive — arbitrarily close to OPT's resources buys anO(1)ratio (with the constant possibly blowing up asε → 0).
Scalable is the "online-optimal" badge for objectives with no constant-competitive algorithm: it says the only obstruction is the resource gap, and any positive resource advantage closes it. Major scalable results:
- SRPT / SETF — scalable for total flow time (Kalyanasundaram–Pruhs).
- Weighted flow time — scalable algorithms via the
k-th power / highest-density-first family and potential-function analyses (Bansal–Pruhs; Bansal–Dhamdhere). Weighted flow time has noO(1)-competitive online algorithm either, and is rescued only by augmentation. ℓ_pnorms of flow time / stretch — scalable via amortized-potential arguments (Bansal–Pruhs, "Server Scheduling in theℓ_pNorm").- Online primal-dual (Anand–Garg–Kumar; Devanur–Huang; Im–Kulkarni–Munagala) gave a unified recipe: write the scheduling LP, run the multiplicative/primal-dual update, and the dual feasibility certifies a
(1 + ε)-speedO(1)-competitive bound. This is the modern engine for scalability proofs.
The proof shape (SRPT flow time, augmented)¶
The standard argument is a local-competitiveness / potential proof. Define a potential Φ(t) measuring, at each instant, how far the algorithm's set of unfinished jobs (its remaining-work profile) lags OPT's. The instantaneous flow-time rate is the number of unfinished jobs n_{ALG}(t); we want ∫ n_{ALG} dt ≤ O(1)·∫ n_{OPT} dt. The running condition to establish is, at every instant,
which telescopes (integrates) to ∫ n_{ALG} ≤ c·∫ n_{OPT} + Φ(0) − Φ(∞). The speed advantage s = 1 + ε enters precisely here: because the algorithm processes work (1 + ε) times faster, whenever it is "behind" OPT in remaining work, its higher throughput drives dΦ/dt sufficiently negative to absorb the n_{ALG} term, making the running inequality hold with c = O(1/ε). Without the speed boost (ε = 0) the potential cannot pay for n_{ALG}, and indeed no constant c exists — the impossibility result. The ε of extra speed is exactly the budget that makes the potential argument close. This is the same ALG_move + ΔΦ ≤ c·OPT_move skeleton as the potential method for competitive proofs, specialized to a continuous-time flow-time integral.
Online Bin Packing, Advanced: Harmonic to 1.578¶
Bin packing: items of size in (0, 1] arrive online; pack each, on arrival, into a unit-capacity bin (open a new one if needed); minimize the number of bins. The middle level covers First-Fit (1.7·OPT + O(1)) and the 4/3 lower bound. Senior bin packing is the decades-long race to the optimal asymptotic competitive ratio.
The Harmonic family¶
The structural problem with First-Fit is that two items of size 0.51 waste nearly half a bin each and can never share. The Harmonic algorithm (Lee–Lee 1985) fixes this by classifying items by size class and packing each class into its own dedicated bins:
- Partition
(0, 1]into intervals(1/(k+1), 1/k]fork = 1, …, M−1and a tail(0, 1/M]. - An item of class
kgoes only into bins reserved for classk, each of which holds exactlyksuch items.
This bounded-space algorithm (only O(M) open bins at once) is ≈ 1.691-competitive as M → ∞ — the value Π_∞ ≈ 1.6910 is provably the best possible for any bounded-space online algorithm (Lee–Lee). To beat 1.691 you must use unbounded space (keep arbitrarily many bins open) and combine size classes cleverly.
Super-Harmonic and Harmonic++¶
The improvements pair up complementary classes — e.g. a large item near 2/3 with a small item near 1/3 — so the wasted space in one class is filled by another. This is the Super-Harmonic framework (Seiden 2002), which unifies Harmonic, Refined-Harmonic (Lee–Lee, 1.636), Modified-Harmonic, and Harmonic++ (Seiden), the best Super-Harmonic algorithm at ≈ 1.58889. Seiden also proved Super-Harmonic algorithms cannot beat ≈ 1.583, so a fundamentally new idea was needed to go lower.
The modern frontier¶
Theorem (Balogh, Békési, Dósa, Epstein, Levin 2018). There is an online bin-packing algorithm with asymptotic competitive ratio
≈ 1.5783— beating the Super-Harmonic barrier — and no online algorithm has asymptotic ratio below≈ 1.5403.
The 1.5403 lower bound (Balogh–Békési–Dósa–Sgall–van Stee, sharpening the long-standing 1.5401) and the 1.5783 upper bound bracket the true online bin-packing constant into a narrow but still-open window. The senior point: bin packing is not solved the way identical-machine makespan is; the optimal asymptotic ratio is one of the genuinely open constants in online algorithms.
| Algorithm | Asymptotic ratio | Space |
|---|---|---|
| Next-Fit | 2.0 | bounded (1 open bin) |
| First-Fit | 1.7 | unbounded |
| Harmonic (Lee–Lee 1985) | ≈ 1.691 | bounded (optimal for bounded-space) |
| Refined-Harmonic | ≈ 1.636 | unbounded |
| Harmonic++ (Seiden 2002) | ≈ 1.589 | unbounded |
| Balogh et al. 2018 | ≈ 1.578 | unbounded |
| Lower bound (Balogh et al.) | ≈ 1.5403 | — |
Bin covering¶
The dual objective, bin covering — fill bins to at least 1, maximize the number of covered bins — has its own online theory (Csirik–Totik): the best online competitive ratio is 1/2, a sharp contrast to packing, reflecting that covering is the harder direction to do online.
Vector Bin Packing and VM Placement¶
The objective that matters for cloud infrastructure is vector (multi-dimensional) bin packing: each item is a d-vector of resource demands (CPU, memory, disk, bandwidth), each bin a d-vector of capacities, and an item fits only if it fits in every coordinate. This is exactly VM placement / container bin-packing on physical hosts.
The dimensionality is punishing:
- Offline vector bin packing is APX-hard and the best approximation is
O(log d)(or1 + ε·d-style trade-offs); there is no PTAS ford ≥ 2(unlike the1-D asymptotic PTAS of Fernandez de la Vega–Lueker). - Online, First-Fit-style algorithms are
Θ(d)-competitive — the competitive ratio grows linearly in the number of resource dimensions, and a matchingΩ(d / log d)-type lower bound holds. Each additional constrained resource dimension makes online packing strictly harder.
The practical consequence for a senior engineer designing a scheduler (Kubernetes, Borg, a VM placement service): per-host bin packing across multiple resources is provably hard online, the 1-D 1.578 intuition does not transfer, and real schedulers therefore lean on (a) repacking / migration (turning the online problem partially offline — see the rent-or-buy connection below), (b) dominant-resource heuristics that collapse d dimensions to one, and (c) overcommit with statistical multiplexing rather than worst-case packing. The theory says no online rule will be within a constant factor across many tight resources.
Connections: k-Server, MTS, and Rent-or-Buy¶
Online scheduling does not live alone; it shares machinery with the rest of 25-online-algorithms.
- Metrical task systems and
k-server. Load balancing with migration cost — where reassigning a job to another machine costs a movement penalty — is a metrical task system, the abstraction that generalizes both paging and thek-server problem. The exponential-potential assignment rule used for unrelated machines is a sibling of the work-function and potential arguments ink-server; both penalize concentration via a convex potential. When jobs can be migrated between machines at a cost, the analysis borrows directly from MTS competitive theory. - Rent-or-buy / ski rental in scheduling. Many scheduling sub-decisions are rent-or-buy (ski rental) problems: Should I keep a machine powered on (paying per-unit "rent") or accept the one-time "buy" cost of spinning a new one up? — the core of machine activation / dynamic provisioning and energy-aware scheduling. The deterministic
2-competitive and randomizede/(e−1)-competitive ski-rental strategies transfer directly to deciding when to acquire capacity. Energy-aware speed scaling (Yao–Demers–Shenker; Bansal–Kimbrel–Pruhs) — choosing processor speeds to trade energy against flow time — is itself analyzed with the same speed-augmentation and potential framework as flow-time scheduling, unifying the two threads.
The senior view: scheduling, paging, k-server, and rent-or-buy are dialects of one language — online cost minimization analyzed by potentials (upper bounds) and adversary constructions or Yao (lower bounds). Recognizing a scheduling decision as a disguised ski-rental or MTS instance is what lets you import a tight bound instead of re-deriving one.
Learning-Augmented Scheduling¶
The newest layer — mirroring learning-augmented caching — feeds the scheduler ML predictions and asks for guarantees that interpolate between "near-optimal when predictions are good" (consistency) and "no worse than the classical worst case when they are garbage" (robustness).
- Predicted job sizes for flow time. Purohit, Svitkina, and Kumar ("Improving Online Algorithms via ML Predictions," NeurIPS 2018) give non-clairvoyant scheduling that, with predicted processing times, runs a preferential round-robin / SPJF blend controlled by a confidence parameter
λ: it is(1 + λ)/(2λ)-consistent and(1 + λ)/(1 − λ)-robust for total completion time —λ → 1trusts the predictor (near-optimal SPT-like behavior),λ → 0falls back to robust round-robin. The same paper's ski-rental-with-predictions result is the companion template. - Permutation / order predictions. Lattanzi, Lavastida, Moseley, and Vassilvitskii ("Online Scheduling via Learned Weights," SODA 2020) predict a per-machine weight (a proxy for the right assignment) for restricted-assignment makespan, achieving a competitive ratio of
O(log(1/η))for predictions of qualityη— degrading gracefully fromO(1)(good predictions) toward the worst-caseO(log m)(bad predictions), exactly the consistency/robustness interpolation, here beating the prediction-freeΘ(log m)barrier when predictions are decent. - The general pattern. As in caching, the recipe is a combiner: run the prediction-following policy but cap its damage with a robust classical backbone (round-robin for flow time, greedy/exponential-potential for makespan), tuned by a confidence knob. The predictions let well-behaved instances escape the pessimistic lower bounds (
log mfor unrelated machines, unboundedness for flow time) while the backbone preserves the worst-case safety net.
This is an active frontier; the senior takeaway is that predictions are the principled way to beat the Θ(log m) and flow-time impossibility barriers on realistic inputs without abandoning worst-case guarantees — the scheduling analogue of Belady-with-noise.
Decision Framework¶
| Situation | Reach for |
|---|---|
| Identical machines, makespan, simplicity matters | Greedy list scheduling (2 − 1/m); good enough unless you need sub-2 |
| Identical machines, you must beat 2 | Imbalance algorithm (Fleischer–Wahl 1.9201 / Albers 1.923) — deliberately keep machines underloaded |
| Machines have one speed each (related) | O(1)-competitive algorithms exist — heterogeneity with structure is cheap |
| Unrelated machines / restricted assignment | Exponential-potential greedy, Θ(log m) — the logarithm is unavoidable online |
| Minimizing flow / completion time, no resources to spare | Impossible to be O(1)-competitive — change the question |
| Flow time with a little extra hardware budget | Speed augmentation: SRPT/SETF are (1+ε)-speed O(1)-competitive (scalable) |
Weighted / ℓ_p flow time | Scalable via highest-density-first or online primal-dual |
| 1-D bin packing, best ratio | Unbounded-space Harmonic++ / Balogh et al. (≈ 1.578); Harmonic (1.691) if memory is bounded |
| Multi-resource VM / container placement | Vector bin packing — Θ(d) online; lean on migration, dominant-resource, overcommit |
| "Power on a machine vs. spin up a new one" | Rent-or-buy / ski rental (det. 2, rand. e/(e−1)) |
| You have an ML predictor of job sizes / order | Learning-augmented scheduling (Purohit et al.; Lattanzi et al.) — consistency + robustness |
Research Pointers¶
- Graham, R. L. (1966/1969). "Bounds for Certain Multiprocessing Anomalies." Bell System Tech. J. / SIAM J. Appl. Math. List scheduling
2 − 1/m; LPT4/3 − 1/(3m). The origin of the field. - Bartal, Y., Fiat, A., Karloff, H., & Vohra, R. (1992/1995). "New Algorithms for an Ancient Scheduling Problem." STOC / JCSS. First deterministic algorithm beating
2for identical-machine makespan (2 − 1/70), and randomized4/3form = 2. - Albers, S. (1999). "Better Bounds for Online Scheduling." SIAM J. Comput. The
1.923-competitive algorithm and improved lower bound for identical machines. - Fleischer, R., & Wahl, M. (2000). "Online Scheduling Revisited." J. Scheduling. The
≈ 1.9201-competitive algorithm matching the lower bound. - Rudin, J. F. (2001). Improved Bounds for the Online Scheduling Problem (PhD thesis, UT Dallas). The
≈ 1.9201lower bound for deterministic identical-machine makespan. - Aspnes, J., Azar, Y., Fiat, A., Plotkin, S., & Waarts, O. (1997). "On-line Routing of Virtual Circuits with Applications to Load Balancing and Machine Scheduling." JACM 44(3). Greedy
O(log m)on unrelated machines and theΩ(log m)lower bound; the exponential potential. - Kalyanasundaram, B., & Pruhs, K. (1995/2000). "Speed is as Powerful as Clairvoyance." FOCS / JACM 47(4). The speed-augmentation framework; SRPT/SETF
(1+ε)-speedO(1)-competitive for flow time. - Bansal, N., & Pruhs, K. (2003/2010). "Server Scheduling in the
ℓ_pNorm" / "Speed Scaling..." Scalable algorithms for weighted andℓ_p-norm flow time via potentials. - Anand, S., Garg, N., & Kumar, A. (2012). "Resource Augmentation for Weighted Flow-Time Explained by Dual Fitting." SODA. The online primal-dual / dual-fitting recipe for scalability.
- Lee, C. C., & Lee, D. T. (1985). "A Simple On-line Bin-Packing Algorithm." JACM 32(3). Harmonic; the
1.691bounded-space optimum. - Seiden, S. S. (2002). "On the Online Bin Packing Problem." JACM 49(5). The Super-Harmonic framework and Harmonic++ (
≈ 1.589). - Balogh, J., Békési, J., Dósa, G., Epstein, L., & Levin, A. (2018/2019). "A New and Improved Algorithm for Online Bin Packing" / lower bound. ESA / SIAM J. Comput. Upper bound
≈ 1.578; lower bound≈ 1.5403. - Purohit, M., Svitkina, Z., & Kumar, R. (2018). "Improving Online Algorithms via ML Predictions." NeurIPS. Non-clairvoyant scheduling and ski rental with predicted sizes; consistency/robustness.
- Lattanzi, S., Lavastida, T., Moseley, B., & Vassilvitskii, S. (2020). "Online Scheduling via Learned Weights." SODA. Restricted-assignment makespan with per-machine weight predictions.
- Borodin, A., & El-Yaniv, R. (1998). Online Computation and Competitive Analysis. Cambridge Univ. Press. The definitive reference for the competitive machinery underlying all of the above.
Key Takeaways¶
- Beating Graham's
2 − 1/mrequires deliberate imbalance. The deterministic identical-machine makespan ratio is≈ 1.9201— achieved by Fleischer–Wahl (1.9201) and Albers (1.923), matched by Rudin's lower bound. Pure greedy is stuck near2because it leaves no slack for a future large job; the sub-2 algorithms reserve underloaded machines on purpose. - The lower bound is a geometric-phase adversary with an optimal stopping time. It feeds geometrically growing jobs and stops at the phase maximizing
ALG/OPT, forcing any online algorithm into a balance-vs-hedge dilemma that yields≈ 1.9201. Randomized makespan improves the constant towarde/(e−1) ≈ 1.582(tight at4/3form = 2) but not below. - Unrelated machines cost a logarithm. Greedy via the exponential potential is
O(log m)-competitive (Aspnes–Azar–Fiat–Plotkin–Waarts 1997), matched byΩ(log m); restricted assignment is alreadyΘ(log m), while related (single-speed) machines stayO(1). The convex potential makes concentration exponentially expensive, self-limiting the max load. - Flow time has no
O(1)-competitive online algorithm — resource augmentation rescues it. Kalyanasundaram–Pruhs (1995): "speed is as powerful as clairvoyance" — SRPT/SETF are(1+ε)-speedO(1/ε)-competitive, and theεof extra speed is exactly the budget that makes the potential argumentn_{ALG} + dΦ/dt ≤ c·n_{OPT}close. Scalable =(1+ε)-speedO(1)-competitive is the online-optimality badge; online primal-dual is the modern proof engine. - Online bin packing's optimal constant is open, in
[1.5403, 1.578]. Harmonic (1.691) is the bounded-space optimum; Super-Harmonic / Harmonic++ (1.589) and Balogh–Békési–Dósa–Epstein–Levin (1.578) are the unbounded-space frontier. Vector bin packing (VM placement) isΘ(d)-competitive — each resource dimension makes online packing strictly harder. - Scheduling shares machinery with
k-server, MTS, and rent-or-buy. Migration-cost load balancing is an MTS; machine activation and power-on decisions are ski rental (det.2, rand.e/(e−1)). See../04-k-server-problem/senior.md. - Learning-augmented scheduling beats the impossibility barriers on realistic inputs. Predicted job sizes (Purohit et al. 2018) and permutation/weight predictions (Lattanzi et al. 2020) give consistency/robustness trade-offs that escape
Θ(log m)and flow-time unboundedness when predictions are good, with a robust classical backbone as the safety net.
See also:
./middle.mdfor list scheduling, LPT, and basic bin packing ·./professional.mdfor production scheduler engineering ·../01-competitive-analysis/senior.mdfor potentials, Yao, and the adversary hierarchy ·../03-paging-and-caching-theory/senior.mdfor the learning-augmented template ·../04-k-server-problem/senior.mdfor MTS and the general metric problem
In this topic
- junior
- middle
- senior
- professional