Priority Queue — Middle Level¶
Focus: "Why this implementation?" and "When do I outgrow the default?" — picking, indexing, and tuning a PQ for real workloads.
Table of Contents¶
- Introduction
- Deeper Concepts
- Comparison with Alternatives
- Advanced Patterns
- Graph and Tree Applications
- Dynamic Programming Integration
- Code Examples
- Error Handling
- Performance Analysis
- Best Practices
- Visual Animation
- Summary
Introduction¶
At junior level you used a PQ. At middle level you start picking the PQ — and you find yourself answering the kind of questions an interviewer or a code-reviewer raises:
- "Why are you re-pushing duplicates instead of using
decrease-key?" - "Won't
PriorityQueue.iterator()give me out-of-order results?" - "What happens if two jobs have the same priority? Will the order be stable?"
- "Is your PQ thread-safe under multiple producers?"
- "We have 200 priority classes — wouldn't a radix heap beat your heap?"
The middle-level mental model expands from "the PQ" to "a PQ — one of many — chosen because its op mix wins on this workload."
Deeper Concepts¶
Invariant: highest-priority-first¶
A PQ is correct if and only if pop returns an item with priority no larger (for a min-PQ) than any other currently-stored item:
Notice this says nothing about the order of items below the top. Two PQs can hold the same set of items in different internal arrangements and still both be correct.
Stability via a tiebreaker counter¶
Heaps are not naturally stable. The standard fix is a monotonically increasing counter pushed alongside the priority:
Comparison cascades: first by priority, then by counter. Equal priorities now resolve in insertion order, and the payload never has to be comparable.
The decrease-key paradox¶
Plain heap-backed PQs cannot find an arbitrary item without a linear scan. So decrease-key(item, new_priority) is O(n) — defeating the whole purpose. Two industry-standard answers:
- Indexed PQ: maintain
pos[item] = index_in_heap.decrease-keyisO(log n). Cost:swapmust updateposon every move. - Lazy duplicates: push
(new_priority, item)again, ignore stalepops. Cost: heap can grow toO(E)in Dijkstra. Simpler code; same asymptotic complexity.
Lazy duplicates almost always win in practice for graph algorithms because the constant factor of touching pos is high.
Recurrence: bulk build vs incremental push¶
Both end up with the same data structure, but bulk build is 3–5× faster in practice because it does fewer comparisons on the upper levels and has much better cache behaviour.
Bucket / radix PQ¶
When priorities are bounded integers in [0, k), a PQ of k linked lists ("buckets") gives O(1) push and O(1) pop-min (amortised on a moving cursor). Used in:
- BFS variants where edge weights are 0/1 (
O(1)bucket per layer). - Scheduling with discrete priority classes (Linux CFS uses red-black trees, but earlier O(1) scheduler used buckets).
- Dial's algorithm — Dijkstra with bounded integer weights.
Comparison with Alternatives¶
| Implementation | push | pop-min | peek | decrease-key | meld | Notes |
|---|---|---|---|---|---|---|
| Sorted array | O(n) | O(1) | O(1) | O(n) | O(n) | Great if pushes are rare. |
| Unsorted array | O(1) | O(n) | O(n) | O(n) | O(1) | Bad except for tiny n. |
| BST (balanced) | O(log n) | O(log n) | O(log n) | O(log n) | O(n log n) | Useful when you also need ordered iteration. |
| Binary heap | O(log n) | O(log n) | O(1) | O(log n) (indexed) | O(n) | The default. |
d-ary heap (d=4) | O(log_d n) | O(d · log_d n) | O(1) | O(log_d n) | O(n) | Faster decrease-key, slightly slower pop. |
| Leftist heap | O(log n) | O(log n) | O(1) | O(log n) | O(log n) | Great when meld is hot. |
| Fibonacci heap | O(1) amort. | O(log n) amort. | O(1) | O(1) amort. | O(1) | Asymptotically optimal; bad constants. |
| Pairing heap | O(1) amort. | O(log n) amort. | O(1) | O(log n) amort.* | O(1) | Simple to implement; tight bounds open. |
| Radix / bucket | O(1) | O(1) | O(1) | O(1) | O(k) | Only for bounded integer priorities. |
* Pairing heap's decrease-key is conjectured O(log log n) amortised but not proven.
Advanced Patterns¶
Pattern: K-way merge with a min-PQ of stream cursors¶
For k sorted input streams totalling N items:
import heapq
def k_way_merge(streams):
pq = []
for i, stream in enumerate(streams):
it = iter(stream)
first = next(it, None)
if first is not None:
heapq.heappush(pq, (first, i, it))
while pq:
val, i, it = heapq.heappop(pq)
yield val
nxt = next(it, None)
if nxt is not None:
heapq.heappush(pq, (nxt, i, it))
Complexity: O(N log k). The PQ holds at most k items.
Pattern: Throttled scheduler — PQ keyed by (next_run_at, job)¶
A delayed-job scheduler keeps a heap of (deadline, job) and a worker that sleeps until the head deadline.
Go¶
package main
import (
"container/heap"
"time"
)
type Job struct {
Run func()
DueAt time.Time
index int
}
type SchedQueue []*Job
func (q SchedQueue) Len() int { return len(q) }
func (q SchedQueue) Less(i, j int) bool { return q[i].DueAt.Before(q[j].DueAt) }
func (q SchedQueue) Swap(i, j int) { q[i], q[j] = q[j], q[i]; q[i].index = i; q[j].index = j }
func (q *SchedQueue) Push(x interface{}) { it := x.(*Job); it.index = len(*q); *q = append(*q, it) }
func (q *SchedQueue) Pop() interface{} { n := len(*q); x := (*q)[n-1]; *q = (*q)[:n-1]; return x }
func RunScheduler(q *SchedQueue, stop <-chan struct{}) {
for {
select {
case <-stop:
return
default:
}
if q.Len() == 0 {
time.Sleep(50 * time.Millisecond)
continue
}
head := (*q)[0]
d := time.Until(head.DueAt)
if d > 0 {
time.Sleep(d)
}
heap.Pop(q)
head.Run()
}
}
Java¶
import java.util.*;
import java.util.concurrent.*;
public class Scheduler {
record Job(Runnable run, long dueAtMillis) {}
private final PriorityQueue<Job> q =
new PriorityQueue<>(Comparator.comparingLong(Job::dueAtMillis));
public synchronized void submit(Runnable r, long delayMillis) {
q.add(new Job(r, System.currentTimeMillis() + delayMillis));
notifyAll();
}
public void runLoop() throws InterruptedException {
while (true) {
synchronized (this) {
while (q.isEmpty()) wait();
long wait = q.peek().dueAtMillis() - System.currentTimeMillis();
if (wait > 0) { wait(wait); continue; }
}
Job j;
synchronized (this) { j = q.poll(); }
if (j != null) j.run().run();
}
}
}
Python¶
import heapq, time, threading
class DelayedScheduler:
def __init__(self):
self._pq = []
self._cv = threading.Condition()
self._counter = 0
def submit(self, fn, delay_s):
with self._cv:
self._counter += 1
heapq.heappush(self._pq, (time.monotonic() + delay_s, self._counter, fn))
self._cv.notify()
def run(self):
with self._cv:
while True:
while not self._pq:
self._cv.wait()
due, _, fn = self._pq[0]
wait = due - time.monotonic()
if wait > 0:
self._cv.wait(timeout=wait)
continue
heapq.heappop(self._pq)
self._cv.release()
try:
fn()
finally:
self._cv.acquire()
Pattern: Indexed PQ with explicit decrease-key¶
See ../01-binary-heap/middle.md for the full implementation — the indexed PQ pattern is the same regardless of whether you call the abstraction "binary heap" or "priority queue."
Graph and Tree Applications¶
The same abstraction wears all these hats. That is exactly why "priority queue" is such a useful word — it lets you reason about Dijkstra without committing to a specific heap variant.
Dynamic Programming Integration¶
When DP states have ordered transitions and you want to process the best state next, the natural data structure is a PQ. Examples:
- Cheapest path in a weighted DAG with non-negative weights — Dijkstra is, structurally, a DP on
dist[v]. - K-shortest paths (Yen, Eppstein) — maintains a PQ of candidate paths.
- Optimal binary search tree with early termination — popping the lowest-cost partial tree first.
Python¶
# Minimum-cost path in a DAG keyed by node index.
import heapq
def min_cost(adj, src, dst):
pq = [(0, src)]
dist = {src: 0}
while pq:
d, u = heapq.heappop(pq)
if u == dst:
return d
if d > dist[u]:
continue
for v, w in adj[u]:
nd = d + w
if nd < dist.get(v, float("inf")):
dist[v] = nd
heapq.heappush(pq, (nd, v))
return float("inf")
Notice this is also Dijkstra. PQ-driven DP is a common pattern: a topological-order DP becomes a PQ-driven DP when the order is determined dynamically by the cost so far.
Code Examples¶
Bounded blocking PQ in Java¶
import java.util.concurrent.*;
public class BoundedBlockingPQ<T> {
private final PriorityBlockingQueue<T> q;
private final Semaphore slots;
public BoundedBlockingPQ(int capacity, java.util.Comparator<T> cmp) {
this.q = new PriorityBlockingQueue<>(capacity, cmp);
this.slots = new Semaphore(capacity);
}
public void put(T item) throws InterruptedException {
slots.acquire(); // blocks if full
q.put(item);
}
public T take() throws InterruptedException {
T item = q.take(); // blocks if empty
slots.release();
return item;
}
}
PriorityBlockingQueue is unbounded in the JDK; combining it with a semaphore gives bounded back-pressure.
Sharded PQ pattern in Go¶
type Sharded struct {
shards []*ShardedPQ
}
func (s *Sharded) Push(key uint64, item *Item) {
s.shards[key%uint64(len(s.shards))].Push(item)
}
func (s *Sharded) Pop() *Item {
// Survey the head of each shard, pop the best.
var best *Item
var idx int
for i, sh := range s.shards {
h := sh.Peek()
if h != nil && (best == nil || h.Priority < best.Priority) {
best, idx = h, i
}
}
if best == nil {
return nil
}
return s.shards[idx].Pop()
}
Sharding turns a single hot PQ into n cooler PQs at the cost of an n-way scan on Pop. For n = 4..16 this is a great deal in concurrent workloads.
Error Handling¶
| Scenario | What goes wrong | Correct approach |
|---|---|---|
| Two equal-priority items, payload not orderable in Python | TypeError on < | Use (priority, counter, payload) tuples. |
Iterator returns items out of order in Java | Programmer expects sorted iteration. | Drain with poll(); iterator() is unordered. |
| Sharded PQ misses the global minimum | Scan logic compares stale entries. | Re-peek under lock; never cache peeks across mutations. |
| Cancelling a queued job | Item still in PQ, runs unexpectedly. | Mark cancelled, check at pop time; or use indexed-remove. |
| Delayed scheduler "early wake-up" | condition.wait(timeout) returned spuriously. | Re-check head.due_at - now() after every wake. |
Performance Analysis¶
Benchmark: push 1M ints, then pop all, on M1 Pro / Java 17.
| Implementation | Push avg (ns) | Pop avg (ns) |
|---|---|---|
PriorityQueue<Integer> | 95 | 220 |
PriorityBlockingQueue<Integer> | 145 | 285 |
| Custom indexed binary heap | 78 | 195 |
TreeSet<Integer> (sorted set) | 220 | 240 |
| Bucket PQ, priorities ∈ [0, 256) | 35 | 38 |
A few takeaways:
- The bucket PQ wins when its constraint is met. Always check if priorities are bounded.
- The custom indexed heap beats
PriorityQueue<Integer>because it avoids autoboxing. TreeSetis competitive only if you need ordered iteration alongside the PQ.
Best Practices¶
- Choose
(priority, counter, payload)tuples by default — saves you from theTypeErrorand gives stability for free. - Never iterate a PQ assuming order; drain with
pop. - If your workload is "produce in bursts," use bulk
heapifyrather thannpushes. - For multi-producer, multi-consumer workloads, prefer the language's concurrent PQ first; build your own only after profiling shows it's the bottleneck.
- Measure
top_priority_age(how long has the head been waiting?) as a SLO indicator — it catches starvation before users do.
Visual Animation¶
See
animation.htmlfor the interactive view.The middle-level animation toggles among three implementations (binary heap, sorted array, BST) and shows the same operation sequence with their differing costs. Stability mode demonstrates the counter trick.
Summary¶
A priority queue is the right abstraction whenever "process the most important next" matters. The implementation choice is a workload decision: binary heap for the generic case, d-ary for many decrease-keys, leftist/pairing/Fibonacci for meld-heavy loads, bucket/radix when priorities are bounded integers, sharded variants under contention. Master the indexed-vs-lazy decrease-key trade-off and you have the knob that decides every PQ-driven graph algorithm's performance.