Concurrency — Practice Tasks¶
12 hands-on exercises on writing correct concurrent code: data races, critical-section scope, thread-safe collections, lazy initialization, back-pressure, producer-consumer, cancellation, deadlock, and atomics. Every task gives a scenario, racy or unsafe code, an instruction, and a full solution with reasoning. Languages vary (Go / Java / Python). Ordered easy → hard.
Table of Contents¶
- Task 1 — Fix a data race with a mutex (Go)
- Task 2 — Make a counter correct with atomics (Java)
- Task 3 — Fix a race by switching to message passing (Go)
- Task 4 — Replace a thread-unsafe collection (Java)
- Task 5 — Shrink an over-broad critical section (Go)
- Task 6 — Fix broken double-checked locking (Java)
- Task 7 — Lazy init done right in Python (
functools.cache) - Task 8 — Add back-pressure with a bounded queue (Go)
- Task 9 — Correct producer-consumer (Python)
- Task 10 — Context, cancellation, and graceful shutdown (Go)
- Task 11 — Fix a deadlock via lock ordering (Java)
- Task 12 — Bound concurrency with a semaphore (Go, open-ended)
- Self-Assessment
- Related Topics
How to Use¶
Each task is self-contained. Read the scenario, study the broken code, and write your fix before opening the solution. The goal is not just compiling code — it is code that is correct under the language's memory model and stays correct under contention.
A reliable workflow for every fix:
- Name the hazard precisely: data race, lost update, visibility (stale read), deadlock, livelock, or unbounded growth.
- Reproduce it. Concurrency bugs hide on a quiet machine. Use the race detector and load.
- Apply the smallest correct primitive — not the biggest lock you can find.
- Re-run under contention to confirm the fix and check you did not serialize a hot path into a bottleneck.
Tools that turn "works on my laptop" into evidence:
# Go: the race detector instruments memory accesses and reports races at runtime.
go test -race ./...
go run -race main.go
# Java: stress with jcstress (concurrency test harness) or run thousands of iterations.
# Python: the GIL hides some races but NOT compound operations or I/O interleaving.
python -X dev script.py
Task 1 — Fix a data race with a mutex (Go)¶
Difficulty: Easy
Scenario: A request handler increments a shared hit counter from many goroutines. Under load the reported count is lower than the number of requests, and go test -race flags a data race.
package metrics
type Counter struct {
total int64
}
func (c *Counter) Inc() {
c.total++ // read-modify-write: not atomic, multiple goroutines collide
}
func (c *Counter) Total() int64 {
return c.total // racy read concurrent with Inc
}
Instruction: Make Counter safe for concurrent use by guarding the shared field with a sync.Mutex. Keep the public API unchanged.
Solution
**Reasoning:** `c.total++` compiles to load, add, store. Two goroutines can both load the same value, both add one, and both store — one increment is lost. The fix is not merely "the writes are unsafe": *both* the write in `Inc` and the read in `Total` must be guarded by the **same** mutex. A read concurrent with a write is itself a data race, and the reader may observe a torn or stale value. Guarding only the writer leaves `Total` racy. Note `defer` in `Total` (the body is a single return, so unlocking on the deferred path is clean) but a plain `Unlock` in `Inc` (the critical section is one line, and avoiding `defer` shaves a tiny cost on a hot path — a reasonable micro-optimization once measured). For a counter specifically, an atomic is even better — see Task 2.Task 2 — Make a counter correct with atomics (Java)¶
Difficulty: Easy
Scenario: A rate limiter tracks how many requests it has seen. The field is plain long, mutated from a thread pool. Counts are wrong and occasionally a reader sees a value that never existed.
public class RequestMeter {
private long count; // not safe to publish across threads
private boolean tripped; // set when count crosses a threshold
public void record() {
count++; // lost updates under contention
if (count > 1000) {
tripped = true; // visibility: other threads may never see this
}
}
public long count() { return count; }
public boolean isTripped() { return tripped; }
}
Instruction: Replace the primitives with the right java.util.concurrent.atomic types so increments never collide and readers always see the latest value, without taking a lock.
Solution
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicBoolean;
public class RequestMeter {
private final AtomicLong count = new AtomicLong();
private final AtomicBoolean tripped = new AtomicBoolean();
public void record() {
long now = count.incrementAndGet(); // single atomic read-modify-write
if (now > 1000) {
tripped.set(true); // atomic store, visible to all threads
}
}
public long count() { return count.get(); }
public boolean isTripped() { return tripped.get(); }
}
Task 3 — Fix a race by switching to message passing (Go)¶
Difficulty: Easy–Medium
Scenario: A worker pool accumulates results into a shared map. The map is mutated from N goroutines with no synchronization — Go panics with concurrent map writes, or silently corrupts the map.
func aggregate(ids []int) map[int]int {
results := make(map[int]int)
var wg sync.WaitGroup
for _, id := range ids {
wg.Add(1)
go func(id int) {
defer wg.Done()
results[id] = expensiveLookup(id) // concurrent map write — fatal
}(id)
}
wg.Wait()
return results
}
Instruction: Fix the race without a mutex, by having workers send results over a channel and a single owner goroutine (or the main goroutine) build the map. "Don't communicate by sharing memory; share memory by communicating."
Solution
func aggregate(ids []int) map[int]int {
type pair struct {
id, value int
}
ch := make(chan pair, len(ids)) // buffered so workers never block on send
var wg sync.WaitGroup
for _, id := range ids {
wg.Add(1)
go func(id int) {
defer wg.Done()
ch <- pair{id, expensiveLookup(id)}
}(id)
}
// Close the channel once all workers are done, from a separate goroutine,
// so the range loop below terminates.
go func() {
wg.Wait()
close(ch)
}()
results := make(map[int]int) // owned exclusively by this goroutine
for p := range ch {
results[p.id] = p.value
}
return results
}
Task 4 — Replace a thread-unsafe collection (Java)¶
Difficulty: Medium
Scenario: A session cache uses a plain HashMap. It is read constantly and written occasionally from many threads. In production it sometimes throws ConcurrentModificationException, and under heavy resize load a thread has spun at 100% CPU inside HashMap.get (a corrupted internal linked list).
public class SessionCache {
private final Map<String, Session> cache = new HashMap<>(); // not thread-safe
public Session get(String id) {
return cache.get(id);
}
public void put(String id, Session s) {
cache.put(id, s); // concurrent put during resize can corrupt the table
}
public Session getOrCreate(String id) {
Session s = cache.get(id);
if (s == null) { // check-then-act race
s = new Session(id);
cache.put(id, s); // two threads can both create + overwrite
}
return s;
}
}
Instruction: Replace the unsafe collection with a proper concurrent one and fix the check-then-act race in getOrCreate so exactly one Session is created per id.
Solution
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
public class SessionCache {
private final Map<String, Session> cache = new ConcurrentHashMap<>();
public Session get(String id) {
return cache.get(id);
}
public void put(String id, Session s) {
cache.put(id, s);
}
public Session getOrCreate(String id) {
// Atomic check-then-act: computeIfAbsent guarantees the mapping
// function runs at most once per absent key.
return cache.computeIfAbsent(id, Session::new);
}
}
Task 5 — Shrink an over-broad critical section (Go)¶
Difficulty: Medium
Scenario: A price service caches quotes. Profiling shows threads queued on a single mutex; the lock is held across a slow network call, serializing every request behind one in-flight fetch.
type PriceService struct {
mu sync.Mutex
cache map[string]float64
client *http.Client
}
func (p *PriceService) Price(symbol string) (float64, error) {
p.mu.Lock()
defer p.mu.Unlock() // held for the ENTIRE function, including the HTTP call
if v, ok := p.cache[symbol]; ok {
return v, nil
}
price, err := p.fetchRemote(symbol) // slow network I/O under the lock
if err != nil {
return 0, err
}
p.cache[symbol] = price
return price, nil
}
Instruction: Narrow the critical section so the mutex protects only the map accesses, never the network call. Keep the cache safe.
Solution
func (p *PriceService) Price(symbol string) (float64, error) {
// Critical section 1: read the cache.
p.mu.Lock()
if v, ok := p.cache[symbol]; ok {
p.mu.Unlock()
return v, nil
}
p.mu.Unlock()
// Slow I/O happens with NO lock held — concurrent requests proceed in parallel.
price, err := p.fetchRemote(symbol)
if err != nil {
return 0, err
}
// Critical section 2: publish the result.
p.mu.Lock()
p.cache[symbol] = price
p.mu.Unlock()
return price, nil
}
Task 6 — Fix broken double-checked locking (Java)¶
Difficulty: Medium–Hard
Scenario: A singleton uses double-checked locking to lazily build an expensive config. The field is a plain reference. On some hardware a thread observes a non-null instance that is only partially constructed — its fields are still default values.
public class Config {
private static Config instance; // NOT volatile — the bug
private final Map<String, String> settings;
private Config() {
this.settings = loadFromDisk(); // expensive
}
public static Config get() {
if (instance == null) { // first check (no lock)
synchronized (Config.class) {
if (instance == null) { // second check (locked)
instance = new Config(); // unsafe publication
}
}
}
return instance;
}
}
Instruction: Fix the broken double-checked locking. Provide the minimal correct fix, then show the idiom most Java developers should prefer instead.
Solution
Minimal fix — make the field `volatile`:public class Config {
private static volatile Config instance; // volatile fixes safe publication
private final Map<String, String> settings;
private Config() {
this.settings = loadFromDisk();
}
public static Config get() {
Config result = instance; // read volatile once into a local
if (result == null) {
synchronized (Config.class) {
result = instance;
if (result == null) {
instance = result = new Config();
}
}
}
return result;
}
}
Task 7 — Lazy init done right in Python (functools.cache)¶
Difficulty: Medium
Scenario: A Python service lazily loads a large model. The hand-rolled double-checked pattern looks safe but has a race: under threads, two callers can both start the expensive load.
import threading
_model = None
_lock = threading.Lock()
def get_model():
global _model
if _model is None: # check-then-act, no lock yet
with _lock:
if _model is None: # second check
_model = load_model() # expensive
return _model
Instruction: The double-checked version above is actually correct in CPython (the GIL makes the reference read/write atomic), but it is verbose and easy to get wrong if copied. Replace it with the idiomatic, provably-once-initialized approach.
Solution
If the lazy value is an attribute of an object rather than a module-level singleton, use `functools.cached_property`: **Reasoning:** The hand-rolled version works in CPython today only because the GIL serializes bytecode and a reference assignment is a single bytecode, so the partial-construction problem from Task 6 cannot occur the same way. But it is fragile: it relies on a CPython implementation detail (free-threaded Python, PEP 703, removes the GIL), and the double-check boilerplate invites copy-paste bugs. `functools.cache` decorates the function with a thread-safe memoization wrapper: the first call computes and stores the result; concurrent and later calls return the stored value. The locking is handled inside the standard library and is correct across interpreter variants. `cached_property` is the per-instance analogue — it computes on first access and replaces itself with the value in the instance `__dict__`, so later accesses skip the descriptor entirely. Both express "compute this exactly once, lazily" as a declaration rather than a manual protocol, which is the whole point of fixing double-checked locking: prefer a vetted idiom over hand-written synchronization.Task 8 — Add back-pressure with a bounded queue (Go)¶
Difficulty: Medium–Hard
Scenario: An ingestion service reads events from a fast source and hands each to a goroutine for processing. There is no limit: when the source outpaces processing, goroutines and memory grow without bound until the process is OOM-killed.
func ingest(events <-chan Event) {
for ev := range events {
go process(ev) // unbounded goroutine spawn — no back-pressure
}
}
Instruction: Add back-pressure so at most N events are processed concurrently and a faster producer is naturally slowed down rather than overwhelming the system. Use a bounded worker pool fed by a bounded channel.
Solution
func ingest(events <-chan Event, workers int) {
// Bounded queue: when full, the loop below blocks on send,
// which propagates back-pressure to the producer of `events`.
jobs := make(chan Event, workers)
var wg sync.WaitGroup
for i := 0; i < workers; i++ {
wg.Add(1)
go func() {
defer wg.Done()
for ev := range jobs {
process(ev) // exactly `workers` of these run at once
}
}()
}
for ev := range events {
jobs <- ev // blocks when all workers are busy AND the buffer is full
}
close(jobs) // signal workers to drain and exit
wg.Wait()
}
func ingest(events <-chan Event, maxConcurrent int) {
sem := make(chan struct{}, maxConcurrent) // capacity = concurrency limit
var wg sync.WaitGroup
for ev := range events {
sem <- struct{}{} // acquire — blocks once maxConcurrent are in flight
wg.Add(1)
go func(ev Event) {
defer wg.Done()
defer func() { <-sem }() // release the slot
process(ev)
}(ev)
}
wg.Wait()
}
Task 9 — Correct producer-consumer (Python)¶
Difficulty: Medium–Hard
Scenario: A producer thread feeds work to consumer threads through a shared list guarded by ad-hoc sleeps. It loses items, busy-waits, and the consumers never shut down cleanly.
import threading, time
buffer = []
done = False
def producer(items):
global done
for item in items:
buffer.append(item) # unsynchronized mutation
done = True
def consumer():
while not done or buffer:
if buffer:
item = buffer.pop(0) # race: two consumers can pop the same index region
handle(item)
else:
time.sleep(0.01) # busy-wait / spin with arbitrary sleep
Instruction: Reimplement producer-consumer correctly using queue.Queue, which provides the locking and blocking for you. Signal completion cleanly and let consumers exit without spinning.
Solution
import queue
import threading
def run(items, num_consumers):
q: queue.Queue = queue.Queue(maxsize=100) # bounded -> built-in back-pressure
def producer():
for item in items:
q.put(item) # blocks if the queue is full (back-pressure)
for _ in range(num_consumers):
q.put(None) # one sentinel per consumer to signal shutdown
def consumer():
while True:
item = q.get() # blocks until an item is available (no spin)
try:
if item is None: # sentinel: time to stop
return
handle(item)
finally:
q.task_done()
prod = threading.Thread(target=producer)
consumers = [threading.Thread(target=consumer) for _ in range(num_consumers)]
prod.start()
for c in consumers:
c.start()
prod.join()
for c in consumers:
c.join()
Task 10 — Context, cancellation, and graceful shutdown (Go)¶
Difficulty: Hard
Scenario: A background poller runs forever in a goroutine. On shutdown (SIGTERM) the process exits immediately, killing in-flight work and leaking the goroutine. There is no way to stop it cleanly or to bound how long an operation may run.
func startPoller(interval time.Duration) {
go func() {
for {
poll() // no way to cancel; no timeout
time.Sleep(interval) // can't be interrupted
}
}()
// nothing ever stops this goroutine
}
Instruction: Rewrite using context.Context so the poller stops promptly on cancellation, each poll is bounded by a timeout, and main waits for a clean shutdown on SIGTERM/SIGINT.
Solution
func startPoller(ctx context.Context, interval time.Duration) <-chan struct{} {
done := make(chan struct{})
go func() {
defer close(done) // signal that the goroutine has fully exited
ticker := time.NewTicker(interval)
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return // cancellation requested -> exit the loop and goroutine
case <-ticker.C:
// Bound each poll so one slow call cannot hang shutdown.
pollCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
poll(pollCtx)
cancel()
}
}
}()
return done
}
func main() {
// Root context cancelled on SIGINT/SIGTERM.
ctx, stop := signal.NotifyContext(context.Background(),
os.Interrupt, syscall.SIGTERM)
defer stop()
done := startPoller(ctx, time.Second)
<-ctx.Done() // wait for a signal
log.Println("shutdown signal received, draining...")
// Give the goroutine a bounded window to finish, then give up.
select {
case <-done:
log.Println("poller stopped cleanly")
case <-time.After(10 * time.Second):
log.Println("shutdown timed out; exiting anyway")
}
}
Task 11 — Fix a deadlock via lock ordering (Java)¶
Difficulty: Hard
Scenario: A bank transfer locks the source account, then the destination. When two transfers run in opposite directions at the same time (A→B and B→A), each thread holds one lock and waits forever for the other. Classic deadlock.
class Account {
final ReentrantLock lock = new ReentrantLock();
long balance;
final long id;
Account(long id, long balance) { this.id = id; this.balance = balance; }
}
void transfer(Account from, Account to, long amount) {
from.lock.lock(); // Thread 1: locks A. Thread 2: locks B.
try {
to.lock.lock(); // Thread 1: waits for B. Thread 2: waits for A. DEADLOCK.
try {
from.balance -= amount;
to.balance += amount;
} finally {
to.lock.unlock();
}
} finally {
from.lock.unlock();
}
}
Instruction: Eliminate the deadlock by imposing a global lock ordering so all threads acquire locks in the same order regardless of transfer direction. Then show the alternative using tryLock with backoff.
Solution
Primary fix — total ordering by a stable, unique key (account id):void transfer(Account from, Account to, long amount) {
if (from.id == to.id) {
throw new IllegalArgumentException("cannot transfer to the same account");
}
// Always lock the lower id first -> every thread acquires in the same order,
// so no cycle of "A waits for B while B waits for A" can form.
Account first = from.id < to.id ? from : to;
Account second = from.id < to.id ? to : from;
first.lock.lock();
try {
second.lock.lock();
try {
from.balance -= amount;
to.balance += amount;
} finally {
second.lock.unlock();
}
} finally {
first.lock.unlock();
}
}
void transfer(Account from, Account to, long amount) throws InterruptedException {
while (true) {
if (from.lock.tryLock(100, TimeUnit.MILLISECONDS)) {
try {
if (to.lock.tryLock(100, TimeUnit.MILLISECONDS)) {
try {
from.balance -= amount;
to.balance += amount;
return;
} finally {
to.lock.unlock();
}
}
} finally {
from.lock.unlock(); // release if we failed to get `to`
}
}
Thread.sleep(ThreadLocalRandom.current().nextInt(1, 20)); // backoff, break symmetry
}
}
Task 12 — Bound concurrency with a semaphore (Go, open-ended)¶
Difficulty: Hard
Scenario: A migration tool copies thousands of files to a remote store. The naive version launches one goroutine per file and immediately exhausts file descriptors and remote connection limits. You are asked to make it robust: bound concurrency, propagate the first error and cancel the rest, and return a clean aggregate result.
func migrateAll(files []string) error {
for _, f := range files {
go upload(f) // unbounded; errors ignored; no way to wait or cancel
}
return nil // returns before any upload finishes
}
Instruction: Identify every concurrency defect, then implement a correct version. Bound concurrency to N, stop early on the first failure, wait for all in-flight work, and return the first error encountered.
Solution
| Defect | Where | Fix | |---|---|---| | Unbounded goroutines | `go upload(f)` per file | Cap with a semaphore (buffered channel) of size N. | | No waiting | returns immediately | Wait on an `errgroup` / `WaitGroup` before returning. | | Errors dropped | `upload` returns nothing usable | Capture and return the first error. | | No cancellation | failures keep uploading | Use a shared context; cancel on first error so the rest stop early. | | No back-pressure | all files dispatched at once | Acquire a semaphore slot before launching each upload. | Idiomatic implementation with `golang.org/x/sync/errgroup`:import (
"context"
"golang.org/x/sync/errgroup"
)
func migrateAll(ctx context.Context, files []string, concurrency int) error {
g, ctx := errgroup.WithContext(ctx) // cancels ctx on the first non-nil error
g.SetLimit(concurrency) // bounds concurrent goroutines to N
for _, f := range files {
f := f // capture loop variable (pre-Go 1.22)
g.Go(func() error {
// Respect cancellation: skip remaining work once one upload has failed.
if err := ctx.Err(); err != nil {
return err
}
return upload(ctx, f)
})
}
// Wait blocks until all started goroutines return; returns the first error.
return g.Wait()
}
func migrateAll(ctx context.Context, files []string, concurrency int) error {
ctx, cancel := context.WithCancel(ctx)
defer cancel()
sem := make(chan struct{}, concurrency) // bound concurrency
var wg sync.WaitGroup
var once sync.Once
var firstErr error
for _, f := range files {
if ctx.Err() != nil {
break // stop dispatching once cancelled
}
sem <- struct{}{} // acquire a slot (back-pressure on the dispatch loop)
wg.Add(1)
go func(f string) {
defer wg.Done()
defer func() { <-sem }() // release the slot
if err := upload(ctx, f); err != nil {
once.Do(func() { // record only the first error, then cancel the rest
firstErr = err
cancel()
})
}
}(f)
}
wg.Wait()
return firstErr
}
Self-Assessment¶
Rate yourself on each. If any answer is "no," revisit the linked task and the chapter README.
- Data races (Task 1, 3): Can you state why a read concurrent with a write is itself a race, and choose between a mutex and message passing to fix it?
- Atomics (Task 2): Do you know when a single-variable atomic beats a mutex, and why re-reading after an atomic update reintroduces a race?
- Thread-safe collections (Task 4): Can you explain why swapping in a concurrent map does not fix a check-then-act race, and use
computeIfAbsentcorrectly? - Critical-section scope (Task 5): Can you narrow a lock to the shortest correct span and articulate why holding a lock across I/O is forbidden?
- Lazy initialization (Task 6, 7): Can you explain the partially-constructed-object hazard and reach for
volatile, the holder idiom, orfunctools.cacheinstead of hand-rolled double-checked locking? - Back-pressure (Task 8): Can you describe how a full bounded queue propagates slowness upstream, and pick between a worker pool and a semaphore?
- Producer-consumer (Task 9): Can you implement it with a bounded queue and shut consumers down with the sentinel idiom, no spinning?
- Cancellation (Task 10): Can you wire a
contextthrough a long-lived goroutine so it stops promptly andmainwaits for a clean exit? - Deadlock (Task 11): Can you break a deadlock with global lock ordering, and explain why
tryLockneeds randomized backoff? - Bounded batch work (Task 12): Can you list the four properties a correct concurrent batch needs and build them from a semaphore,
WaitGroup, and context?
Related Topics¶
- junior.md — concurrency fundamentals: what a data race is, what a lock does, the mental model
- find-bug.md — buggy concurrent snippets where the hazard hides; spot it before reading the fix
- optimize.md — correct-but-slow concurrent code: reducing contention, sharding locks, lock-free reads
- Chapter README — the positive rules of clean concurrency and the anti-patterns to avoid
- Refactoring — restructuring techniques that apply when you extract critical sections into well-named methods
Next: Apply these to real code. The fastest way to internalize concurrency correctness is to run every solution under
go test -race, a Java stress harness, orpython -X devand watch the broken version fail before the fix makes it pass.
In this topic