Concurrent Trees — Find the Bug¶
For each snippet, identify the bug, describe its effect, and propose a fix. Answers at the bottom.
Bug 1: Forgotten Unlock¶
type Store struct {
mu sync.Mutex
bt *btree.BTreeG[Item]
}
func (s *Store) Set(it Item) {
s.mu.Lock()
s.bt.ReplaceOrInsert(it)
return
}
Bug 2: RLock for write¶
Bug 3: Lock upgrade¶
func (s *Store) Update(k int, fn func(int) int) {
s.mu.RLock()
cur, _ := s.bt.Get(Item{Key: k})
new := fn(cur.Value)
s.mu.RUnlock()
s.mu.Lock()
s.bt.ReplaceOrInsert(Item{Key: k, Value: new})
s.mu.Unlock()
}
Bug 4: Callback re-entry¶
func (s *Store) PrintAll(fn func(Item)) {
s.mu.RLock()
defer s.mu.RUnlock()
s.bt.Ascend(func(it Item) bool {
fn(it)
return true
})
}
// Caller:
s.PrintAll(func(it Item) {
s.Set(Item{Key: it.Key, Value: it.Value + 1})
})
Bug 5: Snapshot race¶
type Store struct {
mu sync.Mutex
live *btree.BTreeG[Item]
snapshot atomic.Pointer[btree.BTreeG[Item]]
}
func (s *Store) Set(it Item) {
s.live.ReplaceOrInsert(it)
s.snapshot.Store(s.live.Copy())
}
Bug 6: Mutex-in-goroutine¶
func (s *Store) BulkSet(items []Item) {
var wg sync.WaitGroup
for _, it := range items {
wg.Add(1)
go func(it Item) {
defer wg.Done()
s.mu.Lock()
s.bt.ReplaceOrInsert(it)
s.mu.Unlock()
}(it)
}
wg.Wait()
}
(Hint: think about throughput, not correctness.)
Bug 7: Iteration during mutation¶
func (s *Store) CleanUp() {
s.mu.Lock()
defer s.mu.Unlock()
s.bt.Ascend(func(it Item) bool {
if it.Value < 0 {
s.bt.Delete(it)
}
return true
})
}
Bug 8: Captured loop variable¶
func (s *Store) SetAll(keys []int) {
for _, k := range keys {
go func() {
s.mu.Lock()
s.bt.ReplaceOrInsert(Item{Key: k})
s.mu.Unlock()
}()
}
}
(Hint: pre-Go-1.22 issue. Verify with go vet.)
Bug 9: Get without bool check¶
func (s *Store) Multiply(k int, factor int) {
s.mu.Lock()
defer s.mu.Unlock()
it, _ := s.bt.Get(Item{Key: k})
s.bt.ReplaceOrInsert(Item{Key: k, Value: it.Value * factor})
}
Bug 10: Hand-over-hand release order¶
func (t *Tree) Get(k int) (string, bool) {
n := t.root
n.mu.RLock()
for n != nil {
if k == n.key {
v := n.value
n.mu.RUnlock()
return v, true
}
next := n.left
if k > n.key {
next = n.right
}
n.mu.RUnlock()
if next == nil {
return "", false
}
next.mu.RLock()
n = next
}
return "", false
}
Bug 11: Less function panic¶
Bug 12: Mutation of item in tree¶
func (s *Store) IncrementAll() {
s.mu.Lock()
defer s.mu.Unlock()
s.bt.Ascend(func(it Item) bool {
it.Value++ // mutates a copy, not the tree
return true
})
}
Bug 13: Publish/subscribe double-Copy¶
func (s *Store) Set(it Item) {
s.mu.Lock()
defer s.mu.Unlock()
s.snapshot.Store(s.live.Copy())
s.live.ReplaceOrInsert(it)
}
(Order matters.)
Bug 14: Lock order deadlock¶
type Store struct {
aMu sync.Mutex
bMu sync.Mutex
}
func (s *Store) Transfer(from, to int) {
s.aMu.Lock()
s.bMu.Lock()
// ... do work
s.bMu.Unlock()
s.aMu.Unlock()
}
// Concurrent calls to s.Transfer in opposite directions deadlock.
Bug 15: OCC version not bumped¶
func (n *Node) Write(key, val int) {
n.mu.Lock()
n.items[key] = val
// Forgot to increment version!
n.mu.Unlock()
}
Bug 16: Range query with mutating callback¶
func (s *Store) DeleteOlderThan(cutoff int64) {
s.mu.Lock()
defer s.mu.Unlock()
s.bt.Ascend(func(it Item) bool {
if it.Timestamp < cutoff {
s.bt.Delete(it) // mutating during iteration
}
return true
})
}
Bug 17: Snapshot leak¶
type Store struct {
snapshot atomic.Pointer[btree.BTreeG[Item]]
lastSnap *btree.BTreeG[Item] // remember the last one
}
func (s *Store) Set(it Item) {
// ... mutate live, publish snapshot
s.lastSnap = s.snapshot.Load() // pin the old one
}
Bug 18: Channel-based serialization deadlock¶
type Store struct {
cmds chan func()
}
func (s *Store) Run() {
for cmd := range s.cmds {
cmd()
}
}
func (s *Store) Set(it Item) {
done := make(chan struct{})
s.cmds <- func() {
s.bt.ReplaceOrInsert(it)
close(done)
}
<-done
}
// What if Set is called from inside a cmd?
Bug 19: Atomic pointer with non-power-of-2 alignment¶
type Store struct {
counter int32
snapshot atomic.Pointer[btree.BTreeG[Item]] // may not be 8-byte aligned on 32-bit ARM
}
(Issue on 32-bit ARM.)
Bug 20: GC of in-flight snapshot¶
func (s *Store) snapshot() *btree.BTreeG[Item] {
s.mu.Lock()
defer s.mu.Unlock()
return s.live.Copy()
}
func useSnapshot() {
s := snapshot()
runtime.KeepAlive(nil) // explicit GC trigger somehow
// ... use s
}
(More of a misunderstanding than a real bug.)
Answers¶
Bug 1 answer¶
Missing Unlock. The lock is held forever after first call. Fix:
Bug 2 answer¶
RLock allows concurrent readers; calling a mutating method while only holding RLock races with other readers and writers. Use Lock:
Bug 3 answer¶
Between RUnlock and Lock, another goroutine can update the same key, and our fn(cur.Value) is now stale. Fix: take Lock for the whole operation.
Bug 4 answer¶
The callback fn calls s.Set, which tries to take the write lock. We hold the read lock. Deadlock. Fix: materialize into a slice under the read lock, then call fn outside:
func (s *Store) PrintAll(fn func(Item)) {
s.mu.RLock()
var snap []Item
s.bt.Ascend(func(it Item) bool { snap = append(snap, it); return true })
s.mu.RUnlock()
for _, it := range snap {
fn(it)
}
}
Bug 5 answer¶
Set does not take mu, so live is accessed concurrently. Fix:
func (s *Store) Set(it Item) {
s.mu.Lock()
defer s.mu.Unlock()
s.live.ReplaceOrInsert(it)
s.snapshot.Store(s.live.Copy())
}
Bug 6 answer¶
Correct but slow: every item spawns a goroutine that immediately contends for the same lock. Better:
func (s *Store) BulkSet(items []Item) {
s.mu.Lock()
defer s.mu.Unlock()
for _, it := range items {
s.bt.ReplaceOrInsert(it)
}
}
Bug 7 answer¶
Mutating the tree during iteration. The B-tree library's behavior is undefined; may skip items, double-visit, or panic. Fix: collect keys to delete, then delete after iteration:
func (s *Store) CleanUp() {
s.mu.Lock()
defer s.mu.Unlock()
var toDelete []Item
s.bt.Ascend(func(it Item) bool {
if it.Value < 0 {
toDelete = append(toDelete, it)
}
return true
})
for _, it := range toDelete {
s.bt.Delete(it)
}
}
Bug 8 answer¶
In Go before 1.22, k is shared by all goroutines, so they all see the last value of k. Fix (pre-1.22):
Or, on Go 1.22+, the loop variable is per-iteration; bug is fixed automatically.
Bug 9 answer¶
If k is not in the tree, it is zero-valued (Item{Key: 0, Value: 0}). We then insert Item{Key: k, Value: 0} — losing whatever was there (none) and inserting a zero. Probably wrong. Fix:
if it, ok := s.bt.Get(Item{Key: k}); ok {
s.bt.ReplaceOrInsert(Item{Key: k, Value: it.Value * factor})
}
Bug 10 answer¶
Releases parent before locking child. Another goroutine can mutate or delete the child between. Fix: lock child before unlocking parent.
Bug 11 answer¶
Value may be of type that compares with < validly only if non-nil. Make less total:
func less(a, b Item) bool {
if a.Value == nil { return true }
if b.Value == nil { return false }
return a.Value < b.Value
}
Or reject nil at insertion.
Bug 12 answer¶
it is a copy; mutating it does not change the tree. Fix: re-insert the modified item.
But wait — that's mutation during iteration (Bug 7). Better: collect into a slice, then re-insert.
Bug 13 answer¶
The snapshot is published before the mutation, so it does not contain the new item. Fix: mutate first, then publish.
Bug 14 answer¶
Two goroutines calling Transfer(1, 2) and Transfer(2, 1) simultaneously may each hold one of the locks waiting for the other. Fix: always acquire locks in a fixed order (e.g., by ID).
Bug 15 answer¶
OCC readers will never detect the change. Fix: bracket the write with version increments.
Bug 16 answer¶
Same as Bug 7. Fix the same way.
Bug 17 answer¶
s.lastSnap keeps the old snapshot alive forever; memory grows. Fix: don't pin the snapshot in a global. If you need historical snapshots, document the lifecycle.
Bug 18 answer¶
If Set is called from inside a cmd handler, the goroutine sends a new cmd into the channel, then blocks waiting on done. The cmds processor is also blocked (it's running the cmd). Deadlock. Fix: detect re-entry and handle directly, or use a goroutine pool.
Bug 19 answer¶
On 32-bit ARM, 64-bit atomic operations require 8-byte alignment. atomic.Pointer may not be aligned if preceded by a non-8-byte field. Fix: put 8-byte aligned fields first, or use //go:align 8.
Bug 20 answer¶
runtime.KeepAlive does the opposite: it ensures nil is kept alive (nil is always alive). Useless here. The snapshot returned from s.live.Copy() is captured by s, so it persists as long as s references it. No actual GC bug.
Closing¶
Twenty bugs spanning common concurrent tree mistakes. Practice spotting them:
- Forgotten Unlock.
- RLock for writes.
- Lock upgrades (release-then-acquire races).
- Callback re-entry deadlocks.
- Snapshot publication order.
- Mutation during iteration.
- Captured loop variables (pre-1.22).
- Get without bool check.
- Hand-over-hand release order.
- OCC version not bumped.
- Snapshot leaks.
- Deadlocks from inconsistent lock order.
Run go vet and go test -race on every concurrent tree. They catch many of these automatically.