sync.Map — Middle Level¶
Table of Contents¶
- Introduction
- The Full API at Middle Level
- Go 1.20 Additions in Depth
- Decision Matrix: sync.Map vs RWMutex + map
- Benchmarks You Should Run Yourself
- Atomic-Update Patterns
- Range Semantics in Detail
- Tracking Size
- TTL and Eviction Wrappers
- Mixed Workloads — When Neither Fits
- Self-Assessment
- Summary
Introduction¶
At junior level you learned the sync.Map API and the rule "read-mostly, stable keys, otherwise default to RWMutex+map." At middle level you start justifying that rule with measurements and choosing the right tool with data. After this file you will:
- Know every method on sync.Map, including the Go 1.20 additions, in detail.
- Reason about why sync.Map is slower for write-heavy workloads, with numbers.
- Pick between sync.Map, RWMutex+map, and sharded variants for real scenarios.
- Build atomic-update patterns using CompareAndSwap.
- Wrap sync.Map with TTL and size-tracking.
The Full API at Middle Level¶
package sync
type Map struct { /* unexported */ }
func (m *Map) Load(key any) (value any, ok bool)
func (m *Map) Store(key, value any)
func (m *Map) LoadOrStore(key, value any) (actual any, loaded bool)
func (m *Map) LoadAndDelete(key any) (value any, loaded bool)
func (m *Map) Delete(key any)
func (m *Map) Range(f func(key, value any) bool)
// Go 1.20+
func (m *Map) Swap(key, value any) (previous any, loaded bool)
func (m *Map) CompareAndSwap(key, old, new any) (swapped bool)
func (m *Map) CompareAndDelete(key, old any) (deleted bool)
Every method is safe for concurrent use. None return errors; their effects are observable via the ok/loaded/swapped/deleted booleans. The key must be comparable; values can be anything.
LoadOrStore revisited¶
The atomic "set if absent" is more than convenience — it eliminates the classic race:
// Bad: check-then-act
if _, ok := m.Load(k); !ok {
	m.Store(k, v) // another goroutine may have stored between Load and Store
}

// Good: atomic
m.LoadOrStore(k, v)
The loaded return tells you whether the entry pre-existed. If you only care about putting something there, you can ignore it. If you care about which value won, use actual.
LoadAndDelete revisited¶
The atomic "take and remove" is the work-handoff primitive. Producers Store; consumers LoadAndDelete. Only one consumer can succeed per entry. Without LoadAndDelete, you would have to lock externally to prevent two consumers from taking the same key.
Go 1.20 Additions in Depth¶
Three methods landed in Go 1.20 that close the longest-standing gap in sync.Map: atomic updates.
Swap — set and return the previous value¶
Atomically replaces the value for key with value and returns the previous one. Like Store but with the old value as a return. Useful when you want to act on the old value while installing a new one.
previous, loaded := registry.Swap(connID, newConn)
if loaded {
	previous.(*Conn).Close() // old connection bumped out
}
Before 1.20 you would need a Load followed by a Store, with the racy gap in between.
CompareAndSwap — atomic conditional update¶
Replaces the value only if the current value is equal to old. Returns true on success. The values are compared with ==, the standard Go equality, so old must be a comparable type. Crucially, a struct or pointer with the same shape but different identity may or may not compare equal depending on the type — be precise about what you store.
// Increment-if-current pattern. Assumes the key already exists:
// if it is absent, v is nil and v.(int) panics. See Pattern 1
// below for the insert-if-absent variant.
for {
	v, _ := m.Load("hits")
	if m.CompareAndSwap("hits", v, v.(int)+1) {
		break
	}
}
Failure modes:
- old does not equal the current value → returns false, no change.
- The key does not exist → returns false, no change. CompareAndSwap does not insert.
If you need "insert if absent, atomically update if present," use LoadOrStore first, then CompareAndSwap in a retry loop.
CompareAndDelete — atomic conditional remove¶
Removes the entry only if the current value equals old. Useful for "remove only if I am the most recent writer" patterns:
This eliminates the race where you Load a stale value, decide to delete, and meanwhile another goroutine refreshed it.
Values must be comparable for CAS variants¶
CompareAndSwap and CompareAndDelete compare values with ==. If your values are interface types holding a slice, map, or function, the comparison panics at runtime:
var m sync.Map
m.Store("k", []int{1, 2})
m.CompareAndSwap("k", []int{1, 2}, []int{3, 4})
// panic: runtime error: comparing uncomparable type []int
Pointer values (*Entry) compare by address, which is usually what you want for cache entries.
Decision Matrix: sync.Map vs RWMutex + map¶
The most useful skill at middle level is choosing correctly. Here is the decision table I keep on my desk:
| Workload property | sync.Map | RWMutex + map | Sharded map | Notes |
|---|---|---|---|---|
| Read-mostly, stable keys | Best | OK | OK | The sweet spot. |
| Read-mostly, growing keys | OK | Best | OK | sync.Map pays for dirty rebuilds. |
| Balanced read/write | Slow | Best | Best | sync.Map writes are expensive. |
| Write-heavy | Slow | Good | Best | Contention is the bottleneck; sharding cuts it. |
| Per-key contention low (writes spread) | OK | OK | Best | Sharding shines. |
| Per-key contention high (hot key) | Slow | Slow | Slow | None help; rethink data model. |
| Need atomic counter increment | OK (Go 1.20+) | OK | OK | Or use atomic.Int64 outside the map. |
| Need Len() | No | Yes | Yes | sync.Map has no Len. |
| Need ordered iteration | No | Yes | Yes | None of these give order; use a sorted slice. |
| Need atomic snapshot | No | Yes | Hard | RWMutex+map is easiest: RLock, copy keys, RUnlock. |
| Small (< 100 entries) | Slow | Best | Overkill | The mutex is rarely contended at this size. |
| Huge (1M+ entries), churn | Memory amplification | OK | Best | sync.Map retains tombstones. |
| Generic value type wanted | No (any) | Yes | Yes | Build a generic wrapper if needed. |
| Cross-goroutine writes rare | Best | OK | OK | sync.Map's other sweet spot. |
The two patterns the Go authors explicitly designed for:
The Map type is optimized for two common use cases: (1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys.
If your workload is neither, sync.Map is almost always slower than RWMutex+map.
Benchmarks You Should Run Yourself¶
Here are skeletal benchmarks. Copy them into your project, adjust the workload, and measure.
// bench_test.go
package mapbench_test

import (
	"sync"
	"testing"
)

const N = 10000

func setupBuiltin() (map[int]int, *sync.RWMutex) {
	m := make(map[int]int)
	for i := 0; i < N; i++ {
		m[i] = i
	}
	return m, &sync.RWMutex{}
}

func setupSyncMap() *sync.Map {
	var m sync.Map
	for i := 0; i < N; i++ {
		m.Store(i, i)
	}
	return &m
}

func BenchmarkReadMostly_RWMutex(b *testing.B) {
	m, mu := setupBuiltin()
	b.RunParallel(func(pb *testing.PB) {
		i := 0
		for pb.Next() {
			mu.RLock()
			_ = m[i%N]
			mu.RUnlock()
			i++
		}
	})
}

func BenchmarkReadMostly_SyncMap(b *testing.B) {
	m := setupSyncMap()
	b.RunParallel(func(pb *testing.PB) {
		i := 0
		for pb.Next() {
			_, _ = m.Load(i % N)
			i++
		}
	})
}

func BenchmarkBalanced_RWMutex(b *testing.B) {
	m, mu := setupBuiltin()
	b.RunParallel(func(pb *testing.PB) {
		i := 0
		for pb.Next() {
			if i%2 == 0 {
				mu.RLock()
				_ = m[i%N]
				mu.RUnlock()
			} else {
				mu.Lock()
				m[i%N] = i
				mu.Unlock()
			}
			i++
		}
	})
}

func BenchmarkBalanced_SyncMap(b *testing.B) {
	m := setupSyncMap()
	b.RunParallel(func(pb *testing.PB) {
		i := 0
		for pb.Next() {
			if i%2 == 0 {
				_, _ = m.Load(i % N)
			} else {
				m.Store(i%N, i)
			}
			i++
		}
	})
}

func BenchmarkWriteHeavy_RWMutex(b *testing.B) {
	m, mu := setupBuiltin()
	b.RunParallel(func(pb *testing.PB) {
		i := 0
		for pb.Next() {
			mu.Lock()
			m[i%N] = i
			mu.Unlock()
			i++
		}
	})
}

func BenchmarkWriteHeavy_SyncMap(b *testing.B) {
	m := setupSyncMap()
	b.RunParallel(func(pb *testing.PB) {
		i := 0
		for pb.Next() {
			m.Store(i%N, i)
			i++
		}
	})
}
Run with:
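For example (the flag values are a reasonable starting point, not gospel):

```shell
go test -bench=. -cpu=1,4,8 -benchtime=2s
```

The -cpu list reruns every benchmark at each parallelism level, which is exactly what the table below compares.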
Indicative results on Apple M2, Go 1.22, 8 cores¶
These are rough numbers — your hardware will differ. Always re-measure.
| Benchmark | -cpu=1 | -cpu=4 | -cpu=8 |
|---|---|---|---|
| ReadMostly_RWMutex | 30 ns/op | 90 ns/op | 200 ns/op |
| ReadMostly_SyncMap | 15 ns/op | 18 ns/op | 22 ns/op |
| Balanced_RWMutex | 50 ns/op | 280 ns/op | 700 ns/op |
| Balanced_SyncMap | 90 ns/op | 220 ns/op | 350 ns/op |
| WriteHeavy_RWMutex | 80 ns/op | 350 ns/op | 900 ns/op |
| WriteHeavy_SyncMap | 180 ns/op | 600 ns/op | 1 500 ns/op |
Lessons:
- Read-mostly with stable keys: sync.Map scales beautifully. At 8 cores it is roughly 10× faster than RWMutex+map, because reads are lock-free.
- Balanced workloads: sync.Map and RWMutex+map are comparable. sync.Map may pull ahead on many cores; the gap is small.
- Write-heavy: RWMutex+map wins. sync.Map's write path is more expensive per operation. Both contend on a single lock at high concurrency, so consider sharding.
These numbers shift further against sync.Map if your workload has high churn (many new keys and deletions): the read/dirty rebuild fires repeatedly and every miss takes the slow path.
Atomic-Update Patterns¶
Pattern 1: Increment-if-present with CompareAndSwap¶
func increment(m *sync.Map, key any) {
	for {
		v, ok := m.Load(key)
		if !ok {
			// LoadOrStore may lose to a concurrent writer; only
			// return if our 1 actually went in, otherwise retry so
			// the increment is not lost.
			if _, loaded := m.LoadOrStore(key, 1); !loaded {
				return
			}
			continue
		}
		next := v.(int) + 1
		if m.CompareAndSwap(key, v, next) {
			return
		}
	}
}
The retry loop handles concurrent updates. If CompareAndSwap fails, another goroutine bumped the value first; we retry with the new current value.
Caveat: for hot counters, this can spin under contention. An atomic.Int64 is faster. Use sync.Map for per-key counters where each key is rarely contended.
Pattern 2: Replace-if-equal for cache invalidation¶
// refreshIfStale replaces the cached entry only if it is still the
// stale one we observed. load is the application's fetch function.
func refreshIfStale(m *sync.Map, key string, stale *Entry) {
	fresh := load(key) // fetch a fresh *Entry (application-specific)
	m.CompareAndSwap(key, stale, fresh)
}
If another goroutine already refreshed, our CompareAndSwap does nothing. No lost updates, no double-refresh.
Pattern 3: Delete-if-equal for safe eviction¶
Common in TTL eviction: a background goroutine schedules a delete after N seconds, but only if the entry has not been replaced in the meantime.
Pattern 4: Atomic swap with side effect¶
Swap returns the old value; we close it. This pattern is connection-pool gold.
Range Semantics in Detail¶
The Range callback has the signature func(key, value any) bool. Return true to continue iterating, false to stop.
The spec:
Range does not necessarily correspond to any consistent snapshot of the Map's contents: no key will be visited more than once, but if the value for any key is stored or deleted concurrently (including by f), Range may reflect any mapping for that key from any point during the Range call.
Translation:
- Each key existing at the start of Range is visited at most once.
- Keys inserted during Range may or may not be visited.
- Keys deleted during Range may or may not be visited.
- The value observed for a key may be the value at any point during the call.
- Order is unspecified.
Consequences¶
- Don't use Range for an "atomic snapshot" of state. Use RWMutex+map if you need that.
- Modifying the map from inside Range is safe, but the modifications interact with iteration in implementation-defined ways. Avoid it.
- Range runs in O(n) of the visible entries. For large maps with many tombstones (see professional level), the actual work can exceed the apparent count.
Common Range patterns¶
// Collect keys (loosely consistent)
var keys []string
m.Range(func(k, _ any) bool {
	keys = append(keys, k.(string))
	return true
})

// Filter and collect
var actives []*Conn
m.Range(func(_, v any) bool {
	c := v.(*Conn)
	if c.Active() {
		actives = append(actives, c)
	}
	return true
})

// Early exit on match
var found *Entry
m.Range(func(_, v any) bool {
	e := v.(*Entry)
	if e.ID == target {
		found = e
		return false
	}
	return true
})
Tracking Size¶
sync.Map has no Len(). The recommended way to track size, if you need it:
type CountedMap struct {
	m sync.Map
	n int64 // updated atomically
}

func (c *CountedMap) Store(k, v any) {
	if _, loaded := c.m.LoadOrStore(k, v); loaded {
		c.m.Store(k, v) // overwrite
	} else {
		atomic.AddInt64(&c.n, 1)
	}
}

func (c *CountedMap) Delete(k any) {
	if _, loaded := c.m.LoadAndDelete(k); loaded {
		atomic.AddInt64(&c.n, -1)
	}
}

func (c *CountedMap) Len() int64 {
	return atomic.LoadInt64(&c.n)
}
Notes:
- The Store path uses LoadOrStore first to detect "was it already there?", then a second Store for the overwrite. Two operations, slightly slower.
- Len is eventually consistent: between a Store and the counter Add, a concurrent Len may be off by one. Usually acceptable.
- If you need precise size, you need a mutex, which defeats the point of sync.Map. Consider RWMutex+map instead.
TTL and Eviction Wrappers¶
A common request: "cache values with a TTL." sync.Map does not natively support this; you build it on top:
type ttlEntry struct {
	value   any
	expires time.Time
}

type TTLMap struct {
	m sync.Map
}

func (t *TTLMap) Set(key any, value any, ttl time.Duration) {
	t.m.Store(key, ttlEntry{value, time.Now().Add(ttl)})
}

func (t *TTLMap) Get(key any) (any, bool) {
	v, ok := t.m.Load(key)
	if !ok {
		return nil, false
	}
	e := v.(ttlEntry)
	if time.Now().After(e.expires) {
		t.m.CompareAndDelete(key, e) // safe: deletes only if unchanged
		return nil, false
	}
	return e.value, true
}
The CompareAndDelete ensures we do not evict an entry that was refreshed between our Load and the delete. One caveat: ttlEntry values are compared with ==, so the stored value field must itself be comparable, or CompareAndDelete panics at runtime.
For periodic sweep:
func (t *TTLMap) Sweep() {
	now := time.Now()
	t.m.Range(func(k, v any) bool {
		if now.After(v.(ttlEntry).expires) {
			t.m.CompareAndDelete(k, v)
		}
		return true
	})
}
Run Sweep from a ticker. Note that high-churn TTL workloads can hurt sync.Map performance due to memory amplification — consider an eviction-aware library like ristretto or freecache for serious caching.
Mixed Workloads — When Neither Fits¶
Two scenarios where the choice is harder:
Scenario A: Bursty writes, otherwise read-mostly¶
Example: a config map that is mostly read, but occasionally a bulk update writes 1 000 entries.
If reads vastly outnumber writes overall, sync.Map still wins despite the burst. But during the burst, throughput drops. If the burst causes user-visible latency, consider an atomic.Pointer[map[K]V] that swaps the entire map atomically on update — readers see no contention at all, writers pay the full rebuild cost.
type Config struct {
	m atomic.Pointer[map[string]string]
}

func NewConfig() *Config {
	c := &Config{}
	empty := map[string]string{}
	c.m.Store(&empty) // never leave the pointer nil
	return c
}

func (c *Config) Get(k string) (string, bool) {
	v, ok := (*c.m.Load())[k]
	return v, ok
}

func (c *Config) Swap(newMap map[string]string) {
	c.m.Store(&newMap) // callers must not mutate newMap afterwards
}
This is the "copy-on-write map" pattern. Reads are a single atomic load plus a map index — even faster than sync.Map. The catch: every write rebuilds the whole map.
Scenario B: Hot keys¶
If a small set of keys gets most of the writes, no map structure helps — your data model is concentrated. Solutions:
- Per-shard atomic counters ([N]atomic.Int64) indexed by hash.
- Move hot state out of the map into a typed struct with atomic.Pointer.
- Aggregate updates in a buffered channel processed by a single goroutine.
The map is not the bottleneck; the contention is. Sharding the map without sharding the access pattern does not help.
Sharded map skeleton¶
const shardCount = 64

type shard struct {
	sync.RWMutex
	m map[string]any
}

type ShardedMap struct {
	shards [shardCount]shard
}

func NewShardedMap() *ShardedMap {
	s := &ShardedMap{}
	for i := range s.shards {
		s.shards[i].m = make(map[string]any) // nil maps cannot be written to
	}
	return s
}

// fnv32 hashes key with FNV-1a (import "hash/fnv").
func fnv32(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()
}

func (s *ShardedMap) shardFor(key string) *shard {
	return &s.shards[fnv32(key)%shardCount]
}

func (s *ShardedMap) Get(key string) (any, bool) {
	sh := s.shardFor(key)
	sh.RLock()
	v, ok := sh.m[key]
	sh.RUnlock()
	return v, ok
}

func (s *ShardedMap) Set(key string, value any) {
	sh := s.shardFor(key)
	sh.Lock()
	sh.m[key] = value
	sh.Unlock()
}
64 shards means at most 1/64 of operations contend on the same lock. For write-heavy workloads at high concurrency, this typically outperforms both sync.Map and a single RWMutex+map.
Self-Assessment¶
- I can list every method on sync.Map from memory.
- I know what Swap, CompareAndSwap, and CompareAndDelete do and what they return.
- I can pick between sync.Map, RWMutex+map, and a sharded map for a given workload.
- I can write a benchmark that compares them on my real access pattern.
- I know the Range semantics precisely — not a snapshot, may or may not see concurrent stores.
- I can implement an atomic-increment-per-key pattern using CompareAndSwap.
- I know how to track size externally with an atomic.Int64.
- I can build a TTL wrapper using CompareAndDelete for safe eviction.
- I know when atomic.Pointer[map] beats both sync.Map and RWMutex+map.
- I know why hot keys are not solvable by sharding alone.
Summary¶
sync.Map is a specialised concurrent map with a small but powerful API. Since Go 1.20 it supports atomic update via Swap, CompareAndSwap, and CompareAndDelete. Its sweet spot — read-mostly with stable keys, or per-goroutine disjoint writes — wins by an order of magnitude over RWMutex+map at high concurrency. Outside that sweet spot, a plain RWMutex+map is faster, simpler, and typed.
The middle-level skill is measurement-driven choice: run the benchmarks, look at your read/write ratio, count your keys, ask whether you need Len or snapshot semantics, and pick deliberately. For write-heavy concentrated workloads, sharding beats both. For bursty config swaps, atomic.Pointer[map] is even faster than sync.Map on the read path.
At senior level we examine deeper trade-offs: generic wrappers, singleflight for load-once, and the limits of all these structures when contention concentrates.