Mocking Time — Senior Level¶
Table of Contents¶
- Introduction
- Designing Time-Aware APIs
- Schedulers and Cron-Style Tasks
- Heartbeat, Lease, and Leader Election
- Hybrid Logical Clocks and Distributed Time
- Testing Across Goroutine Trees
- Combining Fake Clocks with context
- Test-Suite Architecture for Time-Heavy Code
- Operational Concerns: Real-Time Drift, NTP Jumps, Sleep Anomalies
- Anti-Patterns at Scale
- Cheat Sheet
- Summary
Introduction¶
At middle level you knew the libraries and could write a fake-clock test for a TTL cache. Senior level is where you decide what testable looks like for an entire subsystem — a scheduler, a distributed lease, a replication protocol. You are the person who reviews PRs and says "this is testable" or "rip this out and inject a clock." That requires not just knowing how clockwork's Advance works but also having a strong sense of:
- What APIs should look like so they remain testable two years later.
- Where time crosses goroutine boundaries and how to keep the fake clock unified.
- How to model real-world clock anomalies (NTP jumps, sleep, suspend) in tests.
- How to keep a 5,000-test suite fast and deterministic when half the tests touch time.
This file is opinionated. The patterns here come from Kubernetes, etcd, CockroachDB, and similar systems where a single time bug can take a region down.
Designing Time-Aware APIs¶
Take a clock; do not take "current time" as a parameter¶
A common middle-level mistake is to take the current time as a parameter, something like the sketch below (Session and Expired are the names reused in the corrected version that follows):
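// Anti-pattern: the caller supplies "now" on every call.
func (s *Session) Expired(now time.Time) bool {
	return now.After(s.deadline)
}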
It looks injectable. It is not. Now every caller has to remember to pass a consistent now, and the API has lost the ability to schedule its own work (After, NewTimer). The right shape:
type Session struct {
clock Clock
// ...
}
func (s *Session) Expired() bool {
return s.clock.Now().After(s.deadline)
}
A Clock is a long-lived dependency, like a *sql.DB. Inject it once at construction.
Hide the clock from the public API¶
Callers should not have to know your type uses a clock. The exported signature should be the same as the standard library equivalent:
type Cache struct {
clock Clock // unexported
}
func NewCache(ttl time.Duration, opts ...Option) *Cache { ... } // clock via Option
If your library is well known, exposing WithClock as an option is fine. Forcing every caller to pass a real clock is not.
Functional options for the clock¶
type Option func(*Cache)
func WithClock(c Clock) Option { return func(x *Cache) { x.clock = c } }
func NewCache(ttl time.Duration, opts ...Option) *Cache {
c := &Cache{clock: clockwork.NewRealClock(), ttl: ttl}
for _, o := range opts { o(c) }
return c
}
Production: NewCache(5 * time.Minute). Tests: NewCache(5*time.Minute, WithClock(fc)).
A Clock interface per package vs a shared one¶
For a small library, define a one-method Clock interface in your own package. It documents what your library actually uses and means callers can supply a minimal fake.
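A minimal sketch of such a package-local interface, assuming the package only ever reads the current time; the fake can be a hand-rolled struct:

type Clock interface {
	Now() time.Time
}

// A trivial test fake: a fixed, manually advanced instant.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time          { return f.t }
func (f *fakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }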
For a large application, define one Clock interface (or import clockwork.Clock directly) and use it everywhere. The cost of one big interface is small; the benefit of consistency is large.
Schedulers and Cron-Style Tasks¶
A scheduler is the canonical time-driven system. Cron, Kubernetes CronJobs, GitHub Actions schedules — all of them.
Design¶
type Scheduler struct {
	clock clockwork.Clock
	mu    sync.Mutex
	jobs  []job
}
type job struct {
name string
next time.Time
every time.Duration
fn func(context.Context) error
}
func (s *Scheduler) Run(ctx context.Context) error {
for {
s.mu.Lock()
var nextJob *job
for i := range s.jobs {
if nextJob == nil || s.jobs[i].next.Before(nextJob.next) {
nextJob = &s.jobs[i]
}
}
s.mu.Unlock()
if nextJob == nil {
return nil
}
wait := nextJob.next.Sub(s.clock.Now())
select {
case <-ctx.Done():
return ctx.Err()
case <-s.clock.After(wait):
_ = nextJob.fn(ctx)
nextJob.next = nextJob.next.Add(nextJob.every)
}
}
}
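The test below also calls an AddJob method that the sketch above omits. A minimal version consistent with the Run loop (the method name and signature are taken from the test; the body is an assumption):

// AddJob registers a job that first fires `every` after the injected clock's
// current time and then repeats at that interval.
func (s *Scheduler) AddJob(name string, every time.Duration, fn func(context.Context) error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.jobs = append(s.jobs, job{
		name:  name,
		next:  s.clock.Now().Add(every),
		every: every,
		fn:    fn,
	})
}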
Test¶
func TestSchedulerFiresOnSchedule(t *testing.T) {
fc := clockwork.NewFakeClock()
s := &Scheduler{clock: fc}
fires := make(chan string, 10)
s.AddJob("a", 5*time.Minute, func(ctx context.Context) error {
fires <- "a"
return nil
})
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
go s.Run(ctx)
for i := 0; i < 3; i++ {
fc.BlockUntil(1)
fc.Advance(5 * time.Minute)
if name := <-fires; name != "a" {
t.Fatalf("got %q", name)
}
}
}
Three fires of a 5-minute job in microseconds of wall time. The same test under real time would take 15 minutes.
Cron expressions¶
Cron expressions like 0 3 * * * (every day at 03:00 UTC) need a parser that takes a starting time.Time and returns the next fire time. Such parsers must accept the time, not call time.Now. The popular robfig/cron/v3 package does this correctly — Schedule.Next(t time.Time) takes an explicit timestamp.
When you write your own cron-style scheduler, follow the same rule: parsing returns a Schedule, scheduling asks Schedule.Next(clock.Now()).
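For example, with github.com/robfig/cron/v3 (ParseStandard and Schedule.Next are that package's API; the surrounding loop is a sketch that sleeps on an injected clock):

// runDaily03 runs fn every day at 03:00 in the location of clock.Now(),
// waiting on the injected clock rather than the wall clock.
func runDaily03(ctx context.Context, clock clockwork.Clock, fn func()) error {
	spec, err := cron.ParseStandard("0 3 * * *")
	if err != nil {
		return err
	}
	for {
		wait := spec.Next(clock.Now()).Sub(clock.Now())
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-clock.After(wait):
			fn()
		}
	}
}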
Heartbeat, Lease, and Leader Election¶
A leader election in a distributed system holds a lease for, say, 10 seconds. The leader renews every 3 seconds; if renewals stop for more than 10 seconds, the lease expires and another node may become leader.
Design¶
type Lease struct {
clock Clock
holder string
expireAt time.Time
duration time.Duration
}
func (l *Lease) Renew(now time.Time) {
l.expireAt = now.Add(l.duration)
}
func (l *Lease) Valid() bool {
return l.clock.Now().Before(l.expireAt)
}
Test for a missed heartbeat¶
func TestLeaseExpiresOnMissedHeartbeat(t *testing.T) {
fc := clockwork.NewFakeClock()
l := &Lease{clock: fc, duration: 10 * time.Second}
l.Renew(fc.Now())
fc.Advance(9 * time.Second)
if !l.Valid() {
t.Fatal("still within lease")
}
fc.Advance(2 * time.Second) // 11s total
if l.Valid() {
t.Fatal("lease should have expired at 10s")
}
}
Test for a renewed lease¶
func TestLeaseRenewExtends(t *testing.T) {
fc := clockwork.NewFakeClock()
l := &Lease{clock: fc, duration: 10 * time.Second}
l.Renew(fc.Now())
fc.Advance(9 * time.Second)
l.Renew(fc.Now()) // renew at 9s
fc.Advance(9 * time.Second) // total 18s
if !l.Valid() {
t.Fatal("should still be valid after renewal")
}
}
A more complete leader loop¶
type LeaderLoop struct {
clock Clock
renewInterval time.Duration
leaseDuration time.Duration
onLost func()
}
func (l *LeaderLoop) Run(ctx context.Context, renew func() error) error {
ticker := l.clock.NewTicker(l.renewInterval)
defer ticker.Stop()
last := l.clock.Now()
for {
select {
case <-ctx.Done():
return ctx.Err()
case <-ticker.Chan():
if err := renew(); err != nil {
if l.clock.Now().Sub(last) > l.leaseDuration {
l.onLost()
return err
}
continue
}
last = l.clock.Now()
}
}
}
Testing this needs BlockUntil after every Advance because the ticker re-arms.
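A sketch of such a test, leaning on the note above that the re-armed fake ticker counts as a sleeper for BlockUntil; the always-failing renew callback is an assumption used to force the lost-lease path:

func TestLeaderLoopLosesLeaseAfterFailedRenewals(t *testing.T) {
	fc := clockwork.NewFakeClock()
	lost := make(chan struct{})
	l := &LeaderLoop{
		clock:         fc,
		renewInterval: 3 * time.Second,
		leaseDuration: 10 * time.Second,
		onLost:        func() { close(lost) },
	}
	errs := make(chan error, 1)
	go func() {
		errs <- l.Run(context.Background(), func() error { return errors.New("renew failed") })
	}()
	// Four failed renewals: 4 x 3s = 12s, which exceeds the 10s lease.
	// Block for the (re-armed) ticker before every Advance.
	for i := 0; i < 4; i++ {
		fc.BlockUntil(1)
		fc.Advance(3 * time.Second)
	}
	<-lost
	if err := <-errs; err == nil {
		t.Fatal("Run should return the renewal error once the lease is lost")
	}
}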
Hybrid Logical Clocks and Distributed Time¶
In a distributed system, each node has its own clock. time.Now() on node A differs from node B by tens of milliseconds even under NTP. CockroachDB and many similar systems use a Hybrid Logical Clock (HLC) that combines wall-clock time with a Lamport-style counter to produce globally-monotonic timestamps.
Sketch¶
type HLC struct {
mu sync.Mutex
physical time.Time // last wall-clock reading
logical uint32 // counter for ties
clock Clock // injectable
}
func (h *HLC) Now() Timestamp {
h.mu.Lock()
defer h.mu.Unlock()
pt := h.clock.Now()
if pt.After(h.physical) {
h.physical = pt
h.logical = 0
} else {
h.logical++
}
return Timestamp{Physical: h.physical, Logical: h.logical}
}
func (h *HLC) Update(remote Timestamp) Timestamp {
h.mu.Lock()
defer h.mu.Unlock()
pt := h.clock.Now()
switch {
case pt.After(h.physical) && pt.After(remote.Physical):
h.physical = pt
h.logical = 0
case h.physical.After(remote.Physical):
h.logical++
case remote.Physical.After(h.physical):
h.physical = remote.Physical
h.logical = remote.Logical + 1
default:
h.logical = max(h.logical, remote.Logical) + 1
}
return Timestamp{Physical: h.physical, Logical: h.logical}
}
Without a Clock, this is untestable. With one, you can model:
- Clock skew between nodes (two HLCs with two different fake clocks).
- Clock jumps backwards (advance one fake clock by a negative duration).
- Concurrent updates from many nodes (two goroutines, both with BlockUntil).
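The sketch and the test below also assume a Timestamp type with a Compare method; a minimal version might be:

// Timestamp is the HLC value: a wall-clock component plus a logical counter
// that breaks ties.
type Timestamp struct {
	Physical time.Time
	Logical  uint32
}

// Compare orders by physical time first, then by logical counter.
// It returns -1, 0, or +1.
func (t Timestamp) Compare(o Timestamp) int {
	switch {
	case t.Physical.Before(o.Physical):
		return -1
	case t.Physical.After(o.Physical):
		return 1
	case t.Logical < o.Logical:
		return -1
	case t.Logical > o.Logical:
		return 1
	}
	return 0
}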
Test for ordering under skew¶
func TestHLCMonotonicAcrossNodes(t *testing.T) {
fa := clockwork.NewFakeClockAt(time.Unix(1000, 0))
fb := clockwork.NewFakeClockAt(time.Unix(990, 0)) // b is 10s behind
a := &HLC{clock: fa}
b := &HLC{clock: fb}
ts := a.Now()
received := b.Update(ts)
if received.Compare(ts) <= 0 {
t.Fatal("HLC violated monotonicity under negative skew")
}
}
You cannot run this test against real wall clocks. Fake clocks make it routine.
Testing Across Goroutine Trees¶
Real systems have a tree of goroutines: server → handler → worker pool → connection. Each layer may set timers. A fake clock has to drive all of them.
One clock for the tree¶
fc := clockwork.NewFakeClock()
server := NewServer(WithClock(fc))
client := NewClient(WithClock(fc))
Production code constructs all components with the same clock. In a real binary that clock is the real one; in tests, the fake one.
BlockUntil(n) for multiple sleepers¶
The n is the number of goroutines blocked on the fake clock at a single instant, not a cumulative count. If your code arms a timer, fires it, and re-arms in a loop, n is still the number of sleepers blocked at that moment.
synctest.Wait as the goroutine-tree barrier¶
synctest.Wait is BlockUntil generalised: it waits for the whole bubble to be quiescent, including channel receives and other blocking operations. For complex trees, synctest is friendlier than chasing sleeper counts.
synctest.Run(func() {
server := NewServer()
client := NewClient(server.Addr())
go client.Run()
time.Sleep(time.Hour) // fake; advances when all goroutines block
synctest.Wait()
// assert
})
Goroutines that never block on time¶
If a goroutine only ever blocks on for { select { case x := <-ch: ... } }, it is never "sleeping" on the clock, so BlockUntil will never count it. Keep that in mind when choosing n: count only the goroutines that actually block on the fake clock, and drive the rest through their channels (or use synctest.Wait, which also waits on channel blocks).
Combining Fake Clocks with context¶
context.WithDeadline and context.WithTimeout use time.Now internally. They are not affected by your fake clock unless you build a Clock-aware variant.
Why this matters¶
ctx, cancel := context.WithTimeout(ctx, 5*time.Minute)
defer cancel()
err := op(ctx) // op uses fake clock, but ctx is real
If op blocks on <-fc.After(time.Hour), advancing the fake clock by an hour does not cancel ctx. The real-time deadline still fires after five wall-clock minutes, so the test passes, just slowly. Mixing fake and real time produces tests that are slow at best and flaky at worst.
Solution 1: do not use context.WithTimeout in tests under fake clock¶
Provide a CancelOn helper that uses the fake clock:
func CancelOn(ctx context.Context, clock Clock, d time.Duration) (context.Context, context.CancelFunc) {
ctx, cancel := context.WithCancel(ctx)
go func() {
select {
case <-ctx.Done():
case <-clock.After(d):
cancel()
}
}()
return ctx, cancel
}
Now the context's effective deadline is on the fake clock.
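In a test, the effective deadline then moves with the fake clock; BlockUntil makes sure the helper goroutine has armed After before advancing:

ctx, cancel := CancelOn(context.Background(), fc, 5*time.Minute)
defer cancel()
fc.BlockUntil(1)            // the CancelOn goroutine is parked on clock.After
fc.Advance(5 * time.Minute)
<-ctx.Done()                // closes without any wall-clock wait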
Solution 2: use synctest so context.WithTimeout is also fake¶
Inside synctest.Run, time.Now is fake, so context.WithTimeout is fake. This is one of the killer features of synctest: you get fake context deadlines for free.
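A sketch with the experimental testing/synctest package (Go 1.24, behind GOEXPERIMENT=synctest): both the Sleep and the context's internal timer run on the bubble's fake clock.

func TestTimeoutUnderSynctest(t *testing.T) {
	synctest.Run(func() {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
		defer cancel()
		time.Sleep(6 * time.Minute) // fake: the deadline fires during this sleep
		if ctx.Err() == nil {
			t.Error("deadline should have fired")
		}
	})
}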
Test-Suite Architecture for Time-Heavy Code¶
A project with hundreds of time-dependent tests benefits from convention.
One helper file: internal/testclock¶
package testclock
import (
	"testing"
	"time"

	"github.com/jonboulle/clockwork"
)
func NewFakeClock(t *testing.T) clockwork.FakeClock {
fc := clockwork.NewFakeClockAt(time.Unix(1_700_000_000, 0))
t.Cleanup(func() { /* possibly drain sleepers */ })
return fc
}
Every test calls testclock.NewFakeClock(t) instead of constructing its own. Fixing a bug in the helper fixes it for every test at once.
Deterministic starting time¶
NewFakeClockAt(time.Unix(N, 0)) gives every test the same start. Assertions like entry.expireAt == time.Unix(N+TTL, 0) are now trivial.
Parallel tests get their own clocks¶
A clock shared across parallel tests causes cascading flakes: one test's Advance fires timers that another test armed.
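A minimal sketch, reusing the testclock helper and the TTL cache from earlier (Set and Get are hypothetical methods, used only for illustration):

func TestTTLExpiry(t *testing.T) {
	t.Parallel()
	fc := testclock.NewFakeClock(t) // local to this test; no other test can Advance it
	c := NewCache(5*time.Minute, WithClock(fc))
	c.Set("k", "v")
	fc.Advance(6 * time.Minute)
	if _, ok := c.Get("k"); ok {
		t.Fatal("entry should have expired")
	}
}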
Tag time-heavy tests for selective running¶
Genuinely time-heavy tests, the few that need real wall-clock waits, get a build tag or a testing.Short() guard so CI can run them in a separate job. Fake-clock tests are fast enough to run with everything else in day-to-day development.
Operational Concerns: Real-Time Drift, NTP Jumps, Sleep Anomalies¶
Real clocks misbehave in ways your tests must simulate.
Backwards jumps¶
NTP may step the clock backwards by seconds. Most code does not handle this. Tests can simulate it with fc.Advance(-time.Hour) (clockwork allows negative advances). Verify that your TTL cache handles a negative remaining duration sanely instead of, say, turning it into an enormous value through an unsigned conversion.
Forward jumps¶
The same applies to forward steps of minutes. Verify that a rate limiter does not hand out a flood of tokens because the elapsed time suddenly appears to be minutes long.
Suspend / resume¶
After a laptop wakes from sleep, time.Now() jumps forward by hours. The monotonic clock keeps increasing, but on most platforms it does not include the time spent suspended, so time.Since reports the elapsed monotonic time while time.Now() minus a stored wall-clock reading shows the full jump. Pick the right comparison in production code; test both.
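For the wall-clock comparison, Round(0) strips the monotonic reading from a time.Time (documented behavior of the time package), so the subtraction falls back to wall-clock arithmetic:

start := time.Now()
// ... machine suspends and resumes ...
elapsedMono := time.Since(start)              // monotonic: may exclude the suspended period
elapsedWall := time.Now().Sub(start.Round(0)) // Round(0) drops the monotonic reading: wall-clock delta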
Leap seconds¶
The Go time package does not model leap seconds explicitly. In long-running production they are usually a non-issue, but if your code aggregates over exact second boundaries, write a test that advances across a simulated leap-second boundary and check the bucketing.
Time zone changes¶
time.LoadLocation reads OS data; a fake clock has no opinion. Make sure all comparisons use UTC or an explicit location, and test daylight-saving boundaries with appropriate time.Time values, not just durations.
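For example, constructing times on either side of a DST transition (Europe/Berlin moved to summer time on 2021-03-28 at 02:00 local; the date is chosen here purely for illustration):

loc, err := time.LoadLocation("Europe/Berlin")
if err != nil {
	t.Fatal(err)
}
before := time.Date(2021, 3, 28, 1, 30, 0, 0, loc) // 30 minutes before spring-forward
after := before.Add(time.Hour)                     // one elapsed hour later: 03:30 local; the 02:xx hour never existed
fc := clockwork.NewFakeClockAt(before)
fc.Advance(time.Hour)
if !fc.Now().Equal(after) {
	t.Fatal("fake clock should land on the post-transition instant")
}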
Anti-Patterns at Scale¶
Multiple Clock interfaces in one binary¶
pkg/cache.Clock, pkg/limit.Clock, pkg/health.Clock. Each has a slightly different method set. A test now must construct three fakes, all wired to the same fake clock. Either unify on clockwork.Clock or accept the wiring boilerplate.
A fake clock in production through a build tag mistake¶
If the test build tag accidentally ships, the production server's auth tokens never expire. Use injection, not build tags.
Sharing the fake clock through a global¶
var TestClock clockwork.FakeClock // package-level
func TestX(t *testing.T) {
TestClock = clockwork.NewFakeClock()
...
}
Parallel tests race the global. Use a local per test.
Drift-by-Sleep in CI¶
A test that uses real time.Sleep(100*time.Millisecond) to "give the goroutine a chance to do its thing" passes on a fast laptop, fails on a heavily-loaded CI runner. Use BlockUntil or synctest.Wait.
Asserting "approximately N seconds"¶
With a fake clock you can assert ==. Approximations only belong in real-time tests.
Cheat Sheet¶
DESIGN:
- Inject Clock at construction, not per-call
- Use functional options: NewX(opts ...Option), WithClock
- One clock per binary; pass it through the goroutine tree
- context.WithTimeout is real-time unless inside synctest
PATTERNS:
- Scheduler: select on ticker, BlockUntil after each Advance
- Heartbeat: Renew(clock.Now()), Valid() compares with clock.Now()
- HLC: physical = clock.Now(), logical breaks ties
DISTRIBUTED:
- Multiple fake clocks for multi-node tests
- Negative Advance for backwards-jumping NTP
- Forward Advance for laptop wake
TEST SUITE:
- testclock.NewFakeClock(t) helper
- NewFakeClockAt(deterministic time)
- Each parallel test owns its clock
- synctest.Wait > BlockUntil for goroutine trees
AVOID:
- Real time.Sleep in tests
- Multiple Clock interfaces in one binary
- context.WithTimeout under a fake clock (use CancelOn or synctest)
- Asserting approximate durations under fake time
Summary¶
At senior level you treat Clock as a foundational dependency: every component that touches time accepts one at construction, the binary creates exactly one in main, and the test suite has a helper that builds a deterministic fake. Schedulers, leases, leader-election, hybrid logical clocks — all of them are testable in a few hundred lines when time is a parameter. The hard parts are not the libraries (clockwork and benbjohnson/clock are both fine and synctest covers the rest) but the architecture: one clock per binary, no shared globals, parallel tests own their clocks, context.WithTimeout lives inside synctest or behind a fake-aware helper. Once that structure is in place, a 5,000-test suite stays fast and never flakes on time-related assertions.