Graceful Shutdown — Specification¶
Table of Contents¶
- Introduction
- Go Language Spec References
- os/signal Package Contract
- context Package Contract
- net/http.Server.Shutdown Contract
- Kubernetes Pod Lifecycle Specification
- UNIX Signal Specification
- Version Compatibility
- Undefined or Implementation-Specific Behaviour
- References
Introduction¶
This file collects the normative contracts for graceful shutdown in Go. The contracts come from three sources:
- The Go standard library documentation (pkg.go.dev).
- The Kubernetes documentation (kubernetes.io/docs).
- The POSIX/Linux specification for signals.
Where the documentation is ambiguous, this file notes the de facto behaviour observed in the Go source (1.21+). The aim is precision; user-facing patterns are in the level files.
Go Language Spec References¶
The Go language specification (go.dev/ref/spec) does not directly mention graceful shutdown. Relevant indirect concepts:
- Goroutines. "When the function terminates, its goroutine also terminates."
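- Channels. Closing a channel records that no more values will be sent on it. Once the values already sent have been received, further receives return the zero value without blocking, which lets receivers detect closure.
- defer statement. Deferred functions are invoked immediately before the surrounding function returns, in the reverse order they were deferred.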
None of these are shutdown-specific; they are foundational mechanisms used by shutdown patterns.
The spec also defines select semantics (random selection among ready cases) which is critical for shutdown-aware code.
os/signal Package Contract¶
From pkg.go.dev/os/signal:
func Notify(c chan<- os.Signal, sig ...os.Signal)¶
Notify causes package signal to relay incoming signals to c. If no signals are provided, all incoming signals will be relayed to c. Otherwise, just the provided signals will.
Package signal will not block sending to c: the caller must ensure that c has sufficient buffer space to keep up with the expected signal rate.
It is allowed to call Notify multiple times with the same channel: each call expands the set of signals sent to that channel.
Key normative points:
- Signal delivery is non-blocking. A full channel silently drops the signal.
- The caller is responsible for buffer sizing.
- Multiple calls accumulate signals.
func Stop(c chan<- os.Signal)¶
Stop causes package signal to stop relaying incoming signals to c. It undoes the effect of all prior calls to Notify using c.
After Stop, the channel no longer receives signals. The runtime's reference count for the registered signals is decremented; if it drops to zero, the signal's handling reverts to default.
func NotifyContext(parent context.Context, signals ...os.Signal) (ctx context.Context, stop context.CancelFunc)¶
Added in Go 1.16.
NotifyContext returns a copy of the parent context that is marked done (its Done channel is closed) when one of the listed signals arrives, when the returned stop function is called, or when the parent context's Done channel is closed, whichever happens first.
The stop function unregisters the signal behavior, which, like signal.Reset, may restore the default behavior for a given signal.
Normative points:
- The returned context is a derived context.
- It is cancelled on first signal arrival.
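- stop deregisters and may restore default behaviour.
- Multiple signals are coalesced: only the first triggers cancellation, and later ones have no further effect on the context. Until stop is called they are still intercepted rather than given their default behaviour.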
func Reset(sig ...os.Signal)¶
Reset undoes the effect of any prior calls to Notify for the provided signals. If no signals are provided, all signal handlers will be reset.
Reset is more drastic than Stop. It resets the process-wide handling.
func Ignored(sig os.Signal) bool¶
Ignored reports whether sig is currently ignored.
context Package Contract¶
From pkg.go.dev/context:
Context interface¶
A Context carries a deadline, a cancellation signal, and other values across API boundaries.
Methods:
- Deadline() (deadline time.Time, ok bool): returns the time after which work should be cancelled.
- Done() <-chan struct{}: returns a channel that is closed when work should be cancelled.
- Err() error: returns nil if Done is not yet closed; otherwise, returns a non-nil error.
- Value(key any) any: returns the value associated with key.
func WithCancel(parent Context) (Context, CancelFunc)¶
WithCancel returns a copy of parent with a new Done channel. The returned context's Done channel is closed when the returned cancel function is called or when the parent context's Done channel is closed, whichever happens first.
Cancel is idempotent. Calling cancel multiple times is safe.
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc)¶
WithTimeout returns WithDeadline(parent, time.Now().Add(timeout)).
The returned context is cancelled when the timeout elapses OR cancel is called OR parent is cancelled, whichever happens first.
func WithDeadline(parent Context, d time.Time) (Context, CancelFunc)¶
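The absolute-time form of WithTimeout: the returned context is cancelled at d, when cancel is called, or when the parent is cancelled, whichever happens first. If the parent's deadline is already earlier than d, the parent's deadline applies.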
func WithCancelCause(parent Context) (Context, CancelCauseFunc)¶
Added in Go 1.20.
WithCancelCause behaves like WithCancel but returns a CancelCauseFunc instead of a CancelFunc. Calling cancel with a non-nil error ("the cause") records that error in ctx; it can then be retrieved by calling Cause(ctx).
Useful for richer cancellation diagnostics.
func AfterFunc(ctx Context, f func()) (stop func() bool)¶
Added in Go 1.21.
AfterFunc arranges to call f in its own goroutine after ctx is done.
Useful for "run cleanup when context is cancelled" without spawning a goroutine yourself.
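The sketch below exercises both halves of the contract: f runs after cancellation, and the returned stop function reports true when it prevents f from running. Both helper names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runCleanupOnCancel registers a cleanup with context.AfterFunc, cancels
// the context, and reports whether the cleanup ran within one second.
func runCleanupOnCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	ran := make(chan struct{})
	context.AfterFunc(ctx, func() { close(ran) }) // f runs in its own goroutine

	cancel()
	select {
	case <-ran:
		return true
	case <-time.After(time.Second):
		return false
	}
}

// stopBeforeCancel reports the result of calling stop while ctx is live.
func stopBeforeCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	stop := context.AfterFunc(ctx, func() {})
	return stop() // true: f was unregistered before it could run
}

func main() {
	fmt.Println("cleanup ran:", runCleanupOnCancel())
	fmt.Println("stop prevented f:", stopBeforeCancel())
}
```

Note that AfterFunc does not wait for f to finish; code that needs the cleanup to complete before exiting must synchronise with it explicitly, as the channel does here.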
net/http.Server.Shutdown Contract¶
From pkg.go.dev/net/http#Server.Shutdown:
Shutdown gracefully shuts down the server without interrupting any active connections. Shutdown works by first closing all open listeners, then closing all idle connections, and then waiting indefinitely for connections to return to idle and then shut down. If the provided context expires before the shutdown is complete, Shutdown returns the context's error, otherwise it returns any error returned from closing the Server's underlying Listener(s).
When Shutdown is called, Serve, ListenAndServe, and ListenAndServeTLS immediately return ErrServerClosed. Make sure the program doesn't exit and waits instead for Shutdown to return.
Shutdown does not attempt to close nor wait for hijacked connections such as WebSockets. The caller of Shutdown should separately notify such long-lived connections of shutdown and wait for them to close, if desired.
Once Shutdown has been called on a server, it may not be reused; future calls to methods such as Serve will return ErrServerClosed.
Normative points:
- Shutdown closes listeners immediately.
- Idle connections close immediately.
- Active connections are allowed to finish.
- Hijacked connections are NOT tracked or waited for.
- ListenAndServe returns http.ErrServerClosed after Shutdown is called.
- A server cannot be reused after Shutdown.
func (srv *Server) Close() error¶
Close immediately closes all active net.Listeners and any connections in state StateNew, StateActive, or StateIdle. For a graceful shutdown, use Shutdown.
Close does not attempt to close (and does not even know about) any hijacked connections, such as WebSockets.
Close returns any error returned from closing the Server's underlying Listener(s).
Normative points:
- Close is brutal: interrupts all active connections.
- Hijacked connections are not tracked.
func (srv *Server) RegisterOnShutdown(f func())¶
RegisterOnShutdown registers a function to call on Shutdown. This can be used to gracefully shutdown connections that have undergone NPN/ALPN protocol upgrade or that have been hijacked. This function should start protocol-specific graceful shutdown, but should not wait for shutdown to complete.
Normative point: hooks are fire-and-forget. Shutdown does NOT wait for them.
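Since Shutdown does not wait for hooks, the caller has to. One possible arrangement is sketched below, assuming a hypothetical shutdownAndWaitForHooks helper that tracks hooks with a sync.WaitGroup.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"sync"
	"sync/atomic"
)

// shutdownAndWaitForHooks registers n hooks, calls Shutdown, then waits
// for the hooks itself. It returns how many hooks ran.
func shutdownAndWaitForHooks(n int) int {
	srv := &http.Server{}
	var (
		wg  sync.WaitGroup
		ran atomic.Int32
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		srv.RegisterOnShutdown(func() {
			defer wg.Done()
			ran.Add(1) // stand-in for protocol-specific shutdown work
		})
	}

	// Shutdown starts each hook in its own goroutine and does not wait.
	_ = srv.Shutdown(context.Background())
	wg.Wait() // explicit wait supplied by the caller
	return int(ran.Load())
}

func main() {
	fmt.Println("hooks completed:", shutdownAndWaitForHooks(3)) // hooks completed: 3
}
```

In production code the wg.Wait() call would normally be bounded by the same deadline that bounds Shutdown.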
http.ErrServerClosed¶
Returned by ListenAndServe, ListenAndServeTLS, and Serve after Shutdown or Close.
http.Server.BaseContext and http.Server.ConnContext¶
Added in Go 1.13.
BaseContext func(net.Listener) context.Context
ConnContext func(ctx context.Context, c net.Conn) context.Context
- BaseContext provides the base context for all new requests on a listener.
- ConnContext modifies the context for each new connection.
Setting BaseContext to return your application's root context makes all handlers' r.Context() cancellable by your shutdown logic.
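A sketch of that wiring, using net/http/httptest so it runs without a fixed port. handlerSeesRootCancel is a hypothetical helper, and the root context is cancelled before the request is sent only to keep the demonstration deterministic.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// handlerSeesRootCancel serves one request on a test server whose
// BaseContext returns root, cancels root, and returns the response body.
func handlerSeesRootCancel() (string, error) {
	root, cancel := context.WithCancel(context.Background())
	defer cancel()

	ts := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-r.Context().Done(): // derived from root via BaseContext
			fmt.Fprint(w, "cancelled")
		case <-time.After(2 * time.Second):
			fmt.Fprint(w, "still running")
		}
	}))
	ts.Config.BaseContext = func(net.Listener) context.Context { return root }
	ts.Start()
	defer ts.Close()

	cancel() // stand-in for the application's shutdown logic

	resp, err := http.Get(ts.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := handlerSeesRootCancel()
	fmt.Println(body, err)
}
```

Cancelling the root context only signals handlers; it does not close listeners or connections, so it complements Shutdown rather than replacing it.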
Kubernetes Pod Lifecycle Specification¶
From kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/:
Termination of pods¶
Kubernetes will not terminate Pods immediately. Rather, Kubernetes follows a graceful termination sequence that allows Pods to perform cleanup before being terminated.
The sequence:
- The user sends a delete command to the API server.
- The pod is marked Terminating.
- The kubelet detects the change.
- The kubelet runs the preStop hook (if defined).
- The kubelet sends SIGTERM to PID 1 in each container.
- The kubelet waits for the remainder of terminationGracePeriodSeconds; time spent in the preStop hook has already been deducted.
- If the container has not exited, the kubelet sends SIGKILL.
- The kubelet cleans up.
terminationGracePeriodSeconds¶
Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). If this value is nil, the default grace period will be used instead. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. Defaults to 30 seconds.
Default: 30. Set explicitly for clarity.
preStop hook¶
PreStop is called immediately before a container is terminated due to an API request or management event such as liveness/startup probe failure, preemption, resource contention, etc. The handler is not called if the container crashes or exits.
Hook types:
- exec: runs a command in the container.
- httpGet: makes an HTTP GET request to the container.
The hook's execution time counts against terminationGracePeriodSeconds.
Readiness probe¶
If the readiness probe fails, the endpoints controller removes the Pod's IP address from the endpoints of all Services that match the Pod.
During shutdown, flipping readiness to fail causes the pod to be removed from Service endpoints. Existing connections continue; new requests stop landing on this pod.
Liveness probe¶
Many applications running for long periods of time eventually transition to broken states, and cannot recover except by being restarted. Kubernetes provides liveness probes to detect and remedy such situations.
Liveness should NOT fail during shutdown. Failure triggers a container restart.
UNIX Signal Specification¶
From POSIX (pubs.opengroup.org/onlinepubs/9699919799/):
Signal definitions¶
| Name | Number (Linux x86/ARM) | Default Action |
|---|---|---|
| SIGHUP | 1 | Terminate |
| SIGINT | 2 | Terminate |
| SIGQUIT | 3 | Terminate + Core |
| SIGABRT | 6 | Terminate + Core |
| SIGKILL | 9 | Terminate (uncatchable) |
| SIGUSR1 | 10 | Terminate |
| SIGSEGV | 11 | Terminate + Core |
| SIGUSR2 | 12 | Terminate |
| SIGPIPE | 13 | Terminate |
| SIGALRM | 14 | Terminate |
| SIGTERM | 15 | Terminate |
| SIGCHLD | 17 | Ignore |
| SIGCONT | 18 | Continue (if stopped) |
| SIGSTOP | 19 | Stop (uncatchable) |
| SIGTSTP | 20 | Stop |
POSIX fixes only a handful of these numbers; the rest vary by platform (for example, SIGUSR1 and SIGUSR2 are 30 and 31 on macOS and the BSDs).
sigaction(2)¶
The POSIX-specified API for installing signal handlers. Go's runtime uses this.
Realtime signals (SIGRTMIN–SIGRTMAX)¶
Real-time signals are different from standard signals: they can be queued, they carry data, and they are delivered in order of priority.
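Go defines no named constants for realtime signals. On Linux they can still be requested by number through syscall.Signal, but the lowest-numbered ones are reserved by the C library's threading implementation and are not relayed.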
Signal masks¶
Each thread has a signal mask. Signals in the mask are blocked from delivery to that thread.
sigprocmask(2) modifies the mask. Go's runtime sets masks on its threads.
Version Compatibility¶
Go 1.7 and earlier¶
http.Server.Shutdown does not exist. Use third-party packages.
Go 1.8¶
Shutdown and Close added.
Go 1.9¶
RegisterOnShutdown added.
Go 1.13¶
BaseContext and ConnContext added to http.Server.
Go 1.14¶
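Asynchronous preemption is implemented with SIGURG, which the runtime sends to its own threads. A channel registered for SIGURG therefore receives frequent spurious deliveries, so don't subscribe to it.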
Go 1.16¶
signal.NotifyContext added.
Go 1.20¶
context.WithCancelCause and context.Cause.
Go 1.21¶
context.AfterFunc, context.WithDeadlineCause, context.WithTimeoutCause, and context.WithoutCancel.
Go 1.22¶
Loop variable scoping change: each iteration gets its own variable, so goroutines started inside a loop no longer share a single captured variable.
Go 1.23+¶
Iterators (range func). Not directly relevant but useful for shutdown utilities.
Undefined or Implementation-Specific Behaviour¶
A few areas where the spec is loose or implementation-defined:
Polling interval in Shutdown¶
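Shutdown polls for quiescent connections with a backoff that starts near 1 ms and doubles up to a 500 ms cap. The cap is an unexported constant in net/http; it is not specified externally and may change.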
Order of OnShutdown callbacks¶
Callbacks run in their own goroutines; order is non-deterministic. The spec does not promise any order.
Signal coalescing¶
The spec says signals "may be coalesced." In practice, standard signals coalesce; realtime signals do not.
Channel ordering in signal.Notify¶
If multiple channels are registered for the same signal, the order they receive the signal is not specified. In practice, it depends on map iteration order in Go.
Hijacked connection lifecycle¶
After Hijack, the spec says the server "doesn't even know about" the connection. No further guarantees.
runtime.NumGoroutine() during shutdown¶
Returns the current count, but the count is a snapshot. By the time you read it, the count may differ.
os.Exit and asynchronous goroutines¶
os.Exit terminates the process immediately and deferred functions are not run. There is no guarantee about how far other goroutines have progressed when the process exits.
Container PID 1 signal forwarding¶
Not specified by K8s. Depends on the container init. Best practice: ensure your binary is PID 1 or use a forwarding init.
References¶
Go standard library¶
- pkg.go.dev/os/signal
- pkg.go.dev/context
- pkg.go.dev/net/http
- pkg.go.dev/golang.org/x/sync/errgroup
Go source¶
- src/os/signal/
- src/runtime/signal_unix.go
- src/runtime/sigqueue.go
- src/context/context.go
- src/net/http/server.go
Kubernetes¶
- kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/
- kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
- kubernetes.io/docs/tasks/configure-pod-container/
POSIX / Linux¶
- man 7 signal
- man 2 sigaction
- man 2 kill
- IEEE Std 1003.1-2017 (POSIX)
Container runtimes¶
- containerd.io/docs
- github.com/opencontainers/runc
- github.com/krallin/tini
- github.com/Yelp/dumb-init
Service mesh¶
- istio.io/docs
- linkerd.io/docs
- envoyproxy.io/docs
Related Go Roadmap files¶
- Goroutines — foundational concurrency.
- Context — cancellation primitive.
- Channels — synchronisation primitive.
- Production patterns — broader context.
A working knowledge of each reference is the foundation of professional-level mastery.
Appendix: Detailed Function Signatures¶
For quick reference:
// os/signal
func Notify(c chan<- os.Signal, sig ...os.Signal)
func Stop(c chan<- os.Signal)
func NotifyContext(parent context.Context, signals ...os.Signal) (ctx context.Context, stop context.CancelFunc)
func Reset(sig ...os.Signal)
func Ignored(sig os.Signal) bool
// context
type Context interface {
Deadline() (deadline time.Time, ok bool)
Done() <-chan struct{}
Err() error
Value(key any) any
}
func Background() Context
func TODO() Context
func WithCancel(parent Context) (Context, CancelFunc)
func WithCancelCause(parent Context) (Context, CancelCauseFunc)
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc)
func WithDeadline(parent Context, d time.Time) (Context, CancelFunc)
func AfterFunc(ctx Context, f func()) (stop func() bool)
func Cause(c Context) error
// net/http
type Server struct {
Addr string
Handler Handler
TLSConfig *tls.Config
ReadTimeout time.Duration
ReadHeaderTimeout time.Duration
WriteTimeout time.Duration
IdleTimeout time.Duration
MaxHeaderBytes int
ConnState func(net.Conn, ConnState)
BaseContext func(net.Listener) context.Context
ConnContext func(ctx context.Context, c net.Conn) context.Context
ErrorLog *log.Logger
}
func (srv *Server) ListenAndServe() error
func (srv *Server) ListenAndServeTLS(certFile, keyFile string) error
func (srv *Server) Serve(l net.Listener) error
func (srv *Server) ServeTLS(l net.Listener, certFile, keyFile string) error
func (srv *Server) Shutdown(ctx context.Context) error
func (srv *Server) Close() error
func (srv *Server) RegisterOnShutdown(f func())
func (srv *Server) SetKeepAlivesEnabled(v bool)
// golang.org/x/sync/errgroup (versioned separately from Go releases)
type Group struct{ /* unexported fields */ }
func WithContext(ctx context.Context) (*Group, context.Context)
func (g *Group) Go(f func() error)
func (g *Group) TryGo(f func() error) bool
func (g *Group) Wait() error
func (g *Group) SetLimit(n int)
These signatures are the bedrock. Memorise them.
Appendix: Signal Constants¶
// from package syscall on linux/amd64; numeric values differ on other platforms
const (
SIGABRT = Signal(0x6)
SIGALRM = Signal(0xe)
SIGBUS = Signal(0x7)
SIGCHLD = Signal(0x11)
SIGCONT = Signal(0x12)
SIGFPE = Signal(0x8)
SIGHUP = Signal(0x1)
SIGILL = Signal(0x4)
SIGINT = Signal(0x2)
SIGIO = Signal(0x1d)
SIGIOT = Signal(0x6)
SIGKILL = Signal(0x9)
SIGPIPE = Signal(0xd)
SIGPROF = Signal(0x1b)
SIGPWR = Signal(0x1e)
SIGQUIT = Signal(0x3)
SIGSEGV = Signal(0xb)
SIGSTKFLT = Signal(0x10)
SIGSTOP = Signal(0x13)
SIGSYS = Signal(0x1f)
SIGTERM = Signal(0xf)
SIGTRAP = Signal(0x5)
SIGTSTP = Signal(0x14)
SIGTTIN = Signal(0x15)
SIGTTOU = Signal(0x16)
SIGURG = Signal(0x17)
SIGUSR1 = Signal(0xa) // varies by platform
SIGUSR2 = Signal(0xc)
SIGVTALRM = Signal(0x1a)
SIGWINCH = Signal(0x1c)
SIGXCPU = Signal(0x18)
SIGXFSZ = Signal(0x19)
)
For graceful shutdown, the relevant constants are SIGINT, SIGTERM, and SIGHUP.
Appendix: Behaviour Matrix¶
A reference table for each Go feature:
| Feature | Go 1.7 | 1.8 | 1.9 | 1.13 | 1.14 | 1.16 | 1.20 | 1.21 |
|---|---|---|---|---|---|---|---|---|
| http.Server.Shutdown | - | yes | yes | yes | yes | yes | yes | yes |
| RegisterOnShutdown | - | - | yes | yes | yes | yes | yes | yes |
| BaseContext | - | - | - | yes | yes | yes | yes | yes |
| ConnContext | - | - | - | yes | yes | yes | yes | yes |
| Async preemption | - | - | - | - | yes | yes | yes | yes |
| signal.NotifyContext | - | - | - | - | - | yes | yes | yes |
| WithCancelCause | - | - | - | - | - | - | yes | yes |
| AfterFunc | - | - | - | - | - | - | - | yes |
errgroup.SetLimit and errgroup.TryGo are omitted: errgroup lives in golang.org/x/sync, which is versioned independently of Go releases.
A service targeting 1.21+ has the cleanest API. Older versions still work but require more boilerplate.
Appendix: Container Runtime Compatibility¶
Common container runtimes and their support for graceful behaviour:
| Runtime | preStop | SIGTERM forwarding | Notes |
|---|---|---|---|
| containerd | yes | depends on PID 1 | Most common in K8s |
| CRI-O | yes | depends on PID 1 | RHEL/OpenShift default |
| Docker (with shim) | yes | depends on PID 1 | Legacy |
| podman | yes | depends on PID 1 | Daemonless alternative |
| gVisor | yes | depends on PID 1 | Sandboxed; extra latency |
| Kata | yes | depends on PID 1 | VM-based; extra latency |
All support the basic K8s lifecycle hooks. Signal forwarding always depends on container PID 1.
Appendix: Error Types¶
Errors specific to graceful shutdown:
- http.ErrServerClosed: returned by ListenAndServe after Shutdown/Close. Always check with errors.Is.
- context.Canceled: returned by ctx.Err() after cancel().
- context.DeadlineExceeded: returned by ctx.Err() after the deadline passes.
errors.Is(err, context.DeadlineExceeded) // shutdown deadline reached
errors.Is(err, http.ErrServerClosed) // normal shutdown
errors.Is(err, context.Canceled) // explicit cancel
Appendix: API Stability Notes¶
These APIs are stable and unlikely to change:
- os/signal.Notify, Stop
- os/signal.NotifyContext (since 1.16)
- context.WithCancel, WithTimeout, WithDeadline
- http.Server.Shutdown, Close, RegisterOnShutdown
- errgroup.Group.Go, Wait
These are evolving:
- context.WithCancelCause (since 1.20)
- context.AfterFunc (since 1.21)
- errgroup.SetLimit (golang.org/x/sync; not tied to a Go release)
Stable APIs are safe to depend on. Evolving APIs may gain features but should not break.
Appendix: A Final Reading Guide¶
For deepest understanding, in order:
- pkg.go.dev/os/signal: start here.
- pkg.go.dev/context: the cancellation primitive.
- pkg.go.dev/net/http#Server: the HTTP server.
- kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/: K8s lifecycle.
- man 7 signal (UNIX): signal mechanics.
- The Go source (paths listed above): the implementation.
A weekend reading these — and the level files — leaves you with comprehensive command of the topic.