net/http Server Concurrency — Junior Level¶

Table of Contents¶

Introduction
Prerequisites
Glossary
Core Concepts
Your first server
What is a goroutine-per-connection model
Lifecycle of one HTTP request
Handler runs on its own goroutine
Many requests at the same time
Persistent connections (keep-alive)
HTTP/2 in one paragraph
ResponseWriter is not safe for concurrent use
Request bodies and concurrency
r.Context() and cancellation
Spawning goroutines from handlers
Race conditions in handlers
Shared state across handlers
The race detector
Graceful shutdown for beginners
Server timeouts in plain terms
Mental models
Coding patterns
Clean code
Error handling
Security considerations
Performance tips
Best practices
Edge cases and pitfalls
Common mistakes
Common misconceptions
Tricky points
Test
Tricky questions
Cheat sheet
Self-assessment checklist
Summary
What you can build
Further reading
Related topics

Introduction¶

Focus: what happens between http.ListenAndServe and your handler running, who runs your handler, and what concurrency rules you must respect.

The Go HTTP server makes a striking promise. You write one function, func(w http.ResponseWriter, r *http.Request), and the standard library serves it concurrently to as many clients as the operating system will let connect. You do not start any goroutines. You do not manage any thread pools. You do not poll any sockets. And yet thousands of clients can talk to your server at the same time, each receiving a correct response.

This file is about how that promise is implemented and what it asks of you in return. The trick is short: every new TCP connection becomes a new goroutine, and inside that goroutine the server reads the HTTP request, calls your handler, and writes the response, one request at a time. There is no thread pool. There is no async/await ceremony. A connection is a goroutine, and a goroutine is cheap.

But that simplicity hides a few rules. Your handler runs on a goroutine you didn't create. Other handlers, serving other clients, run on other goroutines you didn't create. If they share any state — a counter, a map, a buffer, a struct — that sharing is concurrent and unprotected by default. The same race-detector warnings, the same data-race rules, the same sync.Mutex discipline that you learn in basic Go concurrency apply directly inside HTTP handlers.

This file teaches you, at the junior level:

1. The goroutine-per-connection model: what it means, what it costs, what it gives you.
2. The lifecycle of one HTTP request, from Accept to your handler returning.
3. How keep-alive turns one connection into many requests, sharing one goroutine.
4. What http.ResponseWriter will and won't tolerate from your code.
5. What r.Context() is for, and why ignoring it leaks goroutines.
6. The most common concurrency bugs a junior programmer writes in handlers, and how to fix them.

You will not yet learn how (*conn).serve is written line by line (middle.md) or how HTTP/2 multiplexes streams over one connection with one framer goroutine and one writer goroutine (senior.md). You will learn the user-visible concurrency model: what to assume, what to avoid, what to test.

By the end you should be able to write a server that does not corrupt responses under concurrent clients, does not leak goroutines when clients disconnect, and shuts down gracefully on SIGINT.

Prerequisites¶

Required. You can write and run Go programs.
Required. You have used go func() and chan at least once.
Required. You understand that goroutines run concurrently and that shared variables can race.
Helpful. You have built a tiny HTTP client with http.Get and read a response.
Helpful. You have written a sync.Mutex-protected counter.

You do not need to know:

- Anything about HTTP/2 framing, settings, or flow control.
- The Go memory model in detail.
- How the runtime scheduler picks which OS thread runs your goroutine.

A working installation of Go 1.21 or later is assumed. Most examples work on Go 1.19 or later, the release that added atomic.Int64, atomic.Pointer, and http.MaxBytesError; examples that use log/slog need 1.21+.

Glossary¶

Term	Definition
Goroutine	A lightweight thread managed by the Go runtime. Cheap to create (~2 KB initial stack) and switch. Many goroutines run on a small number of OS threads.
net.Listener	An object that represents a server socket. Created by `net.Listen("tcp", ":8080")`. Its `Accept` method blocks until a new TCP client connects.
Accept loop	The `for { conn, _ := ln.Accept(); go serve(conn) }` pattern that turns one server socket into a stream of per-connection goroutines.
`*http.Server`	The Go struct that holds server configuration (timeouts, handlers, TLS) and runs the accept loop. `http.ListenAndServe` creates one internally.
`http.Handler`	An interface with one method: `ServeHTTP(ResponseWriter, *Request)`. Everything serving HTTP in Go ultimately implements this.
`http.HandlerFunc`	An adapter making a plain function `func(w, r)` satisfy `http.Handler`.
`http.ResponseWriter`	The interface a handler uses to send the response. NOT safe for concurrent use.
`*http.Request`	The parsed incoming request: method, URL, headers, body. Read-only from the handler's perspective (don't mutate).
Keep-alive	HTTP/1.1 default behaviour where one TCP connection serves many sequential requests.
Hijack	Calling `w.(http.Hijacker).Hijack()` to take over the raw TCP connection from the server. Used for WebSockets, custom protocols.
`r.Context()`	A `context.Context` tied to the request's lifetime. Cancelled when the client disconnects or the handler returns.
Graceful shutdown	Stopping the server such that in-flight requests are allowed to finish. `Server.Shutdown(ctx)`.
Forceful close	Stopping the server immediately, interrupting any in-flight requests. `Server.Close()`.
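Race condition	A bug whose outcome depends on the timing of concurrent operations. The kind this file focuses on is the data race: two goroutines access the same memory, at least one access is a write, and nothing synchronises them.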
`-race` flag	`go run -race` or `go test -race`: enables the race detector, which instruments memory accesses and reports races at runtime.
HTTP/1.1	Text-based HTTP. One request at a time per connection (with keep-alive, many sequential).
HTTP/2	Binary, multiplexed HTTP. Many concurrent streams per connection. Used over TLS by default.

Core Concepts¶

One connection = one goroutine¶

The single most important sentence in this file:

Every accepted TCP connection becomes one new goroutine. Your handler runs on that goroutine.

When you write:

http.ListenAndServe(":8080", handler)

the Go standard library does this, conceptually:

ln, _ := net.Listen("tcp", ":8080")
for {
    conn, _ := ln.Accept()
    go func() {
        // read the request from conn
        // call handler(w, r)
        // write the response to conn
        // maybe loop for keep-alive
        // close conn
    }()
}

This is the goroutine-per-connection model. It is the standard pattern for Go network servers. It is what makes the API so simple: you write a synchronous handler; the goroutine is your concurrency.
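The conceptual loop above can be made runnable with nothing but the net package. The following is a toy sketch, not the standard library's code: serveEcho and roundTrip are hypothetical names, and the "protocol" is just one line echoed back instead of HTTP. It only aims to show the shape of an accept loop that hands each connection to its own goroutine.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
)

// serveEcho runs a bare accept loop: every accepted connection gets its own
// goroutine, which reads one line and writes it back. It returns when the
// listener is closed.
func serveEcho(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go func(c net.Conn) {
			defer c.Close()
			line, err := bufio.NewReader(c).ReadString('\n')
			if err != nil {
				return
			}
			fmt.Fprintf(c, "echo: %s", line)
		}(conn)
	}
}

// roundTrip starts the toy server on a free local port, sends one line,
// and returns the reply.
func roundTrip(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go serveEcho(ln)

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := fmt.Fprintf(conn, "%s\n", msg); err != nil {
		return "", err
	}
	return bufio.NewReader(conn).ReadString('\n')
}

func main() {
	reply, err := roundTrip("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(reply)
}
```

The per-connection function is plain blocking code, which is the same property net/http gives your handlers.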

The handler is synchronous from your perspective¶

Inside a handler, code runs top to bottom. You don't have to think about callbacks, promises, or futures. You read the request body. You compute. You write the response. You return.

func handler(w http.ResponseWriter, r *http.Request) {
    name := r.URL.Query().Get("name")
    fmt.Fprintf(w, "hello, %s\n", name)
}

This handler is plain sequential code. It just happens to be running on a goroutine spawned by the server, simultaneously with many other instances of itself running for other clients.

Concurrency comes from other goroutines, not your handler¶

Your one call to your handler is sequential. But there are many concurrent calls to your handler, one per client. They run in parallel.

This is the key concurrency property:

Each call to your handler runs serially. Different calls to your handler run concurrently with each other.

Anything inside one call is safe in isolation. Anything shared across calls (a global variable, a heap-allocated struct, a database connection) must either be safe for concurrent use by design or be protected by your own synchronisation.

The server starts goroutines, never destroys them on its own¶

Once (*conn).serve is running, the server itself doesn't stop it; the goroutine ends when the handler returns AND the connection closes (for keep-alive, after all requests complete). If your handler blocks forever, that goroutine blocks forever. The server has no "kill thread" mechanism.

This is why honouring r.Context() matters: it is the signal that the server gives your handler to ask for early return.

Your first server¶

Let's write a server, run it, and observe the goroutines.

package main

import (
    "fmt"
    "log"
    "net/http"
    "runtime"
)

func handler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Hello from goroutine count: %d\n", runtime.NumGoroutine())
}

func main() {
    http.HandleFunc("/", handler)
    // ListenAndServe only returns on failure (for example, port already in use).
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Run it:

go run main.go

In another terminal:

$ curl http://localhost:8080/
Hello from goroutine count: 4
$ curl http://localhost:8080/
Hello from goroutine count: 4

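The number is small and does not grow between calls. runtime.NumGoroutine counts the goroutines your program is running at that instant: the main goroutine running the accept loop, the goroutine serving this connection, and any helpers the server started for it. Why doesn't it grow? Because each curl invocation is a one-shot HTTP/1.1 exchange: the client opens a connection, sends one request, reads the response, and closes. Once the connection closes, its server-side goroutine exits, so by the next curl it is gone.

Now let's keep a connection open. curl reuses a connection only within a single invocation (for example, when the same URL is passed twice), so the simplest approach is a small Go client: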

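package main

import (
    "fmt"
    "io"
    "log"
    "net/http"
    "time"
)

func main() {
    c := &http.Client{}
    for i := 0; i < 5; i++ {
        resp, err := c.Get("http://localhost:8080/")
        if err != nil {
            log.Fatal(err) // e.g. the server is not running
        }
        body, err := io.ReadAll(resp.Body)
        resp.Body.Close() // reading to EOF and closing lets the connection be reused
        if err != nil {
            log.Fatal(err)
        }
        fmt.Print(string(body))
        time.Sleep(time.Second)
    }
}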

The Go HTTP client reuses connections by default. The server sees five requests on one connection — same goroutine. The reported runtime.NumGoroutine() should stay constant.

What the server did under the hood¶

Here is a simplified view of what the standard library does behind http.ListenAndServe:

// roughly equivalent to ListenAndServe(":8080", nil)

ln, _ := net.Listen("tcp", ":8080")
srv := &http.Server{Handler: http.DefaultServeMux}

// the accept loop
for {
    conn, err := ln.Accept()
    if err != nil { /* shutdown handling */ }
    c := srv.newConn(conn)
    go c.serve(ctx)
}

Where c.serve(ctx) is, roughly:

func (c *conn) serve(ctx context.Context) {
    defer recover-and-cleanup()

    for { // keep-alive loop
        w, err := c.readRequest(ctx)
        if err != nil { return }
        c.server.Handler.ServeHTTP(w, w.req)
        w.finishRequest()
        if !c.canReuse() { return }
    }
}

So one TCP connection gives you one goroutine, and that goroutine handles many requests sequentially.

What is a goroutine-per-connection model¶

The phrase "goroutine-per-connection" deserves unpacking because it implies several things at once.

One: a goroutine is created per accepted connection¶

The accept loop creates a new goroutine each time Accept() returns a connection. The goroutine does not pre-exist and is not pulled from a pool; it is started with a plain go statement, go c.serve(ctx).

This is cheap. A goroutine starts with a 2 KB stack (which grows on demand) and a tiny runtime descriptor. Creating one is ~hundreds of nanoseconds. The runtime handles thousands per second easily.

Two: that goroutine owns the connection¶

The connection is not shared. The accept-loop goroutine never reads from or writes to it again. The serving goroutine has exclusive use of the net.Conn.

This is why you don't need locks around bufio.Reader and bufio.Writer that wrap the connection — only one goroutine touches them.

Three: the goroutine ends when the connection ends¶

When (*conn).serve returns — because the client closed, because keep-alive expired, because of an error — its goroutine ends. The conn is closed. The OS releases the file descriptor.

Four: there is no maximum¶

By default, the server accepts as many connections as the OS lets it accept. If a client opens 100,000 connections, the server spawns 100,000 goroutines. Each takes ~10-20 KB after a typical handler. 100k goroutines = ~1-2 GB. That's why production servers cap connection counts (see professional.md).

Five: HTTP/2 changes the rule¶

For HTTP/2, the rule is one goroutine per connection plus one per stream. Many streams per connection, each with its own handler goroutine. We'll cover this in detail in senior.md.

For HTTP/1.1 (the common case at this level), one goroutine handles one connection — and many requests on it if keep-alive is on.

Compare: thread-per-connection vs goroutine-per-connection¶

In C or Java, "thread per connection" becomes impractical at high connection counts because each OS thread typically reserves on the order of 1 MB of stack. You move to thread pools, event loops (epoll, kqueue), or async runtimes.

In Go, goroutines are managed by the runtime and multiplexed onto a small number of OS threads via the M:N scheduler. A goroutine blocked on network I/O, a channel, or a timer doesn't block its OS thread: the scheduler parks it and runs another. Goroutines waiting on idle network reads are entirely off-CPU.

So Go's "naive" goroutine-per-connection model gives you the programming model of thread-per-connection but the runtime cost of event-driven I/O.

Lifecycle of one HTTP request¶

Let's walk through what happens for one HTTP/1.1 request, from the client's connect() to the handler's return.

Step 1: Listener accept¶

The accept-loop goroutine calls ln.Accept(). From the goroutine's point of view this blocks; under the hood the runtime uses a non-blocking socket and its network poller, parking the goroutine until the kernel reports that a new connection is ready. When a client connects:

conn, _ := ln.Accept()

conn is a net.Conn backed by a TCP socket file descriptor.

Step 2: Spawn the per-connection goroutine¶

c := srv.newConn(conn)
go c.serve(ctx)

newConn wraps the raw conn in a *conn struct holding bufio.Reader and bufio.Writer, request-parsing state, hijack flag, etc. Then a new goroutine starts running c.serve(ctx).

Step 3: TLS handshake (if HTTPS)¶

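Inside c.serve, if the connection came from a TLS listener, it is already wrapped in a *tls.Conn and the handshake is run explicitly:

if tlsConn, ok := c.rwc.(*tls.Conn); ok {
    if err := tlsConn.HandshakeContext(ctx); err != nil { return }
}

This runs synchronously on the per-connection goroutine. The handshake takes one round trip with TLS 1.3 and two with TLS 1.2.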

Step 4: Read the request¶

req, err := http.ReadRequest(c.bufr)

ReadRequest parses the request line ("GET /foo HTTP/1.1") and headers. The body is left as an io.Reader that the handler can read from.

If headers haven't fully arrived, the read blocks. This is the wait that ReadHeaderTimeout bounds.

Step 5: Build ResponseWriter and Request¶

The server constructs an *http.response (the internal type implementing http.ResponseWriter) and a fully-populated *http.Request.

Step 6: Start the cancel-context goroutine¶

The server starts a background goroutine that watches the connection for a remote-close (peer sent FIN) and, if it occurs, cancels r.Context(). This is how r.Context().Done() becomes signalled when the client gives up.

Step 7: Call your handler¶

srv.Handler.ServeHTTP(w, req)

This is your code running. You read, compute, write.

Step 8: Finish the response¶

When ServeHTTP returns, the server's finishRequest:

- Flushes any buffered output (bufio.Writer.Flush).
- Reads and discards any remaining body bytes (so the next request can be parsed).
- Writes the chunked-encoding trailer if applicable.
- Updates per-connection state.

Step 9: Keep-alive decision¶

The server checks: was this a Connection: close request? Was there an error? Is the client HTTP/1.0? Based on this, either:

- Loop back to step 4 to read the next request on the same connection.
- Or close the connection.

Step 10: Connection close (eventually)¶

When the loop exits, the goroutine closes the conn, runs the ConnState callback with StateClosed, and returns. The goroutine ends.

Handler runs on its own goroutine¶

Now the key fact for application code:

Your handler(w, r) function runs on a goroutine that was spawned by the server. You did not start it.

This has practical implications.

You can use blocking operations freely¶

Inside a handler you can:

- Read a file synchronously.
- Run a time.Sleep.
- Make an HTTP request to another server.
- Query a database.

You will not block other requests. Other requests are on other goroutines.

func slowHandler(w http.ResponseWriter, r *http.Request) {
    time.Sleep(5 * time.Second)
    w.Write([]byte("done\n"))
}

This works fine even with thousands of concurrent clients. Each sleep is one goroutine. Go's scheduler parks sleeping goroutines off-CPU.

You can also block forever (bug!)¶

Because the server doesn't kill your goroutine, if you write:

func badHandler(w http.ResponseWriter, r *http.Request) {
    select {} // block forever
}

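That goroutine never exits, and the connection's file descriptor stays open with it. Setting WriteTimeout would not rescue you: it only puts a deadline on writes to the connection and does not stop a handler that is still running. Memory and file descriptors accumulate. This is a goroutine leak.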

What goroutine ID do handlers run on?¶

If you do:

func handler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "gid=%d\n", gid())
}

(where gid() extracts the goroutine ID from runtime.Stack), then:

- One curl: gid = some N (depends on server activity).
- Another curl (separate connection): different gid.
- Two curls with keep-alive (same connection, sequential): SAME gid.

This is observable proof that one connection = one goroutine, but the value of the goroutine ID is an implementation detail. Don't use goroutine IDs in production code.
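For experimentation only, a gid helper along these lines is one possible way to do it. It relies on an assumption that is not part of any API contract: that the first line of a stack trace has the form "goroutine N [running]:". The helper names are made up for this sketch.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"strconv"
)

// gid returns the current goroutine's ID by parsing the first line of its
// stack trace, assumed to look like "goroutine 18 [running]:".
// Debugging aid only; returns 0 if the format is not what we expect.
func gid() uint64 {
	buf := make([]byte, 64)
	buf = buf[:runtime.Stack(buf, false)]
	buf = bytes.TrimPrefix(buf, []byte("goroutine "))
	if i := bytes.IndexByte(buf, ' '); i >= 0 {
		buf = buf[:i]
	}
	id, _ := strconv.ParseUint(string(buf), 10, 64)
	return id
}

// gidInNewGoroutine reports the ID seen from a freshly started goroutine.
func gidInNewGoroutine() uint64 {
	ch := make(chan uint64)
	go func() { ch <- gid() }()
	return <-ch
}

func main() {
	fmt.Println("this goroutine:", gid())
	fmt.Println("new goroutine: ", gidInNewGoroutine())
}
```

The two printed values differ, which is the same effect you would observe between two separate connections.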

Many requests at the same time¶

Let's say 100 clients hit your server simultaneously. What happens?

client 1 -> Accept -> goroutine 100, handler running
client 2 -> Accept -> goroutine 101, handler running
client 3 -> Accept -> goroutine 102, handler running
...
client 100 -> Accept -> goroutine 199, handler running

100 goroutines, each running handler(w, r) in parallel. They don't interact unless they share state.

Shared state must be protected¶

If your handler reads or writes a package-level variable, 100 concurrent handlers race.

Buggy:

var requestCount int

func handler(w http.ResponseWriter, r *http.Request) {
    requestCount++ // race
    fmt.Fprintf(w, "request #%d\n", requestCount)
}

Under -race, you'll see:

WARNING: DATA RACE
Read at 0x... by goroutine 12:
  main.handler() ...
Previous write at 0x... by goroutine 9:
  main.handler() ...

Fixed with atomic:

import "sync/atomic"

var requestCount atomic.Int64

func handler(w http.ResponseWriter, r *http.Request) {
    n := requestCount.Add(1)
    fmt.Fprintf(w, "request #%d\n", n)
}

Fixed with mutex:

import "sync"

var mu sync.Mutex
var requestCount int

func handler(w http.ResponseWriter, r *http.Request) {
    mu.Lock()
    requestCount++
    n := requestCount
    mu.Unlock()
    fmt.Fprintf(w, "request #%d\n", n)
}

If a handler only touches its w and r parameters and local variables, it's automatically safe across calls. The handler doesn't share anything to race on.

func safe(w http.ResponseWriter, r *http.Request) {
    n := time.Now().UnixNano()
    fmt.Fprintf(w, "now=%d\n", n)
}

No shared state means no race is possible. (time.Now is safe to call from any number of goroutines.)

Concurrency in practice¶

Most production handlers DO share state: a database handle, a cache, a configuration struct, a logger. These are passed around safely because:

- *sql.DB is safe for concurrent use (internally pooled).
- A *log.Logger is safe for concurrent use (internally locked).
- Configuration is often read-only after init, so no synchronisation needed.
- Caches use sync.Map or mutex-protected maps.

Each shared resource in your design must be evaluated: is it safe for concurrent use? If not, lock it.
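One practical way to evaluate a handler's shared state is to call it from many goroutines and check the result, ideally under go test -race or go run -race. The sketch below does that in-process with httptest.NewRecorder, so no network is involved; countingHandler and hammer are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// hits is shared by every call to countingHandler, so it must be safe for
// concurrent use. atomic.Int64 needs Go 1.19+.
var hits atomic.Int64

func countingHandler(w http.ResponseWriter, r *http.Request) {
	n := hits.Add(1)
	fmt.Fprintf(w, "request #%d\n", n)
}

// hammer calls the handler from n goroutines at once and returns the final
// counter value.
func hammer(n int) int64 {
	hits.Store(0)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			rec := httptest.NewRecorder() // one recorder per call: never shared
			req := httptest.NewRequest(http.MethodGet, "/", nil)
			countingHandler(rec, req)
		}()
	}
	wg.Wait()
	return hits.Load()
}

func main() {
	fmt.Println("final count:", hammer(1000)) // prints "final count: 1000"
}
```

Swapping the atomic counter for a plain int would make the final count unreliable and would be flagged by the race detector.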

Persistent connections (keep-alive)¶

HTTP/1.1 connections are persistent by default (HTTP/1.0 clients opt in with Connection: keep-alive), so one TCP connection can carry many requests. After the response, the connection stays open; the next request arrives on the same conn.

Goroutine reuse¶

For keep-alive, the same per-connection goroutine handles all the requests on that conn:

goroutine N:
  read request 1
  call handler(w, r1)
  finish response 1
  read request 2          <-- same goroutine
  call handler(w, r2)     <-- same goroutine
  finish response 2
  ...

Sequentially within one conn, never concurrently.

Why this matters¶

If two handlers share a *sync.Mutex, contention between them only happens across different connections, not within one HTTP/1.1 connection, because requests on one connection run one after another. So a single client hammering one connection sees zero lock contention. Many concurrent clients on different connections see real contention.

`IdleTimeout`¶

After the response, the goroutine reads for the next request. If no request arrives within IdleTimeout, the connection closes. The goroutine ends.

Default IdleTimeout is 0, which means "fall back to ReadTimeout." If ReadTimeout is also 0, connections idle indefinitely (subject to OS keepalive).

Recommendation: always set IdleTimeout (e.g., 60*time.Second) so idle clients don't pin goroutines forever.
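Setting these fields means constructing an http.Server yourself instead of calling http.ListenAndServe. A minimal sketch follows; newServer is a hypothetical helper and the durations are placeholder values, not recommendations for any particular workload.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newServer returns an http.Server with explicit timeouts. The durations are
// illustrative placeholders; tune them for your own traffic.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 5 * time.Second,  // max wait for the request headers
		IdleTimeout:       60 * time.Second, // max wait between keep-alive requests
	}
}

func main() {
	srv := newServer(":8080", http.NewServeMux())
	fmt.Println("idle timeout:", srv.IdleTimeout)
	fmt.Println("read header timeout:", srv.ReadHeaderTimeout)
	// In a real program you would now call srv.ListenAndServe()
	// and handle the error it returns.
}
```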

Disabling keep-alive¶

srv.SetKeepAlivesEnabled(false)

After this, every response has Connection: close and the server closes the conn after responding. Use during shutdown to drain clients faster.

HTTP/2 in one paragraph¶

HTTP/2 is a binary, multiplexed protocol. One TCP connection carries many concurrent streams, each with its own request and response. The Go server (via golang.org/x/net/http2, bundled into net/http) implements HTTP/2 transparently when TLS is used:

One goroutine per TCP connection runs the framer (reads frames, decodes them).
Frames belong to streams; one goroutine per active stream runs the handler.
A separate goroutine serialises outgoing frames so they don't interleave on the wire.

So an HTTP/2 connection with 50 active streams is 1 framer goroutine + 1 writer goroutine + 50 handler goroutines = 52 goroutines for one conn.

For details, see senior.md. At the junior level: just know that HTTP/2 is transparent to your handler; the main handler-visible difference is that Hijack() doesn't work (HTTP/2 doesn't support raw conn takeover).

ResponseWriter is not safe for concurrent use¶

To quote the documentation:

A ResponseWriter may not be used after the Handler.ServeHTTP method has returned.

And less explicitly but equally true:

A ResponseWriter may not be used concurrently from multiple goroutines.

This means: only your handler goroutine should call w.Write, w.WriteHeader, w.Header().Set, etc. Never spawn a goroutine that also calls these.

Why?¶

http.ResponseWriter is backed by per-connection buffers and headers. The server's state machine assumes one goroutine drives the response. Two goroutines writing simultaneously can:

- Corrupt buffered bytes.
- Race on WriteHeader (which can be called only once).
- Race on header map updates.
- Race on chunked-encoding state.

Race example¶

func bad(w http.ResponseWriter, r *http.Request) {
    go func() {
        w.Write([]byte("from goroutine\n")) // RACE
    }()
    w.Write([]byte("from main\n"))
}

Running it with go run -race reports:

WARNING: DATA RACE
Write at 0x... by goroutine 8:
  net/http.(*response).Write ...
Previous write at 0x... by goroutine 7:
  net/http.(*response).Write ...

Use after handler returns¶

func worse(w http.ResponseWriter, r *http.Request) {
    go func() {
        time.Sleep(time.Second)
        w.Write([]byte("late\n")) // use-after-return
    }()
}

By the time the goroutine writes, the handler has returned. The server has cleaned up the response (potentially closed the conn). Writing now is undefined.

Pattern: forward via channel¶

If you must do work in a goroutine, return results to the handler via a channel:

func okPattern(w http.ResponseWriter, r *http.Request) {
    out := make(chan string, 1)
    go func() {
        out <- doSlowWork(r.Context())
    }()
    select {
    case s := <-out:
        w.Write([]byte(s))
    case <-r.Context().Done():
        return
    }
}

The handler goroutine writes to w; the inner goroutine just produces a string. No race.

Request bodies and concurrency¶

r.Body is an io.ReadCloser. Reading it follows the standard io.Reader rules:

Read until EOF.
Read may block waiting for more bytes.
Read returns errors if connection drops.

Key concurrency rule: r.Body is owned by the handler's goroutine. Don't read it from another goroutine, and don't read it after the handler returns.

Reading the body¶

func upload(w http.ResponseWriter, r *http.Request) {
    body, err := io.ReadAll(r.Body)
    if err != nil { http.Error(w, err.Error(), 400); return }
    fmt.Fprintf(w, "got %d bytes\n", len(body))
}

Simple. The ReadAll blocks until EOF or error. The handler goroutine blocks too — that's fine, other goroutines aren't affected.

Closing the body¶

You don't need to close r.Body in handlers. The server does it for you after the handler returns. Closing early can be useful to release the connection sooner for keep-alive:

func handler(w http.ResponseWriter, r *http.Request) {
    body, _ := io.ReadAll(r.Body)
    r.Body.Close() // optional: signals "I'm done with the body"
    // ...
}

But not closing is also fine.

Large bodies¶

io.ReadAll(r.Body) allocates a slice as large as the body. For a 10 GiB upload, that's 10 GiB of memory. Dangerous.

Cap the body size with http.MaxBytesReader:

r.Body = http.MaxBytesReader(w, r.Body, 1<<20) // cap at 1 MiB
body, err := io.ReadAll(r.Body)
if err != nil {
    var tooBig *http.MaxBytesError // Go 1.19+
    if errors.As(err, &tooBig) {
        http.Error(w, "body too large", http.StatusRequestEntityTooLarge)
    } else {
        http.Error(w, "cannot read body", http.StatusBadRequest)
    }
    return
}

Or stream-decode (especially for JSON):

var req MyRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
    http.Error(w, err.Error(), 400)
    return
}

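json.Decoder reads from the body incrementally, but Decode still buffers one complete JSON value in memory before decoding it. A body that is a single large JSON document is therefore held in full, so combine the decoder with http.MaxBytesReader.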
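Capping and stream-decoding can be combined. The sketch below is one way to do it, under stated assumptions: greetRequest is a made-up payload, the 64-byte limit is deliberately tiny so the demo can exceed it, and the handler is exercised in-process with httptest rather than over the network. It distinguishes an oversized body from malformed JSON using errors.As.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// greetRequest is a hypothetical payload used only for this example.
type greetRequest struct {
	Name string `json:"name"`
}

const maxBody = 64 // bytes; deliberately tiny so the demo can exceed it

func greet(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, maxBody)
	var req greetRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		var tooBig *http.MaxBytesError // Go 1.19+
		if errors.As(err, &tooBig) {
			http.Error(w, "body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "hello, %s\n", req.Name)
}

// statusFor calls the handler in-process with the given body and returns
// the response status code.
func statusFor(body string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(body))
	greet(rec, req)
	return rec.Code
}

// nameOfLength builds a valid JSON body whose name field is n bytes long.
func nameOfLength(n int) string {
	return `{"name":"` + strings.Repeat("a", n) + `"}`
}

func main() {
	fmt.Println(statusFor(nameOfLength(5)))   // small valid body: 200
	fmt.Println(statusFor(nameOfLength(500))) // exceeds maxBody: 413
	fmt.Println(statusFor("not json"))        // malformed: 400
}
```

Note that the cap only triggers if the decoder actually needs to read past the limit; trailing bytes after a complete JSON value are not read by a single Decode call.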

Reading body from multiple goroutines¶

Don't. r.Body is not safe for concurrent reads. Even if you think you're tagging reads with offsets — the underlying stream is sequential.

If you need to fan out the body to multiple consumers, read it once into memory (with a size cap), then share the byte slice and treat it as read-only from there on:

body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
if err != nil {
    http.Error(w, "cannot read body", http.StatusBadRequest)
    return
}
// Go does not enforce immutability: sharing body is safe only while no goroutine modifies it.
go process1(body)
go process2(body)

r.Context() and cancellation¶

r.Context() returns a context.Context for the request. It is cancelled when any of:

The client closes its connection.
(HTTP/2) The client sends RST_STREAM.
The handler returns.

So:

- During the handler call, r.Context() is alive (not cancelled) unless the client has disconnected.
- After the handler returns, r.Context() is always cancelled.
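Both statements can be observed with a throwaway server. In this sketch the handler hands its context to the caller purely so it can be inspected afterwards (not something to do in real code); observeContext and the one-second wait are assumptions made for the demo. The wait is there because the server cancels the context shortly after the handler returns, which may be a moment after the client has already read the response.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// observation is what the handler reports back to the caller.
type observation struct {
	ctx  context.Context
	live bool // was the context still un-cancelled inside the handler?
}

// observeContext serves one request and reports whether the request context
// was live inside the handler and whether it was cancelled within a second
// of the handler returning.
func observeContext() (liveInside, cancelledAfter bool) {
	seen := make(chan observation, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- observation{ctx: r.Context(), live: r.Context().Err() == nil}
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		return false, false
	}
	resp.Body.Close()

	obs := <-seen
	select {
	case <-obs.ctx.Done():
		return obs.live, true
	case <-time.After(time.Second):
		return obs.live, false
	}
}

func main() {
	live, cancelled := observeContext()
	fmt.Println("live inside handler:", live)
	fmt.Println("cancelled after return:", cancelled)
}
```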

Why this matters¶

If your handler does long-running work — calling another service, querying a database, computing — and the client gives up, you want to abandon that work too. r.Context().Done() is how you find out.

Propagating to downstream calls¶

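func proxy(w http.ResponseWriter, r *http.Request) {
    req, _ := http.NewRequestWithContext(r.Context(), "GET", "http://upstream/data", nil)
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        // Includes the case where r.Context() was cancelled mid-call.
        http.Error(w, "upstream error", http.StatusBadGateway)
        return // resp is nil here, so do not touch resp.Body
    }
    defer resp.Body.Close()
    io.Copy(w, resp.Body)
}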

If the client disconnects, r.Context() is cancelled. The downstream req inherits cancellation; the HTTP client aborts. The handler returns quickly.

Same for database calls — use *Context variants:

rows, err := db.QueryContext(r.Context(), "SELECT ...")

Selecting on Done()¶

For your own goroutines or channels:

func handler(w http.ResponseWriter, r *http.Request) {
    work := make(chan result, 1)
    go func() {
        work <- doWork()
    }()
    select {
    case res := <-work:
        // write res
    case <-r.Context().Done():
        return // client disconnected
    }
}

Ignoring r.Context() leads to leaks¶

func leaky(w http.ResponseWriter, r *http.Request) {
    body, _ := io.ReadAll(r.Body)
    // ignore r.Context() entirely
    time.Sleep(time.Minute) // simulates slow work
    w.Write(body)
}

If the client disconnects, the server's read on r.Body may error (because the conn is closed), but time.Sleep(time.Minute) runs to completion. The goroutine is alive for a full minute, holding memory. With many disconnecting clients, this leaks.

Fix:

select {
case <-time.After(time.Minute):
    w.Write(body)
case <-r.Context().Done():
    return
}

Or use context.WithTimeout(r.Context(), ...) for a hard deadline.
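The context.WithTimeout variant might look like the following sketch. slowWork stands in for a database or upstream call, and withDeadline, statusFor, and the millisecond values are hypothetical choices for the demo. The derived context is cancelled when the deadline passes or when the request context is cancelled, whichever comes first.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// slowWork stands in for a database or upstream call. It returns early if
// ctx is cancelled or its deadline passes.
func slowWork(ctx context.Context, d time.Duration) error {
	select {
	case <-time.After(d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// withDeadline builds a handler that needs workTime to finish but gives up
// after limit, or sooner if the client goes away.
func withDeadline(workTime, limit time.Duration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), limit)
		defer cancel() // always release the timer

		if err := slowWork(ctx, workTime); err != nil {
			if errors.Is(err, context.DeadlineExceeded) {
				http.Error(w, "timed out", http.StatusGatewayTimeout)
			}
			return // on client cancellation nobody is listening, so write nothing
		}
		fmt.Fprintln(w, "done")
	}
}

// statusFor runs the handler in-process and returns the status code.
// Both arguments are in milliseconds.
func statusFor(workMs, limitMs int) int {
	h := withDeadline(time.Duration(workMs)*time.Millisecond, time.Duration(limitMs)*time.Millisecond)
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(10, 1000)) // work fits the budget: 200
	fmt.Println(statusFor(1000, 20)) // budget runs out first: 504
}
```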

Spawning goroutines from handlers¶

Sometimes you need to do something asynchronously: fan out to multiple services, write to a log queue, push a metric. Goroutines from a handler need three things:

They must NOT read r.Body or write to w.
They must observe cancellation (typically via r.Context()).
They must not survive past the time they're needed.

Fan-out pattern¶

func fanout(w http.ResponseWriter, r *http.Request) {
    type result struct{ src, body string }
    out := make(chan result, 3)
    for _, src := range []string{"a", "b", "c"} {
        src := src // capture
        go func() {
            req, _ := http.NewRequestWithContext(r.Context(), "GET", "http://"+src, nil)
            resp, err := http.DefaultClient.Do(req)
            if err != nil { out <- result{src: src, body: "err: "+err.Error()}; return }
            defer resp.Body.Close()
            body, _ := io.ReadAll(resp.Body)
            out <- result{src: src, body: string(body)}
        }()
    }
    for i := 0; i < 3; i++ {
        select {
        case res := <-out: // named res so it does not shadow the request r
            fmt.Fprintf(w, "%s: %s\n", res.src, res.body)
        case <-r.Context().Done():
            return
        }
    }
}

Each fan-out goroutine inherits r.Context() through NewRequestWithContext. If the client disconnects, all three downstream calls are cancelled. The main handler observes <-r.Context().Done() and returns.

Fire-and-forget pattern¶

Sometimes you want a goroutine to outlive the handler, e.g., push a metric to an external system:

func handler(w http.ResponseWriter, r *http.Request) {
    w.Write([]byte("ok"))

    // detach from request context
    go func() {
        ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
        defer cancel()
        pushMetric(ctx, "request_count", 1)
    }()
}

The goroutine uses context.Background (not r.Context()), so the request's cancellation doesn't kill it. It has its own 5-second budget. The handler returns immediately; the goroutine runs in the background.

Caveat. Fire-and-forget goroutines are not waited for by Server.Shutdown. On shutdown they may be killed mid-flight if the process exits before they finish. For mission-critical async work, use a real background-worker pattern with its own shutdown handling.
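One shape such a background-worker pattern could take is a small tracker built on sync.WaitGroup, so that shutdown code can wait for detached tasks. This is a sketch under assumptions: the background type and its Go and Wait methods are invented for illustration, and the sleeping task stands in for something like pushMetric.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// background tracks goroutines that outlive the handler that started them,
// so shutdown code can wait for them.
type background struct {
	wg sync.WaitGroup
}

// Go runs task on a new goroutine with its own timeout, detached from any
// request context.
func (b *background) Go(timeout time.Duration, task func(ctx context.Context)) {
	b.wg.Add(1)
	go func() {
		defer b.wg.Done()
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		defer cancel()
		task(ctx)
	}()
}

// Wait blocks until every started task has returned or ctx is done.
// It reports whether all tasks finished.
func (b *background) Wait(ctx context.Context) bool {
	done := make(chan struct{})
	go func() {
		b.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-ctx.Done():
		return false
	}
}

// runTasks starts n short tasks, waits for them, and returns how many ran
// to completion and whether the wait finished in time.
func runTasks(n int) (int64, bool) {
	var bg background
	var count atomic.Int64
	for i := 0; i < n; i++ {
		bg.Go(time.Second, func(ctx context.Context) {
			time.Sleep(10 * time.Millisecond) // stand-in for real work
			count.Add(1)
		})
	}
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	allDone := bg.Wait(ctx)
	return count.Load(), allDone
}

func main() {
	n, ok := runTasks(5)
	fmt.Println(n, ok) // prints "5 true"
}
```

In a server, a handler would call bg.Go instead of a bare go statement, and main would call bg.Wait after Server.Shutdown returns and before the process exits.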

Race conditions in handlers¶

A race condition in a handler is no different from any other Go race. The race detector (-race) catches it, provided the racing code actually runs while the detector is watching.

Race on a shared variable¶

var lastClient string

func handler(w http.ResponseWriter, r *http.Request) {
    lastClient = r.RemoteAddr // race
    fmt.Fprintln(w, "ok")
}

Two concurrent clients write lastClient at the same time. Race.

Fix: atomic value, mutex, or scope per-request.
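As one illustration of the atomic option, the sketch below stores the address behind an atomic.Pointer[string] (Go 1.19+). The helper names last and hammer are made up for the demo, and the handler is called in-process through httptest, which fills in a fixed placeholder RemoteAddr.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// lastClient holds the most recent remote address. Each Store and Load is a
// single indivisible operation, so concurrent handlers cannot tear it.
var lastClient atomic.Pointer[string]

func handler(w http.ResponseWriter, r *http.Request) {
	addr := r.RemoteAddr // copy into a per-request variable first
	lastClient.Store(&addr)
	fmt.Fprintln(w, "ok")
}

// last returns the most recently stored address, or "" if none yet.
func last() string {
	if p := lastClient.Load(); p != nil {
		return *p
	}
	return ""
}

// hammer calls the handler from n goroutines and returns the final value.
func hammer(n int) string {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			req := httptest.NewRequest(http.MethodGet, "/", nil)
			handler(httptest.NewRecorder(), req)
		}()
	}
	wg.Wait()
	return last()
}

func main() {
	fmt.Println("last client:", hammer(100))
}
```

"Last" here means whichever Store happened to run last; the atomic removes the data race, not the nondeterminism of concurrent arrivals.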

Race on a shared map¶

var visits = map[string]int{}

func handler(w http.ResponseWriter, r *http.Request) {
    visits[r.URL.Path]++ // race
}

Maps are not safe for concurrent writes in Go. This is a data race, and the runtime may abort the whole process with "fatal error: concurrent map writes", which cannot be recovered.

Fix:

var (
    mu     sync.Mutex
    visits = map[string]int{}
)

func handler(w http.ResponseWriter, r *http.Request) {
    mu.Lock()
    visits[r.URL.Path]++
    mu.Unlock()
}

Or sync.Map:

var visits sync.Map

func handler(w http.ResponseWriter, r *http.Request) {
    v, _ := visits.LoadOrStore(r.URL.Path, new(atomic.Int64))
    v.(*atomic.Int64).Add(1)
}

Race on a struct field¶

type Cache struct {
    data map[string]string
}

var cache = &Cache{data: map[string]string{}}

func handler(w http.ResponseWriter, r *http.Request) {
    cache.data[r.URL.Path] = "visited" // race
}

Same problem: shared mutable struct without synchronisation.

Fix: add a sync.RWMutex field to the struct and access the map only through methods:

type Cache struct {
    mu   sync.RWMutex
    data map[string]string
}

func (c *Cache) Set(k, v string) {
    c.mu.Lock()
    c.data[k] = v
    c.mu.Unlock()
}

func (c *Cache) Get(k string) (string, bool) {
    c.mu.RLock()
    v, ok := c.data[k]
    c.mu.RUnlock()
    return v, ok
}

Race on a package-level pointer¶

var logger *log.Logger // package-level

func setLogger(l *log.Logger) {
    logger = l // race if called concurrently with handlers
}

Reassigning a package-level *log.Logger while handlers read it is a data race under the Go memory model, even though the write looks like a single step. Fix:

var logger atomic.Pointer[log.Logger]

func setLogger(l *log.Logger) {
    logger.Store(l)
}

func handler(w http.ResponseWriter, r *http.Request) {
    if l := logger.Load(); l != nil { // nil until setLogger has been called
        l.Println("hit")
    }
}

Or use sync.RWMutex.

Shared state across handlers¶

Most production servers have shared state. Common examples:

State	Concurrency tool
`*sql.DB`	None — internally safe
`*log.Logger`	None — internally safe
Configuration loaded at startup, never written	None — read-only after init
Cache (map of values)	`sync.RWMutex` or `sync.Map`
Counter, gauge	`atomic.Int64`
Rate limiter	`golang.org/x/time/rate.Limiter` — internally safe
Connection to upstream (gRPC, redis)	Usually safe by contract — check docs
Stateful business object	`sync.Mutex` (or rewrite to be immutable)

`*sql.DB` is safe¶

var db *sql.DB

func init() {
    db, _ = sql.Open("postgres", "...")
}

func handler(w http.ResponseWriter, r *http.Request) {
    rows, _ := db.QueryContext(r.Context(), "SELECT 1")
    // ...
}

Many concurrent handlers call db.QueryContext. *sql.DB is a connection pool; safe for concurrent use.

Configuration loaded at startup¶

If your config is loaded once at startup and never mutated:

var cfg Config

func init() {
    cfg = loadConfig()
}

func handler(w http.ResponseWriter, r *http.Request) {
    // read-only access to cfg, no locks needed
    fmt.Fprintf(w, "hello, %s", cfg.Greeting)
}

Read-only sharing is safe without synchronisation, as long as every write happens before any handler can read the value and nothing writes to it afterwards. init() runs before main() and before any handler, so this is fine.

Hot reload of config¶

If you want to swap the config at runtime:

var cfg atomic.Pointer[Config]

func init() {
    cfg.Store(loadConfig()) // in this variant loadConfig returns *Config
}

func reload() {
    cfg.Store(loadConfig())
}

func handler(w http.ResponseWriter, r *http.Request) {
    c := cfg.Load()
    fmt.Fprintf(w, "hello, %s", c.Greeting)
}

atomic.Pointer makes the pointer write/read atomic; readers always see a complete *Config.

The race detector¶

go run -race or go test -race instruments memory accesses. It reports any unsynchronised concurrent read/write to the same memory location.

Running¶

go run -race main.go

Expect roughly 2-20x slower execution and 5-10x more memory. Use it during development, testing, and CI, not in production builds.

Sample output¶

==================
WARNING: DATA RACE
Write at 0x00c0000a8050 by goroutine 8:
  main.handler()
      /app/main.go:15 +0x4a
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2136 +0x47
...
Previous write at 0x00c0000a8050 by goroutine 7:
  main.handler()
      /app/main.go:15 +0x4a
...

It tells you the address, the writing goroutines, and the lines of code involved.

When the race detector misses things¶

Untaken code paths — the race only triggers if both racing accesses actually execute.
Subtle timing — sometimes a race is hard to reproduce; run tests many times.
Outside-Go code (cgo) — not instrumented.

So passing -race doesn't prove absence of races. But every race the detector catches is a real bug.

Running tests under -race¶

In your CI:

go test -race ./...

This should be a standard part of every Go project's pipeline.

Graceful shutdown for beginners¶

http.ListenAndServe blocks forever; how do you stop the server?

`Server.Shutdown(ctx)`¶

Use *http.Server directly:

package main

import (
    "context"
    "log"
    "net/http"
    "os"
    "os/signal"
    "syscall"
    "time"
)

func main() {
    srv := &http.Server{
        Addr:    ":8080",
        Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            time.Sleep(2 * time.Second)
            w.Write([]byte("ok"))
        }),
    }

    go func() {
        if err := srv.ListenAndServe(); err != http.ErrServerClosed {
            log.Fatal(err)
        }
    }()

    // wait for signal
    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
    <-sigCh

    log.Println("shutting down...")
    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()
    if err := srv.Shutdown(ctx); err != nil {
        log.Println("shutdown error:", err)
    }
}

What happens on Ctrl-C:

1. signal.Notify writes to sigCh.
2. <-sigCh returns.
3. srv.Shutdown(ctx) is called.
4. Server closes the listener (no new accepts).
5. Idle connections close immediately.
6. Active connections finish their current request, then close.
7. When all in-flight requests are done, Shutdown returns nil.
8. If 10 seconds pass first, Shutdown returns context.DeadlineExceeded.
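
The "active connections finish their current request" step can be observed in a small self-contained sketch. It assumes a loopback listener on a free port and a hypothetical shutdownDemo helper. A channel, rather than a signal, tells the main flow when the slow request is in flight.

```go
package main

import (
    "context"
    "fmt"
    "io"
    "net"
    "net/http"
    "time"
)

// shutdownDemo starts a server, puts one slow request in flight, calls
// Shutdown, and reports what the client and Shutdown each observed.
func shutdownDemo() (body string, shutdownErr error) {
    entered := make(chan struct{})
    srv := &http.Server{
        Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            close(entered)                     // the request is now in flight
            time.Sleep(200 * time.Millisecond) // simulate slow work
            io.WriteString(w, "ok")
        }),
    }

    ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0 picks any free port
    if err != nil {
        return "", err
    }
    go srv.Serve(ln) // returns http.ErrServerClosed once Shutdown runs

    got := make(chan string, 1)
    go func() {
        resp, err := http.Get("http://" + ln.Addr().String() + "/")
        if err != nil {
            got <- "client error: " + err.Error()
            return
        }
        defer resp.Body.Close()
        b, _ := io.ReadAll(resp.Body)
        got <- string(b)
    }()

    <-entered // wait until the handler is running
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    shutdownErr = srv.Shutdown(ctx) // blocks until the slow request finishes
    return <-got, shutdownErr
}

func main() {
    body, err := shutdownDemo()
    fmt.Printf("client got %q, Shutdown returned %v\n", body, err)
}
```

The client still receives its full response even though Shutdown was called while the handler was sleeping. That is the difference between Shutdown and Close.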

Pitfall: `ListenAndServe` error¶

When Shutdown runs, ListenAndServe returns http.ErrServerClosed. That's not a real error; check for it:

if err := srv.ListenAndServe(); err != http.ErrServerClosed {
    log.Fatal(err)
}

Otherwise normal shutdown looks like a crash.

`Server.Close()` for emergencies¶

If Shutdown hangs (e.g., one handler blocking forever), call Close() to forcibly close all connections:

if err := srv.Shutdown(ctx); err != nil {
    log.Println("forced close")
    srv.Close()
}

Server timeouts in plain terms¶

Server timeouts control how long the server is willing to wait at various phases. Defaults are 0 (no timeout), which is dangerous in production.

Field	Plain English
`ReadHeaderTimeout`	How long the client gets to send the request headers.
`ReadTimeout`	How long the client gets to send the entire request (headers + body).
`WriteTimeout`	How long the server has to send the response.
`IdleTimeout`	How long to keep an idle keep-alive connection open.

Why timeouts exist¶

Without timeouts:

- A slow attacker can keep thousands of connections open, sending headers one byte at a time, exhausting your server's file descriptors and goroutines (slowloris attack).
- A misbehaving client can stall while reading a response and never finish, holding the connection forever.

Reasonable defaults¶

srv := &http.Server{
    Addr:              ":8080",
    ReadHeaderTimeout: 5 * time.Second,
    ReadTimeout:       30 * time.Second,
    WriteTimeout:      30 * time.Second,
    IdleTimeout:       60 * time.Second,
    Handler:           mux,
}

For upload-heavy services, raise ReadTimeout or set it to 0 (no overall deadline for reading the request) but keep ReadHeaderTimeout short.

For long-poll/SSE, leave WriteTimeout at 0 and use r.Context().Done() inside the handler.

Mental models¶

Model 1 — The receptionist and the workers¶

The accept loop is a receptionist who greets every arriving guest and assigns each one a personal assistant (a goroutine). The assistant escorts the guest through the entire interaction: reads what they want, calls the appropriate department, brings back the result, says goodbye, escorts them out.

If a guest stays around for more business (keep-alive), the same assistant serves them again.

Many guests in the lobby = many assistants = many goroutines.

Model 2 — Email auto-responder¶

Imagine an inbox that auto-spawns a process per incoming email. Each process reads the email, replies, exits. Concurrent emails = concurrent processes. They don't see each other unless they touch a shared file (= shared state needing locks).

Model 3 — Restaurant kitchen¶

Many cooks (goroutines) take orders (requests) and prepare them. The kitchen (the server) has shared equipment (state). The cooks must coordinate use of the equipment (locks). Without coordination, two cooks reach for the same knife at the same time (race condition).

Model 4 — Mailbox per request¶

r.Context() is a mailbox the server can send a "cancel" message to. If your handler ignores the mailbox, you keep working after the client has hung up. Polite handlers check the mailbox periodically (via select with <-r.Context().Done()).

Coding patterns¶

Pattern 1 — Use `http.NewRequestWithContext` for upstream¶

req, err := http.NewRequestWithContext(r.Context(), "GET", url, nil)
if err != nil { ... }
resp, err := http.DefaultClient.Do(req)

Propagates cancellation.

Pattern 2 — Buffered channel for goroutine results¶

out := make(chan result, 1) // size 1 so producer never blocks
go func() { out <- doWork() }()
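select {
case res := <-out:
    writeResult(w, res) // hypothetical helper: use the result here
case <-r.Context().Done():
    return // client is gone; the buffered send above still completes
}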

Avoids leaks if the producer outlasts the consumer.

Pattern 3 — Always set timeouts¶

srv := &http.Server{
    ReadHeaderTimeout: 5 * time.Second,
    WriteTimeout:      30 * time.Second,
    IdleTimeout:       60 * time.Second,
    Handler: mux,
}

Never use the package-level http.ListenAndServe in production: it gives you no way to set these timeouts.

Pattern 4 — Recover panics in middleware¶

func recoverer(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        defer func() {
            if err := recover(); err != nil {
                log.Printf("panic: %v\n%s", err, debug.Stack())
                http.Error(w, "internal error", 500)
            }
        }()
        next.ServeHTTP(w, r)
    })
}

The stdlib has its own panic recover at the conn level, but middleware lets you choose the response.
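
To see the middleware choose the response, the following sketch wraps a deliberately panicking handler and inspects the status with httptest. The recoverer here is a trimmed copy of the one above (stack trace omitted). The boom handler and the panicStatus helper are hypothetical names used only for this illustration.

```go
package main

import (
    "fmt"
    "log"
    "net/http"
    "net/http/httptest"
)

// recoverer is a trimmed copy of the middleware above.
func recoverer(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        defer func() {
            if err := recover(); err != nil {
                log.Printf("panic: %v", err)
                http.Error(w, "internal error", http.StatusInternalServerError)
            }
        }()
        next.ServeHTTP(w, r)
    })
}

// boom is a hypothetical handler that always panics.
func boom(w http.ResponseWriter, r *http.Request) {
    panic("something broke")
}

// panicStatus runs the wrapped handler once against a fake request and
// returns the status code the client would see.
func panicStatus() int {
    rec := httptest.NewRecorder()
    recoverer(http.HandlerFunc(boom)).ServeHTTP(rec, httptest.NewRequest("GET", "/", nil))
    return rec.Code
}

func main() {
    fmt.Println(panicStatus()) // prints 500
}
```

Note that this only produces a clean 500 if the handler panicked before writing anything. Once part of the response has been sent, the status can no longer be changed.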

Pattern 5 — Sequential reads of request data¶

func handler(w http.ResponseWriter, r *http.Request) {
    body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20))
    if err != nil { http.Error(w, err.Error(), 400); return }
    _ = body // process body here, then write the response
    w.Write([]byte("ok"))
}

Read body in handler goroutine, never in spawned goroutines.

Clean code¶

Some clean-code rules specific to HTTP handlers.

Use named types for request/response shapes¶

Don't unpack JSON into map[string]any. Use struct types:

type CreateUserRequest struct {
    Name  string `json:"name"`
    Email string `json:"email"`
}

type CreateUserResponse struct {
    ID string `json:"id"`
}

func createUser(w http.ResponseWriter, r *http.Request) {
    var req CreateUserRequest
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
        http.Error(w, err.Error(), 400)
        return
    }
    // ...
    json.NewEncoder(w).Encode(CreateUserResponse{ID: id})
}

Separate routing from logic¶

// router.go
mux := http.NewServeMux()
mux.HandleFunc("/users", users.Create)
mux.HandleFunc("/users/", users.Get)

// users/handler.go
func Create(w http.ResponseWriter, r *http.Request) { ... }
func Get(w http.ResponseWriter, r *http.Request) { ... }

Reuse logic, not handlers¶

Don't share state through global handlers; share through services.

type UserService struct {
    db *sql.DB
}

func (s *UserService) Create(w http.ResponseWriter, r *http.Request) {
    // use s.db
}

func main() {
    db := openDB()
    us := &UserService{db: db}
    mux := http.NewServeMux()
    mux.HandleFunc("/users", us.Create)
}

Now UserService is testable in isolation.

Avoid `http.DefaultServeMux`¶

It's package-level mutable state. Multiple init() functions can register conflicting handlers. Use your own *http.ServeMux:

mux := http.NewServeMux()
mux.HandleFunc(...)
srv := &http.Server{Handler: mux}

Error handling¶

In handlers, errors fall into:

Client errors (4xx). Return with http.Error(w, msg, status) and stop.
Server errors (5xx). Log internally; return a generic message to the client.
Cancellation. r.Context().Err() is context.Canceled or context.DeadlineExceeded; usually just return without writing.

func handler(w http.ResponseWriter, r *http.Request) {
    if err := validate(r); err != nil {
        http.Error(w, err.Error(), 400)
        return
    }
    result, err := doWork(r.Context())
    if err != nil {
        if errors.Is(err, context.Canceled) {
            return // client gave up
        }
        log.Printf("doWork: %v", err)
        http.Error(w, "internal error", 500)
        return
    }
    json.NewEncoder(w).Encode(result)
}

Don't leak internal errors¶

Avoid:

http.Error(w, err.Error(), 500)  // leaks db connection strings, file paths, etc.

Better:

log.Printf("internal: %v", err)
http.Error(w, "internal error", 500)

Headers must be set before WriteHeader¶

w.Header().Set("Content-Type", "application/json")
w.Header().Set("X-Request-ID", reqID)
w.WriteHeader(http.StatusOK) // headers locked here
json.NewEncoder(w).Encode(result)

Once WriteHeader is called (explicitly or implicitly by the first Write), the headers are sent and can't be changed.
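
A small experiment with httptest.NewRecorder makes the rule visible. The header names X-Early and X-Late and the sentHeaders helper are made up for this sketch.

```go
package main

import (
    "fmt"
    "net/http"
    "net/http/httptest"
)

func handler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("X-Early", "yes") // set before WriteHeader: sent
    w.WriteHeader(http.StatusOK)     // headers are locked from here on
    w.Header().Set("X-Late", "yes")  // set after WriteHeader: not sent
    fmt.Fprintln(w, "ok")
}

// sentHeaders returns the X-Early and X-Late values found in the response.
func sentHeaders() (early, late string) {
    rec := httptest.NewRecorder()
    handler(rec, httptest.NewRequest("GET", "/", nil))
    res := rec.Result()
    return res.Header.Get("X-Early"), res.Header.Get("X-Late")
}

func main() {
    early, late := sentHeaders()
    fmt.Printf("X-Early=%q X-Late=%q\n", early, late) // X-Early="yes" X-Late=""
}
```

The late header is silently dropped. There is no error to check, which is why this mistake tends to go unnoticed.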

Security considerations¶

Slowloris¶

Attacker opens many connections, sends headers one byte per second. Without ReadHeaderTimeout, all those connections stay open. Set it:

ReadHeaderTimeout: 5 * time.Second,

Body size¶

Without limits, an attacker can POST 100 GiB. Use http.MaxBytesReader:

r.Body = http.MaxBytesReader(w, r.Body, 10<<20) // 10 MiB
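
One way to turn an exceeded limit into a clear response is to check for *http.MaxBytesError, which exists in Go 1.19 and later. The sketch below assumes a deliberately tiny 16-byte limit so the behaviour is easy to trigger. The upload handler and statusForBody helper are hypothetical.

```go
package main

import (
    "errors"
    "fmt"
    "io"
    "net/http"
    "net/http/httptest"
    "strings"
)

const maxBody = 16 // bytes; deliberately tiny for the demonstration

func upload(w http.ResponseWriter, r *http.Request) {
    r.Body = http.MaxBytesReader(w, r.Body, maxBody)
    body, err := io.ReadAll(r.Body)
    if err != nil {
        var tooBig *http.MaxBytesError
        if errors.As(err, &tooBig) { // the client sent more than maxBody
            http.Error(w, "body too large", http.StatusRequestEntityTooLarge)
            return
        }
        http.Error(w, "bad request", http.StatusBadRequest)
        return
    }
    fmt.Fprintf(w, "read %d bytes\n", len(body))
}

// statusForBody posts n bytes to upload and returns the status code.
func statusForBody(n int) int {
    rec := httptest.NewRecorder()
    req := httptest.NewRequest("POST", "/", strings.NewReader(strings.Repeat("x", n)))
    upload(rec, req)
    return rec.Code
}

func main() {
    fmt.Println(statusForBody(8), statusForBody(64)) // prints 200 413
}
```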

Header size¶

MaxHeaderBytes defaults to 1 MiB. Tighten for stricter limits:

srv.MaxHeaderBytes = 16 << 10 // 16 KiB

Connection count¶

By default, unlimited. Cap it by wrapping the listener with golang.org/x/net/netutil.LimitListener:

ln = netutil.LimitListener(ln, 10000)

TLS¶

Use srv.ListenAndServeTLS(certFile, keyFile) with TLS 1.2+:

srv.TLSConfig = &tls.Config{
    MinVersion: tls.VersionTLS12,
}

Don't trust client headers¶

X-Forwarded-For, User-Agent, etc. are attacker-controlled. Validate, sanitize, never reflect into HTML without escaping.
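
For the "never reflect into HTML without escaping" part, a minimal sketch using the standard html package might look like this. The greeting helper and the name query parameter are assumptions for illustration. For full pages, html/template is usually the better tool.

```go
package main

import (
    "fmt"
    "html"
    "net/http"
)

// greeting builds an HTML fragment from untrusted input, escaping it first.
func greeting(name string) string {
    return "<p>hello, " + html.EscapeString(name) + "</p>"
}

func handler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "text/html; charset=utf-8")
    fmt.Fprint(w, greeting(r.URL.Query().Get("name")))
}

func main() {
    // The markup arrives as text, not as a live script tag.
    fmt.Println(greeting("<script>alert(1)</script>"))
}
```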

Performance tips¶

These are introductory; optimize.md has the deep dive.

Reuse buffers with `sync.Pool`¶

For hot-path allocations:

var pool = sync.Pool{New: func() any { return new(bytes.Buffer) }}

func handler(w http.ResponseWriter, r *http.Request) {
    buf := pool.Get().(*bytes.Buffer)
    buf.Reset()
    defer pool.Put(buf)
    fmt.Fprintf(buf, "hello %s", r.URL.Query().Get("name"))
    w.Write(buf.Bytes())
}

Avoid `fmt.Sprintf` in hot paths¶

// slow
msg := fmt.Sprintf("hello %s", name)

// faster
msg := "hello " + name

For more complex formatting, use strconv.AppendInt/AppendFloat to append into a reused []byte instead of allocating intermediate strings.
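
A minimal sketch of the append style, assuming a hypothetical appendMetric helper and a "name=value" line format chosen only for this example:

```go
package main

import (
    "fmt"
    "strconv"
)

// appendMetric appends "name=value\n" to dst and returns the extended slice.
// No intermediate string is built for the number.
func appendMetric(dst []byte, name string, value int64) []byte {
    dst = append(dst, name...)
    dst = append(dst, '=')
    dst = strconv.AppendInt(dst, value, 10)
    return append(dst, '\n')
}

func main() {
    buf := make([]byte, 0, 64) // capacity reused across both appends
    buf = appendMetric(buf, "hits", 42)
    buf = appendMetric(buf, "errors", 3)
    fmt.Print(string(buf))
}
```

Whether this is worth the loss in readability depends on the profile. Measure before replacing fmt calls wholesale.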

Don't `ReadAll` if you can stream¶

// allocates whole body
body, _ := io.ReadAll(r.Body)
json.Unmarshal(body, &req)

// streams
json.NewDecoder(r.Body).Decode(&req)

Compress responses¶

import "compress/gzip"

func handler(w http.ResponseWriter, r *http.Request) {
    if strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
        w.Header().Set("Content-Encoding", "gzip")
        gz := gzip.NewWriter(w)
        defer gz.Close()
        gz.Write(largeBody)
        return
    }
    w.Write(largeBody)
}

Best practices¶

Always use *http.Server, never http.ListenAndServe directly for production.
Set ReadHeaderTimeout, ReadTimeout, WriteTimeout, IdleTimeout.
Pass r.Context() to every downstream call.
Don't share http.ResponseWriter or r.Body with other goroutines.
Handle http.ErrServerClosed as the success case.
Recover panics in middleware.
Use -race in tests.
Set MaxBytesReader on every body read.
Don't expose net/http/pprof on a public port.
Use structured logging (log/slog).

Edge cases and pitfalls¶

"It works on localhost but breaks in production"¶

Localhost has zero latency, no packet loss. Production has both. Symptoms:

- Race conditions surface (timing changes).
- Timeouts fire (server slower under real network).
- Goroutine leaks (slow clients pin goroutines longer).

Test with artificial latency (tc qdisc on Linux, Toxiproxy) before deploying.

"My response is truncated"¶

Likely causes:

- WriteTimeout fired before the response finished.
- Handler panicked and the recovery sent partial output.
- Client closed the connection mid-response.
- You called WriteHeader twice (second call ignored, but logged).

"Some requests are very slow randomly"¶

Could be:

- GC pauses (large heap, frequent allocation).
- Lock contention on a shared resource.
- Connection limit reached (kernel SYN queue overflow).
- Database slowness with no query timeout.

pprof and runtime/trace are the diagnostic tools.

"Goroutines keep climbing"¶

Classic leak. See find-bug.md and professional.md for diagnosis.

"I get `EOF` reading the body sometimes"¶

Client disconnected before sending the full body. Check r.Context().Err() to confirm.

Common mistakes¶

Spawning a goroutine to write to w. Race + use-after-return.
Using context.Background() for upstream calls inside handlers. Loses cancellation.
Not setting timeouts on *http.Server. Slowloris vulnerability.
Calling WriteHeader after Write. First Write implicitly calls WriteHeader(200); later explicit call is ignored.
Reading r.Body after handler returns. The server closes the body once the handler returns, so later reads fail or race with connection reuse.
fmt.Sprintf in hot paths. Allocates; slow at scale.
Mutating r from handlers. Don't; pass derived data instead.
Closing w or r.Body and continuing to use them. Don't.

Common misconceptions¶

"Handlers run on a thread pool."

No. Each connection gets a fresh goroutine. Goroutines are not pooled by net/http.

"r.Context() cancels when the handler returns."

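True but incomplete. It is also cancelled when the client's connection closes or an HTTP/2 request is cancelled mid-handler, which is the case that matters while your handler is still running.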

"Server.Shutdown returns when the listener closes."

No. It returns when the listener is closed AND all in-flight requests have finished AND all idle conns are closed, or when its ctx expires first, in which case it returns the context's error.

"I can write to w from any goroutine as long as I take a mutex."

Technically you can, but it's still a use-after-return bug if your goroutine writes after the handler returns. Don't do it.

"HTTP/1.1 keep-alive sends requests in parallel."

No — keep-alive is sequential request/response pairs on one connection. Parallelism comes from multiple connections or HTTP/2.

"http.ListenAndServe is single-threaded."

No — it serves concurrent connections concurrently. The accept loop is one goroutine, each connection is another, and so on.

Tricky points¶

The body is consumed when you don't read it¶

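If your handler doesn't fully read r.Body, the server reads and discards remaining bytes after the handler returns (so the next request on the connection can be parsed). The amount it will drain is bounded (256 KiB in current Go releases); if more remains, it closes the connection instead of reusing it. Draining can still take noticeable time on a slow client.

If you don't want this, wrap the body with http.MaxBytesReader early so that reads past the limit fail fast, or set the Connection: close response header so the connection is not reused.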

`Flusher` and chunked transfer¶

If you want to stream a response (e.g., SSE), use Flusher:

func sse(w http.ResponseWriter, r *http.Request) {
    fl, ok := w.(http.Flusher)
    if !ok { http.Error(w, "no streaming", 500); return }
    w.Header().Set("Content-Type", "text/event-stream")
    for i := 0; i < 10; i++ {
        fmt.Fprintf(w, "data: %d\n\n", i)
        fl.Flush() // pushes to client immediately
        time.Sleep(time.Second)
        if r.Context().Err() != nil { return }
    }
}

Without Flush(), the response sits in the server's write buffer until the buffer fills or the handler returns.

`Connection: Upgrade` doesn't go through your handler in HTTP/2¶

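WebSockets over net/http rely on the HTTP/1.1 Upgrade mechanism and Hijack(). HTTP/2 has no Connection: Upgrade, and the HTTP/2 ResponseWriter does not implement http.Hijacker, so the type assertion w.(http.Hijacker) fails. If you call http.NewResponseController(w).Hijack() instead, you get an error matching http.ErrNotSupported.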

`http.HandlerFunc(fn)` is just a type cast¶

type HandlerFunc func(ResponseWriter, *Request)
func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request) { f(w, r) }

No goroutines involved. It's just a way to make a function value satisfy http.Handler.

Test¶

Write a simple test for a handler:

package main

import (
    "io"
    "net/http"
    "net/http/httptest"
    "testing"
)

func TestHandler(t *testing.T) {
    req := httptest.NewRequest("GET", "/?name=Bob", nil)
    rec := httptest.NewRecorder()
    handler(rec, req)
    resp := rec.Result()
    body, _ := io.ReadAll(resp.Body)
    if string(body) != "hello, Bob\n" {
        t.Errorf("got %q", body)
    }
}

httptest.NewRecorder is a fake http.ResponseWriter. httptest.NewRequest is a fake *http.Request. No goroutines or sockets needed.

For end-to-end:

func TestServer(t *testing.T) {
    srv := httptest.NewServer(http.HandlerFunc(handler))
    defer srv.Close()

    resp, err := http.Get(srv.URL + "/?name=Bob")
    if err != nil {
        t.Fatal(err) // avoid a nil dereference on resp below
    }
    body, _ := io.ReadAll(resp.Body)
    resp.Body.Close()
    if string(body) != "hello, Bob\n" {
        t.Errorf("got %q", body)
    }
}

httptest.NewServer starts a real server on a random port. srv.Close() shuts it down.

Run with -race:

go test -race ./...
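
The race detector only sees accesses that actually overlap, so it helps to drive the handler from several goroutines. The sketch below is one possible shape for such a driver, written as a standalone program rather than a _test.go file. The hitConcurrently and run helpers and the counter handler are assumptions made for this illustration.

```go
package main

import (
    "fmt"
    "io"
    "net/http"
    "net/http/httptest"
    "sync"
    "sync/atomic"
)

var hits atomic.Int64 // shared across all handler goroutines

func handler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "hit %d\n", hits.Add(1))
}

// hitConcurrently sends n GET requests at once and returns how many
// came back with status 200.
func hitConcurrently(url string, n int) int64 {
    var ok atomic.Int64
    var wg sync.WaitGroup
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            resp, err := http.Get(url)
            if err != nil {
                return
            }
            defer resp.Body.Close()
            io.Copy(io.Discard, resp.Body)
            if resp.StatusCode == http.StatusOK {
                ok.Add(1)
            }
        }()
    }
    wg.Wait()
    return ok.Load()
}

// run starts a real test server, fires n concurrent requests, and returns
// the number of successful responses and the number of handler calls.
func run(n int) (succeeded, counted int64) {
    before := hits.Load()
    srv := httptest.NewServer(http.HandlerFunc(handler))
    defer srv.Close()
    succeeded = hitConcurrently(srv.URL, n)
    return succeeded, hits.Load() - before
}

func main() {
    s, c := run(20)
    fmt.Println("succeeded:", s, "counted:", c)
}
```

With atomic.Int64 this runs clean under -race. Swapping the counter for a plain int64 with hits++ is a quick way to see what a race report looks like.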

Tricky questions¶

Q1. Inside a handler, you call `go someFunc(r.Context())`. The handler then returns. Is the goroutine still alive?¶

A. Yes, the goroutine is alive, but its context is cancelled. someFunc should observe <-ctx.Done() and exit quickly. If it doesn't, it leaks.
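
The answer can be checked with a real test server. In this sketch the observeCancel helper is hypothetical and the 2-second wait is an arbitrary safety margin. The spawned goroutine reports the context error it sees after the handler has already returned.

```go
package main

import (
    "context"
    "fmt"
    "io"
    "net/http"
    "net/http/httptest"
    "time"
)

// observeCancel reports whether a goroutine spawned by the handler saw
// its request context cancelled after the handler returned.
func observeCancel() bool {
    cancelled := make(chan error, 1)
    srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        go func(ctx context.Context) {
            <-ctx.Done() // fires once the handler has returned
            cancelled <- ctx.Err()
        }(r.Context())
        io.WriteString(w, "ok") // the handler returns right away
    }))
    defer srv.Close()

    resp, err := http.Get(srv.URL)
    if err != nil {
        return false
    }
    io.Copy(io.Discard, resp.Body)
    resp.Body.Close()

    select {
    case err := <-cancelled:
        return err == context.Canceled
    case <-time.After(2 * time.Second):
        return false
    }
}

func main() {
    fmt.Println("goroutine saw cancellation:", observeCancel())
}
```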

Q2. You call `w.Write` and then `w.WriteHeader(404)`. What status does the client see?¶

A. 200. The first Write implicitly set status to 200. The later WriteHeader(404) is ignored (and logged).
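
A quick way to convince yourself, using httptest.NewRecorder as the ResponseWriter (the observedStatus helper is a name invented for this sketch):

```go
package main

import (
    "fmt"
    "net/http"
    "net/http/httptest"
)

func handler(w http.ResponseWriter, r *http.Request) {
    w.Write([]byte("hello"))           // implicitly sends status 200
    w.WriteHeader(http.StatusNotFound) // too late: ignored
}

// observedStatus runs handler once and returns the recorded status code.
func observedStatus() int {
    rec := httptest.NewRecorder()
    handler(rec, httptest.NewRequest("GET", "/", nil))
    return rec.Code
}

func main() {
    fmt.Println(observedStatus()) // prints 200
}
```

The recorder simply ignores the second call. A real server additionally logs a "superfluous response.WriteHeader call" message.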

Q3. The server is in `Shutdown` and `Shutdown` is blocking. What causes it to unblock?¶

A. Either:

- All active connections become idle (their handlers return), then they close → Shutdown returns nil.
- The ctx passed to Shutdown is cancelled → returns ctx.Err().

Q4. Why is there no `Server.MaxHandlers` field?¶

A. net/http does not cap how many handlers run at once; it leaves that policy to you. You implement concurrency limits with golang.org/x/net/netutil.LimitListener (connection-level), middleware (request-level), or http2.Server.MaxConcurrentStreams (HTTP/2 streams).
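
The request-level option could be sketched with a buffered channel used as a counting semaphore. The limitInFlight name and the choice to answer 503 rather than queue are assumptions of this example, not something net/http provides.

```go
package main

import (
    "fmt"
    "net/http"
    "net/http/httptest"
)

// limitInFlight lets at most n requests into next at a time and answers
// 503 for the rest.
func limitInFlight(n int, next http.Handler) http.Handler {
    sem := make(chan struct{}, n) // buffered channel as a counting semaphore
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        select {
        case sem <- struct{}{}: // acquired a slot
            defer func() { <-sem }() // release it when the handler returns
            next.ServeHTTP(w, r)
        default: // no slot free
            http.Error(w, "server busy", http.StatusServiceUnavailable)
        }
    })
}

// demo holds one request inside the handler, sends a second one, and
// returns both status codes.
func demo() (first, second int) {
    entered := make(chan struct{})
    release := make(chan struct{})
    h := limitInFlight(1, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        close(entered) // signal that the slot is taken
        <-release      // stay in flight until told to finish
        w.WriteHeader(http.StatusOK)
    }))

    rec1 := httptest.NewRecorder()
    done := make(chan struct{})
    go func() {
        defer close(done)
        h.ServeHTTP(rec1, httptest.NewRequest("GET", "/", nil))
    }()
    <-entered // the first request now holds the only slot

    rec2 := httptest.NewRecorder()
    h.ServeHTTP(rec2, httptest.NewRequest("GET", "/", nil)) // rejected

    close(release)
    <-done
    return rec1.Code, rec2.Code
}

func main() {
    first, second := demo()
    fmt.Println(first, second) // prints 200 503
}
```

Rejecting immediately keeps goroutine counts bounded under overload. Blocking on the semaphore instead would queue requests, which is a different trade-off.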

Q5. Two connections, each running a handler. The handlers share a `*sync.Mutex`. Is this safe?¶

A. Yes, that's exactly what sync.Mutex is for. The two handlers contend on the lock; one waits for the other.

Q6. After `Hijack()`, who reads from the conn?¶

A. Your code. The server is out of the picture.

Q7. What's the difference between `http.Handler` and `http.HandlerFunc`?¶

A. Handler is an interface (ServeHTTP(w, r)). HandlerFunc is a concrete function type that adapts a plain function to satisfy the interface.

Cheat sheet¶

// Build server
srv := &http.Server{
    Addr:              ":8080",
    Handler:           mux,
    ReadHeaderTimeout: 5 * time.Second,
    ReadTimeout:       30 * time.Second,
    WriteTimeout:      30 * time.Second,
    IdleTimeout:       60 * time.Second,
    MaxHeaderBytes:    1 << 16,
}

// Run + graceful shutdown
go func() {
    if err := srv.ListenAndServe(); err != http.ErrServerClosed {
        log.Fatal(err)
    }
}()
sig := make(chan os.Signal, 1)
signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
<-sig
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
srv.Shutdown(ctx)

// Handler with cancellation
func h(w http.ResponseWriter, r *http.Request) {
    req, _ := http.NewRequestWithContext(r.Context(), "GET", url, nil)
    resp, err := http.DefaultClient.Do(req)
    if err != nil { http.Error(w, err.Error(), 502); return }
    defer resp.Body.Close()
    io.Copy(w, resp.Body)
}

// Concurrency-safe shared map
var (
    mu sync.RWMutex
    m  = map[string]string{}
)

// Body size limit
r.Body = http.MaxBytesReader(w, r.Body, 1<<20)

// Race detector
go test -race ./...

Self-assessment checklist¶

Summary¶

The Go net/http server uses a goroutine-per-connection model. Each accepted TCP connection becomes one new goroutine; on HTTP/1.1 that goroutine reads requests, calls your handler, and writes responses, one at a time (on HTTP/2, many streams share one connection and each stream's handler runs in its own goroutine). On HTTP/1.1 your handler runs synchronously inside the connection goroutine; concurrency comes from many such goroutines running in parallel.

Your responsibilities as a handler author:

1. Don't touch http.ResponseWriter from other goroutines or after returning.
2. Propagate r.Context() to anything that takes a context.
3. Don't read r.Body outside the handler call.
4. Protect any state shared across handler instances with sync.Mutex, sync/atomic, or immutability.
5. Set server timeouts so a slow client can't hold a goroutine forever.
6. Handle graceful shutdown via Server.Shutdown.

The race detector (go test -race) is your safety net for concurrent-access bugs. Use it in CI. pprof is your tool for diagnosing leaks. Production servers configure timeouts, structured logging, signal handling, and connection limits.

What you can build¶

After this file you can build:

- A REST API with concurrent clients.
- A graceful-shutdown server with SIGINT handling.
- A handler that fans out to multiple upstream services with cancellation.
- A streaming response handler using Flusher.
- A simple in-memory cache backed by a mutex-protected map.
- A panic-recovery middleware.
- A request-logging middleware.

You should NOT yet attempt:

- A WebSocket server (needs Hijack() knowledge).
- A custom rate limiter at the connection level (needs netutil.LimitListener understanding).
- A high-throughput production server (read middle.md, senior.md, professional.md first).

Diagrams and Visual Aids¶

+------------------+        Accept        +------------------+
|  net.Listener    | -------------------> |   *conn (state)  |
+------------------+                      +---------+--------+
                                                    |
                                                    | go c.serve(ctx)
                                                    v
                                          +--------------------+
                                          | Per-conn goroutine |
                                          |  - reads requests  |
                                          |  - calls handler   |
                                          |  - writes responses|
                                          +--------------------+

Connection lifecycle (HTTP/1.1 with keep-alive)

  Accept -> spawn goroutine -> TLS handshake (if TLS)
                              \
                               +-> readRequest
                               |       |
                               |       v
                               |   buildRequest
                               |       |
                               |       v
                               |   handler.ServeHTTP(w, r)
                               |       |
                               |       v
                               |   finishRequest
                               |       |
                               |       v
                               |   if keepalive: loop
                               |   else: break
                               v
                            close conn -> goroutine ends

Concurrent connections, separate goroutines

   client A -> conn A -> goroutine G_A -> handler running
   client B -> conn B -> goroutine G_B -> handler running    <-- parallel
   client C -> conn C -> goroutine G_C -> handler running
   ...

   shared state needs locks/atomics
   per-handler state is local, no sharing

r.Context() cancellation flow

  client closes conn ----------+
                                \
  HTTP/2 RST_STREAM -----------+--> cancelCtx() called
                                /
  handler returns -------------+
                                \
                                 v
                          r.Context().Done() closed
                                |
                                v
                     downstream calls observe cancellation
                          (db queries, upstream HTTP)

Extended worked examples¶

This section walks through five small but complete programs that illustrate the concurrency points discussed above. Each is runnable and small enough to read in one sitting. If you can write each of these without looking, you have absorbed the chapter.

Example 1 — Counting connections vs counting requests¶

We expose two counters: total accepted connections and total HTTP requests. They differ because of keep-alive.

package main

import (
    "fmt"
    "net"
    "net/http"
    "sync/atomic"
    "time"
)

var (
    connCount atomic.Int64
    reqCount  atomic.Int64
)

type countingListener struct {
    net.Listener
}

func (l *countingListener) Accept() (net.Conn, error) {
    c, err := l.Listener.Accept()
    if err == nil {
        connCount.Add(1)
    }
    return c, err
}

func handler(w http.ResponseWriter, r *http.Request) {
    reqCount.Add(1)
    fmt.Fprintf(w, "conn=%d req=%d\n", connCount.Load(), reqCount.Load())
}

func main() {
    ln, _ := net.Listen("tcp", ":8080")
    ln = &countingListener{Listener: ln}
    srv := &http.Server{
        Handler:           http.HandlerFunc(handler),
        ReadHeaderTimeout: 5 * time.Second,
        IdleTimeout:       30 * time.Second,
    }
    srv.Serve(ln)
}

Test:

# new connection per request
$ for i in 1 2 3; do curl http://localhost:8080/; done
conn=1 req=1
conn=2 req=2
conn=3 req=3

# keep-alive: one connection, many requests
$ curl --keepalive-time 60 http://localhost:8080/ http://localhost:8080/ http://localhost:8080/
conn=4 req=4
conn=4 req=5
conn=4 req=6

The lesson: in HTTP/1.1, request count grows with each request; connection count grows only when a new TCP connection is opened.

Example 2 — A handler that observes client disconnect¶

This handler does slow work and is correctly cancellable.

package main

import (
    "fmt"
    "net/http"
    "time"
)

func slow(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    for i := 0; i < 10; i++ {
        select {
        case <-time.After(time.Second):
            fmt.Fprintf(w, "tick %d\n", i)
            if f, ok := w.(http.Flusher); ok {
                f.Flush()
            }
        case <-ctx.Done():
            // client disconnected; clean up
            fmt.Println("client gave up at tick", i)
            return
        }
    }
    fmt.Fprintln(w, "done")
}

func main() {
    http.HandleFunc("/slow", slow)
    http.ListenAndServe(":8080", nil)
}

Test:

$ curl http://localhost:8080/slow
tick 0
tick 1
...

Press Ctrl-C mid-stream. The server prints "client gave up at tick N" and the handler returns. No leak.

If the handler ignored ctx.Done(), it would keep running for all 10 seconds regardless.

Example 3 — Shared cache with `sync.RWMutex`¶

A simple in-memory cache shared across all handlers.

package main

import (
    "fmt"
    "io"
    "net/http"
    "strings"
    "sync"
)

type Cache struct {
    mu   sync.RWMutex
    data map[string]string
}

func (c *Cache) Get(k string) (string, bool) {
    c.mu.RLock()
    defer c.mu.RUnlock()
    v, ok := c.data[k]
    return v, ok
}

func (c *Cache) Set(k, v string) {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.data[k] = v
}

var cache = &Cache{data: map[string]string{}}

func handler(w http.ResponseWriter, r *http.Request) {
    k := strings.TrimPrefix(r.URL.Path, "/")
    switch r.Method {
    case http.MethodGet:
        if v, ok := cache.Get(k); ok {
            fmt.Fprintln(w, v)
        } else {
            http.NotFound(w, r)
        }
    case http.MethodPut:
        // limit the body to 1 MiB; reads past the limit return an error
        body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
        if err != nil {
            http.Error(w, err.Error(), 400)
            return
        }
        cache.Set(k, string(body))
        w.WriteHeader(http.StatusCreated)
    default:
        http.Error(w, "method not allowed", 405)
    }
}

func main() {
    http.HandleFunc("/", handler)
    http.ListenAndServe(":8080", nil)
}

RWMutex lets many readers proceed in parallel; a writer excludes all readers and all other writers. Under -race, this should be clean.

Example 4 — Fan-out to multiple services with `errgroup`¶

golang.org/x/sync/errgroup is a small helper for "wait for N goroutines, propagate the first error, share cancellation."

package main

import (
    "encoding/json"
    "io"
    "net/http"
    "sync"

    "golang.org/x/sync/errgroup"
)

func fanout(w http.ResponseWriter, r *http.Request) {
    g, ctx := errgroup.WithContext(r.Context())
    var mu sync.Mutex
    results := map[string]string{}

    sources := []string{
        "https://httpbin.org/get",
        "https://httpbin.org/uuid",
        "https://httpbin.org/headers",
    }

    for _, src := range sources {
        src := src
        g.Go(func() error {
            req, err := http.NewRequestWithContext(ctx, http.MethodGet, src, nil)
            if err != nil { return err }
            resp, err := http.DefaultClient.Do(req)
            if err != nil { return err }
            defer resp.Body.Close()
            body, err := io.ReadAll(resp.Body)
            if err != nil { return err }
            mu.Lock()
            results[src] = string(body)
            mu.Unlock()
            return nil
        })
    }

    if err := g.Wait(); err != nil {
        http.Error(w, err.Error(), 502)
        return
    }
    json.NewEncoder(w).Encode(results)
}

func main() {
    http.HandleFunc("/fan", fanout)
    http.ListenAndServe(":8080", nil)
}

The three outbound requests share ctx, which errgroup derives from r.Context(). If any of them fails or the client disconnects, ctx is canceled and the other in-flight requests abort. The mutex protects the result map.

Example 5 — Graceful shutdown with background workers¶

package main

import (
    "context"
    "log"
    "net/http"
    "os"
    "os/signal"
    "sync"
    "syscall"
    "time"
)

type Worker struct {
    ch   chan string
    done chan struct{}
}

func (w *Worker) Run() {
    defer close(w.done)
    for msg := range w.ch {
        log.Println("processing:", msg)
        time.Sleep(time.Second) // simulate work
    }
}

func (w *Worker) Stop() {
    close(w.ch)
    <-w.done
}

func main() {
    worker := &Worker{ch: make(chan string, 100), done: make(chan struct{})}
    go worker.Run()

    srv := &http.Server{
        Addr: ":8080",
        Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            select {
            case worker.ch <- r.URL.Path:
                w.Write([]byte("queued"))
            case <-r.Context().Done():
                return
            }
        }),
    }

    var wg sync.WaitGroup
    wg.Add(1)
    go func() {
        defer wg.Done()
        if err := srv.ListenAndServe(); err != http.ErrServerClosed {
            log.Fatal(err)
        }
    }()

    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
    <-sigCh
    log.Println("shutdown initiated")

    // 1. Stop accepting HTTP.
    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()
    if err := srv.Shutdown(ctx); err != nil {
        log.Println("shutdown:", err)
    }

    // 2. Stop the worker (after all handlers finished enqueueing).
    worker.Stop()

    wg.Wait()
    log.Println("clean exit")
}

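The order is critical:
1. Stop HTTP first so no new items are enqueued.
2. Stop the worker, which drains its queue.

If you stopped the worker first, any still-running handler that tried to enqueue would panic, because a send on a closed channel panics rather than blocks. The same hazard exists if Shutdown returns early because its 10-second context expired: handlers may still be running when worker.Stop() closes the channel.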
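What a send on a closed channel actually does is easy to reproduce in isolation. In this sketch, sendAfterClose is a hypothetical helper (not part of the example) that closes a buffered channel, sends on it, and reports whether the send panicked.

```go
package main

import "fmt"

// sendAfterClose reports whether sending on a closed channel panicked.
func sendAfterClose() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	ch := make(chan string, 1)
	close(ch)
	ch <- "late item" // the runtime panics here: send on closed channel
	return false
}

func main() {
	fmt.Println(sendAfterClose())
}
```

This is why the worker's channel has to stay open until the HTTP server has let every handler finish.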

Concurrency anti-patterns specific to handlers¶

Anti-pattern 1 — `time.AfterFunc` writing to `w`¶

func bad(w http.ResponseWriter, r *http.Request) {
    time.AfterFunc(time.Second, func() {
        w.Write([]byte("late")) // race + use-after-return
    })
    w.Write([]byte("now"))
}

time.AfterFunc runs its function on a goroutine spawned by the runtime. By the time it fires, the handler has returned. Don't write to w from it.
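If the late write is really needed, one option is to wait inside the handler so that w is still valid when the write happens. The sketch below assumes that approach; delayed and demo are hypothetical names, and the delay is a parameter only so the demo runs quickly.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// delayed returns a handler that writes "now", waits for delay while still
// inside the handler, and then writes " late".
func delayed(delay time.Duration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("now"))
		t := time.NewTimer(delay)
		defer t.Stop()
		select {
		case <-t.C:
			w.Write([]byte(" late")) // safe: the handler has not returned yet
		case <-r.Context().Done():
			// the client went away; there is nothing left to write
		}
	}
}

// demo runs the handler against an in-memory recorder and returns the body.
func demo() string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	delayed(10*time.Millisecond)(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(demo())
}
```

The select also watches r.Context(), so a disconnected client does not keep the handler waiting for the timer.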

Anti-pattern 2 — Storing `*Request` in a queue¶

var queue = make(chan *http.Request, 100)

func handler(w http.ResponseWriter, r *http.Request) {
    queue <- r
    w.Write([]byte("queued"))
}

func processQueue() {
    for r := range queue {
        // r.Body was closed when the handler returned.
        _, err := io.ReadAll(r.Body)
        log.Println("read after return:", err) // typically http.ErrBodyReadAfterClose; never reliable
    }
}

After the handler returns, r.Body is no longer valid. Extract what you need inside the handler, then enqueue the extracted data:

type job struct {
    path    string
    body    []byte
    headers http.Header
}

var queue = make(chan job, 100)

func handler(w http.ResponseWriter, r *http.Request) {
    body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
    if err != nil {
        http.Error(w, "cannot read body", http.StatusBadRequest)
        return
    }
    select {
    case queue <- job{path: r.URL.Path, body: body, headers: r.Header.Clone()}:
        w.Write([]byte("queued"))
    case <-r.Context().Done():
        return
    }
}

Anti-pattern 3 — Sleeping while holding a global lock¶

var mu sync.Mutex

func handler(w http.ResponseWriter, r *http.Request) {
    mu.Lock()
    defer mu.Unlock()
    time.Sleep(time.Second) // blocks ALL other handlers
    w.Write([]byte("ok"))
}

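Every other request that reaches this handler waits on mu for the full sleep of each request ahead of it, so the handler effectively serves one request per second. Hold the lock only for the critical section.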
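A sketch of the narrower critical section: take the lock, update or copy the shared value, release the lock, and only then do the slow work. The hits counter and the demo helper are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

var (
	mu   sync.Mutex
	hits int
)

func handler(w http.ResponseWriter, r *http.Request) {
	mu.Lock()
	hits++
	n := hits // copy what we need while holding the lock
	mu.Unlock()

	time.Sleep(20 * time.Millisecond) // slow work runs without the lock
	fmt.Fprintf(w, "hit %d", n)
}

// demo resets the counter, calls the handler from n goroutines, and returns
// the final count.
func demo(n int) int {
	mu.Lock()
	hits = 0
	mu.Unlock()

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			handler(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
		}()
	}
	wg.Wait()

	mu.Lock()
	defer mu.Unlock()
	return hits
}

func main() {
	fmt.Println(demo(10))
}
```

The sleeps overlap because no goroutine holds mu while sleeping, yet every increment is still counted.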

Anti-pattern 4 — Calling `http.Get` from a handler without context¶

func proxy(w http.ResponseWriter, r *http.Request) {
    resp, _ := http.Get("http://upstream/data") // uses context.Background()
    defer resp.Body.Close()
    io.Copy(w, resp.Body)
}

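http.Get uses http.DefaultClient, which has no timeout, and the outbound request carries no cancelable context. If upstream hangs, this handler hangs with it, even after the client has disconnected. Build the request with http.NewRequestWithContext(r.Context(), ...) instead.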
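A sketch of the context-aware version follows. The newProxy constructor, the demo helper, and the in-process upstream started with httptest are assumptions for illustration; the demo issues one normal request and one whose context is already canceled.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newProxy returns a handler that forwards to upstream using the incoming
// request's context, so cancellation reaches the outbound call.
func newProxy(upstream string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, upstream, nil)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			http.Error(w, "upstream unavailable", http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		io.Copy(w, resp.Body)
	}
}

// demo returns the proxy's status code for a live request and for a request
// whose context was canceled before the handler ran.
func demo() (okStatus, canceledStatus int) {
	up := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "data")
	}))
	defer up.Close()
	h := newProxy(up.URL)

	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	okStatus = rec.Code

	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	rec = httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/", nil).WithContext(ctx))
	canceledStatus = rec.Code
	return okStatus, canceledStatus
}

func main() {
	ok, canceled := demo()
	fmt.Println(ok, canceled)
}
```

In a real service you would usually also give the client its own Timeout, so a slow upstream is bounded even when the caller stays connected.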

Anti-pattern 5 — Closing channels from many goroutines¶

var ch = make(chan int)

func handler(w http.ResponseWriter, r *http.Request) {
    ch <- 42
    close(ch) // many handlers will panic on second close
}

Channels must be closed exactly once. From handler code, closing a shared channel is almost always wrong. The owning goroutine (often the receiver) should close.

Anti-pattern 6 — Mutating `r.Header` from middleware after handler started¶

func middleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        next.ServeHTTP(w, r)
        r.Header.Set("X-Logged", "true") // pointless, and a race if next left a goroutine reading r.Header
    })
}

After next.ServeHTTP returns, modifying r.Header does nothing observable. Worse, if next spawned a goroutine still reading headers, you have a race. Don't mutate.

Anti-pattern 7 — Goroutine pools "to limit concurrency"¶

var pool = make(chan struct{}, 100)

func handler(w http.ResponseWriter, r *http.Request) {
    pool <- struct{}{}
    defer func() { <-pool }()
    // ...
}

This blocks the handler's goroutine waiting for the pool. The server already spawned the goroutine; you didn't save anything. The goroutine still exists, holding a stack and *Request. To actually limit concurrency, return early (503) when full:

select {
case pool <- struct{}{}:
    defer func() { <-pool }()
    // do work
default:
    http.Error(w, "too busy", 503)
    return
}
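The fragment above can be wrapped as reusable middleware. This is a sketch under assumptions: limit and demo are hypothetical names, and the demo occupies the only slot with a blocked handler so that a second request is rejected.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// limit (hypothetical helper) answers 503 once max handlers are in flight.
func limit(max int, next http.Handler) http.Handler {
	slots := make(chan struct{}, max)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case slots <- struct{}{}:
			defer func() { <-slots }()
			next.ServeHTTP(w, r)
		default:
			http.Error(w, "too busy", http.StatusServiceUnavailable)
		}
	})
}

// demo fills the single slot, sends a second request, and returns the
// status code that second request received.
func demo() int {
	entered := make(chan struct{})
	release := make(chan struct{})
	slow := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		close(entered)
		<-release
	})
	h := limit(1, slow)

	done := make(chan struct{})
	go func() {
		defer close(done)
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	}()
	<-entered // the first request now holds the only slot

	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	close(release)
	<-done
	return rec.Code
}

func main() {
	fmt.Println(demo())
}
```

Because the rejected request returns immediately, its goroutine and buffers are released instead of queuing behind the busy handlers.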

Diving slightly deeper: how the accept loop actually iterates¶

The real accept loop is (*Server).Serve in net/http/server.go, around line 3260, though the exact position shifts between Go releases. Simplified:

func (srv *Server) Serve(l net.Listener) error {
    var tempDelay time.Duration
    ctx := context.WithValue(baseCtx, ServerContextKey, srv)
    for {
        rw, err := l.Accept()
        if err != nil {
            select {
            case <-srv.getDoneChan():
                return ErrServerClosed
            default:
            }
            if ne, ok := err.(net.Error); ok && ne.Temporary() {
                if tempDelay == 0 {
                    tempDelay = 5 * time.Millisecond
                } else {
                    tempDelay *= 2
                }
                if tempDelay > time.Second {
                    tempDelay = time.Second
                }
                srv.logf("Accept error: %v; retrying in %v", err, tempDelay)
                time.Sleep(tempDelay)
                continue
            }
            return err
        }
        tempDelay = 0
        c := srv.newConn(rw)
        c.setState(c.rwc, StateNew, runHooks)
        go c.serve(ctx)
    }
}

Junior takeaways:
- Accept returns an error if the listener is closed; the server checks doneChan to distinguish shutdown from a real error.
- On a temporary error (e.g., "too many open files"), the loop backs off exponentially (5ms, 10ms, 20ms, ..., capped at 1s) rather than spinning.
- setState and newConn are bookkeeping; go c.serve(ctx) is the actual goroutine spawn.

Don't worry about the details. The pattern is "Accept → spawn goroutine → repeat."
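The backoff arithmetic from the simplified loop can be checked on its own. In this sketch, nextDelay and schedule are hypothetical helpers that reproduce only the delay calculation, not the accept loop.

```go
package main

import (
	"fmt"
	"time"
)

// nextDelay mirrors the backoff arithmetic in the simplified loop above:
// start at 5ms, double on each consecutive failure, cap at 1s.
func nextDelay(prev time.Duration) time.Duration {
	if prev == 0 {
		return 5 * time.Millisecond
	}
	next := prev * 2
	if next > time.Second {
		next = time.Second
	}
	return next
}

// schedule returns the first n delays in milliseconds.
func schedule(n int) []int64 {
	out := make([]int64, 0, n)
	var d time.Duration
	for i := 0; i < n; i++ {
		d = nextDelay(d)
		out = append(out, d.Milliseconds())
	}
	return out
}

func main() {
	fmt.Println(schedule(10)) // [5 10 20 40 80 160 320 640 1000 1000]
}
```

The cap is reached on the ninth consecutive failure, and a successful Accept resets the delay to zero.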

When NOT to use `net/http`¶

net/http is the right answer for 95% of HTTP services. But there are edge cases:

Extreme throughput (millions of req/s on a single box). fasthttp (github.com/valyala/fasthttp) reports substantially higher throughput in its own benchmarks, largely by reusing request/response structs and not following the net/http API contracts. Trade-off: its API is incompatible with net/http, so handlers and middleware written for net/http need adapting.
HTTP/3 (QUIC). Use quic-go; net/http only supports h1/h2.
Very specific TLS needs. net/http exposes most via tls.Config; for unusual cases (custom ALPN handlers, exotic ciphers), you may want a lower-level TLS layer.
Embedded systems. A toy server with net.Conn and hand-written parsing is smaller.

For a normal API server, web service, or proxy, net/http is correct.

A note on `http.ServeMux`¶

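http.ServeMux is a simple router. Dispatching requests to registered handlers is safe from many goroutines at once. Registration is guarded by the mux's internal lock, but a route registered after the server has started can be missed by requests that arrive earlier, so register all handlers at startup, before serving.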

mux := http.NewServeMux()
mux.HandleFunc("/", handler)        // register at startup
mux.HandleFunc("/users", users)
// don't call HandleFunc after Server starts!

srv := &http.Server{Addr: ":8080", Handler: mux}
go srv.ListenAndServe()

For richer routing (path params, regex), use chi, gorilla/mux, or in Go 1.22+ the enhanced net/http route patterns:

mux.HandleFunc("GET /users/{id}", getUser) // Go 1.22+

This doesn't change concurrency; routing happens per request and dispatches to your handler the same way.

Recap¶

You now know:
- Goroutine-per-connection: one goroutine per accepted TCP conn, handling many requests (HTTP/1.1 keep-alive).
- http.ResponseWriter and r.Body are bound to the handler goroutine; never share with spawned goroutines.
- r.Context() cancels when the client disconnects or the handler returns.
- Goroutines from handlers must observe cancellation and not write to w after return.
- Shared state across handlers needs sync.Mutex/sync.RWMutex/sync/atomic or immutability.
- Server timeouts (ReadHeaderTimeout, ReadTimeout, WriteTimeout, IdleTimeout) are mandatory in production.
- Server.Shutdown(ctx) for graceful shutdown; Server.Close() for forceful.
- go test -race catches concurrent-access bugs.

The next file, middle.md, opens up net/http/server.go and walks through (*conn).serve line by line. You'll see exactly which goroutines the server creates, how the active-connection map works, and the precise sequence of state transitions.
