8.6 bufio — Senior¶
Audience. You've shipped Go services that depend on bufio and you've been bitten by at least one of: a forgotten Flush, an ErrTooLong that ate a token, or a ReadSlice that handed you stale bytes. This file is the precise contract: what each method guarantees, what it does not, and the systems-level details that separate "passes the test" from "behaves under load."
Prerequisites: middle.md and the Reader/Writer/Closer contracts in ../01-io-and-file-handling/senior.md.
1. The bufio.Reader invariants¶
The internal state of a bufio.Reader is, simplified:
type Reader struct {
	buf          []byte
	rd           io.Reader
	r, w         int   // buf[r:w] is unread data
	err          error // pending read error: reported once, then cleared
	lastByte     int   // last byte read, for UnreadByte; -1 means invalid
	lastRuneSize int   // size of last rune read, for UnreadRune; -1 means invalid
}
- 0 <= r <= w <= len(buf).
- buf[r:w] is the bytes that have been read from rd but not yet consumed by the caller.
- err is a pending error, not a sticky one: the reader returns it to the caller once, after the buffered bytes are consumed, and then clears it. Reset also clears it.
- r == w means the buffer is empty and the next user read triggers a fill from rd.
Implications:
- Buffered bytes are gone from rd's view. Once bufio.Reader has called rd.Read and pulled bytes into buf, those bytes are the bufio reader's; reading directly from rd skips them.
- Read errors are deferred, not sticky. An error from rd is stored while buf[r:w] still holds data, returned once the buffer is drained, and then forgotten, so a later read goes back to rd. A transient error therefore does not poison the bufio.Reader (bufio.Writer is the opposite; see section 9). Reset(rd) (or Reset(other)) also clears a pending error, but it discards any buffered bytes along with it.
- UnreadByte and UnreadRune only work once. UnreadByte steps back over the last byte of the most recent read operation; UnreadRune works only when the most recent operation was ReadRune. A second UnreadByte without an intervening read returns bufio.ErrInvalidUnreadByte.
2. Why Read may return less than the buffer holds¶
(*bufio.Reader).Read(p []byte) does not always fill p. The implementation:
1. If buf[r:w] is non-empty, copy up to len(p) bytes from there into p and return.
2. Otherwise, if len(p) >= len(buf), bypass the buffer and call rd.Read(p) directly. Return whatever that produced.
3. Otherwise, fill buf with a single rd.Read(buf) call, then copy from buf into p.
In case 2 (large reads), the bufio buffer is bypassed. This is the right behaviour: buffering only helps when the user reads in small chunks. With an empty buffer, a 1 MiB Read call gets one underlying Read and copies zero bytes through bufio's buffer.
In case 1 (buffered data), you get whatever happens to be there, possibly less than len(p). That's fine if the caller is io.Copy, io.ReadFull, or any other helper that loops. It is a bug if the caller assumes one Read fills p.
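The short-read case can be reproduced in a few lines. The sketch below is illustrative only: the function name shortThenFull and the 16-byte buffer are choices made for the demo, not anything the package prescribes.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// shortThenFull (hypothetical helper) shows one plain Read coming back short,
// followed by an io.ReadFull call that loops until its slice is full.
func shortThenFull() (short int, rest string, err error) {
	// 16 bytes is the smallest buffer bufio will use.
	br := bufio.NewReaderSize(strings.NewReader("abcdefghijklmnopqrstuvwxyz"), 16)

	// Consume one byte so that 15 bytes are left sitting in the buffer.
	if _, err = br.ReadByte(); err != nil {
		return 0, "", err
	}

	// Case 1: the buffer is non-empty, so Read returns only what is buffered,
	// even though p has room for 20 bytes.
	p := make([]byte, 20)
	if short, err = br.Read(p); err != nil {
		return short, "", err
	}

	// io.ReadFull keeps calling Read until q is full or an error occurs.
	q := make([]byte, 10)
	if _, err = io.ReadFull(br, q); err != nil {
		return short, "", err
	}
	return short, string(q), nil
}

func main() {
	short, rest, err := shortThenFull()
	fmt.Println(short, rest, err)
}
```

When the caller needs exactly len(p) bytes, io.ReadFull is usually the simplest way to get them.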
3. The ErrNoProgress watchdog¶
If an underlying io.Reader returns (0, nil) repeatedly, bufio.Reader gives up after 100 such calls and returns io.ErrNoProgress. This is a defence against buggy readers that would otherwise cause bufio to spin forever.
// Inside bufio.Reader.fill (paraphrased):
for i := maxConsecutiveEmptyReads; i > 0; i-- {
	n, err := b.rd.Read(b.buf[b.w:])
	b.w += n
	if err != nil {
		b.err = err
		return
	}
	if n > 0 {
		return
	}
}
b.err = io.ErrNoProgress
You should never see this in production code with well-behaved readers. If you do, the underlying reader is broken — file a bug or fix the producer.
4. ReadSlice and the buffer-full case¶
ReadSlice(delim) returns a slice from the internal buffer, up to and including delim. The error story is more nuanced than the junior level admitted:
- Delimiter found in buffer: (slice, nil).
- Delimiter not found, but buffer not full: bufio.Reader calls fill on the underlying reader and tries again.
- Delimiter not found and buffer full: (slice, bufio.ErrBufferFull) where slice is the entire buffer contents.
- Underlying reader hit EOF before finding delimiter: (slice, io.EOF) where slice is whatever was read (possibly empty).
In every case the returned slice aliases the internal buffer and is valid only until the next read; this is where "stale bytes" come from.
The buffer-full case is recoverable: the returned slice is the first part of the record, those bytes are consumed, and further ReadSlice calls return the following parts until the delimiter appears. Copy each part before the next call. If you want whole records in one slice, size the buffer up front with NewReaderSize, because an existing Reader's buffer cannot be grown. ReadString and ReadBytes handle this internally by copying slice into a growing buffer and looping.
5. ReadLine — the inconvenient primitive¶
ReadLine is the low-level line primitive. (Scanner does not call it; Scanner keeps its own buffer and split function.) Its signature:
func (b *Reader) ReadLine() (line []byte, isPrefix bool, err error)
- line is a slice into the internal buffer (no allocation).
- isPrefix is true if the line is longer than the buffer; the return is the first chunk and the rest comes on subsequent calls.
- err follows the standard EOF rules.
ReadLine strips the trailing \r?\n from the returned bytes. To preserve it, use ReadBytes('\n') instead.
The isPrefix story is what makes ReadLine painful for direct use: you have to glue chunks together yourself, and the slice is invalid after the next call. Most code that wants line semantics goes through Scanner (which handles the gluing) or ReadString('\n') (which allocates a complete line).
6. The Scanner token-loss bug¶
bufio.Scanner with default settings has a 64 KiB token cap (bufio.MaxScanTokenSize). When a token exceeds it, Scan returns false and Err() returns bufio.ErrTooLong. The scanner does not skip the oversized token and carry on: a for s.Scan() loop ends right there, so the offending token and everything after it go unread. There is no way to recover them from the scanner.
// File has lines: "ok\n<70 KiB blob>\nalso ok\n"
s := bufio.NewScanner(f)
for s.Scan() {
	fmt.Println(s.Text())
}
err := s.Err()
// Output: "ok\n", err = bufio.ErrTooLong
// The 70 KiB blob and "also ok" were never delivered.
Worse: a loop that skips the s.Err() check cannot tell this apart from a clean end of input, so the truncation goes unnoticed.
If your input might have long tokens:
- Raise the cap with s.Buffer(make([]byte, 0, init), max).
- Or switch to bufio.Reader.ReadBytes('\n') for unbounded lines.
- Or split with a custom SplitFunc that explicitly handles length-prefixed records, where length validation is the parser's responsibility.
Never silently retry past ErrTooLong and assume you got the data.
7. The Scanner "no progress" panic¶
The panic has a narrow trigger. If a SplitFunc, called with atEOF == true, keeps returning a non-nil token (an empty one counts) while advancing by zero bytes, the scanner counts those calls and, after more than 100 in a row, panics with "bufio.Scan: too many empty tokens without progressing".
Returning (0, nil, nil) means "I need more data", and it is correct only while more bytes might arrive. With atEOF == true and len(data) == 0 it is the normal way to finish. With atEOF == true and len(data) > 0 it does not panic, but it is still a bug: the scanner stops and the trailing data is silently dropped. At EOF, either yield the trailing data or return an error.
// CORRECT
if atEOF && len(data) > 0 {
	return len(data), data, nil
}
return 0, nil, nil
// WRONG: with data left at EOF, the trailing bytes are silently dropped
return 0, nil, nil
8. ErrFinalToken semantics¶
bufio.ErrFinalToken is a sentinel that lets a SplitFunc end scanning cleanly with one last token:
return advance, token, bufio.ErrFinalToken
When the scanner sees this:
- If token != nil, it yields the token (that Scan call returns true).
- The next Scan call returns false, and Err() returns nil.
This is the only way to stop scanning early and still deliver a final token. Returning (advance, token, io.EOF) is wrong because the scanner discards the token whenever the split function returns any other non-nil error: Scan returns false at once, and Err() reports nil (it maps io.EOF to nil), so the loss is silent.
9. The bufio.Writer invariants¶
The internal state, simplified:
type Writer struct {
	err error // sticky write error
	buf []byte
	n   int // bytes used in buf
	wr  io.Writer
}
- 0 <= n <= len(buf).
- buf[:n] is the buffered data not yet sent to wr.
- err is sticky: once any underlying Write fails, all subsequent writes return the same error. Reset clears it.
The Flush method sends buf[:n]. If the underlying Write returns m < n with no error, Flush turns that into io.ErrShortWrite. Either way the writer is poisoned: err is set, the unsent bytes are moved to the front of buf, n is reduced by the number of bytes actually sent, and every later Write or Flush returns the same error without calling wr.
10. Flush is not "drain"¶
A successful Flush means the buffered bytes were handed to the underlying writer. It does not mean those bytes hit the disk, the network, or any final destination. For files, you need f.Sync() after bw.Flush() (see ../01-io-and-file-handling/senior.md section 4–5).
For TCP, Flush returns when the bytes are in the kernel send buffer. The peer might not have received them; the network might be down; the connection might already be closed. Flush on a writer over a closed TCP connection typically returns a *net.OpError wrapping EPIPE or similar.
11. Available and AvailableBuffer¶
func (b *Writer) Available() int // = len(b.buf) - b.n
func (b *Writer) AvailableBuffer() []byte // = b.buf[b.n:b.n] (Go 1.18+)
Available() is the free byte count. AvailableBuffer() returns the same memory as a zero-length slice with that capacity, suitable for append calls that fill the buffer in place.
The contract for AvailableBuffer:
- The returned slice is borrowed from the writer's buffer.
- Any other operation on the writer (Write, Flush, WriteByte, etc.) invalidates the slice.
- The caller is expected to append to the slice and pass the result to b.Write immediately. Write has no special case for this; its copy simply finds source and destination at the same address, so the bytes are not staged anywhere else and only n advances.
buf := bw.AvailableBuffer()
buf = strconv.AppendInt(buf, x, 10)
buf = append(buf, '\n')
bw.Write(buf) // bytes are already in place; Write advances b.n by len(buf)
If the appends grow buf past Available(), append allocates a new backing array and Write falls back to a normal copy, flushing first if it has to. That's correct but loses the optimisation; size the buffer to fit your typical record.
12. Buffered and partial flushes¶
Buffered() returns n, the number of bytes currently buffered. After a successful Flush, n == 0. After a partial flush (the underlying Write accepted only m < n bytes), the unsent bytes buf[m:n] are shifted left to start at zero and n becomes n - m.
So an io.ErrShortWrite does not erase the unwritten suffix: it is still in the buffer and Buffered() still counts it. But the error is sticky, so a second Flush does not retry; it returns the stored error without calling the underlying writer, and Reset would discard the bytes. Treat any Flush error as the end of that bufio.Writer: report it and, if the destination can be recovered, rewrite the affected data through a fresh writer.
13. ReadFrom short-circuits the buffer¶
(*bufio.Writer).ReadFrom(r io.Reader) is the fast path io.Copy takes when the destination is a bufio.Writer and the source has no WriteTo of its own.
The implementation in current Go releases, simplified:
- If the buffer is full, flush it.
- If the underlying writer also implements ReaderFrom and the buffer is empty, hand off to it for the rest of the copy.
- Otherwise read from r directly into the buffer's free space, and loop.
- At EOF, whatever is still in the buffer stays there until the caller flushes.
The point: large copies don't double-buffer. The bytes do go through bufio's buffer briefly, but they are not split into 4 KiB chunks and re-assembled: the buffer is filled to capacity, written in one call, then refilled. A ReadFrom (or io.Copy) into a bufio.Writer still needs a Flush afterwards.
14. WriteTo on bufio.Reader¶
(*bufio.Reader).WriteTo(w io.Writer) is the symmetric fast path for io.Copy(w, br):
- Send buf[r:w] to w first (drain buffered bytes).
- If the underlying reader implements WriterTo, defer to it.
- Otherwise, if w implements ReaderFrom, hand the underlying reader to it.
- Otherwise, loop: read into buf and write to w.
The total bytes returned is the sum across all phases. The first error from w.Write or rd.Read stops the loop and is returned (EOF is normalised to nil on the return).
15. The full method-by-method allocation table¶
| Method | Allocates? | Notes |
|---|---|---|
| bufio.NewReader(r) | Yes (4096-byte buffer + Reader struct) | Once per call |
| bufio.NewReaderSize(r, n) | Yes, unless r is already a *bufio.Reader of size >= n | In that case r is returned as-is |
| Reader.Read(p) | No | Possibly bypasses buffer if len(p) >= bufsize |
| Reader.ReadByte | No | |
| Reader.UnreadByte | No | |
| Reader.ReadRune | No | |
| Reader.UnreadRune | No | |
| Reader.ReadSlice(delim) | No | Slice into buffer |
| Reader.ReadBytes(delim) | Yes (one per call) | Copy of slice |
| Reader.ReadString(delim) | Yes (one string per call) | |
| Reader.ReadLine | No | Slice into buffer |
| Reader.Peek(n) | No | Slice into buffer |
| Reader.Discard(n) | No | |
| Reader.Buffered | No | |
| Reader.Reset(r) | No | |
| bufio.NewWriter(w) | Yes | |
| Writer.Write(p) | No | |
| Writer.WriteByte(c) | No | |
| Writer.WriteRune(r) | No | |
| Writer.WriteString(s) | No | |
| Writer.AvailableBuffer | No | Slice into buffer |
| Writer.Flush | No | |
| Writer.Reset(w) | No | |
| Writer.ReadFrom(r) | No | Buffer-only |
| bufio.NewScanner(r) | Yes (Scanner struct, no buffer yet) | |
| Scanner.Scan | No in steady state | First call allocates the buffer; growing it for a longer token reallocates |
| Scanner.Bytes | No | Slice into buffer |
| Scanner.Text | Yes (one string per call) | |
| Scanner.Buffer(buf, max) | No | Installs the caller's buffer; Scan allocates later only if it has to grow past cap(buf) |
16. Reset semantics¶
For both Reader and Writer, Reset(x):
- Clears err to nil.
- Resets r, w (Reader) or n (Writer) to zero.
- Discards any buffered bytes; they are lost.
- Reuses the existing internal buffer.
For Reader, this means: after br.Reset(other), any unread bytes that were buffered from the previous source are gone. Don't Reset in the middle of a stream and expect the bytes to come back.
For Writer, this is more dangerous: any unflushed bytes in the buffer are silently dropped. Reset does not flush. If you forget to Flush before Reset, you lose data without an error.
// CORRECT
if err := bw.Flush(); err != nil {
	return err
}
bw.Reset(newWriter)
// WRONG: silently discards bw's pending bytes
bw.Reset(newWriter)
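The data loss is easy to demonstrate. In the sketch below (resetWithoutFlush is a hypothetical name) the first destination never receives anything, and no call reports an error.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// resetWithoutFlush writes to one destination, resets onto another without
// flushing, and returns what each destination ended up with.
func resetWithoutFlush() (first, second string) {
	var a, b bytes.Buffer
	bw := bufio.NewWriter(&a)
	bw.WriteString("pending") // sits in bw's buffer, not yet in a
	bw.Reset(&b)              // drops "pending" without any error
	bw.WriteString("kept")
	bw.Flush()
	return a.String(), b.String()
}

func main() {
	first, second := resetWithoutFlush()
	fmt.Printf("%q %q\n", first, second)
}
```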
17. Scanner cannot be reused¶
bufio.Scanner does not have a Reset method. Once it has scanned a source to completion (or to error), you must allocate a new bufio.Scanner to scan another source. Pooling scanners is harder than pooling readers/writers — it requires managing the underlying bufio.Reader separately and reconstructing the scanner around it.
In practice, the cost of allocating a bufio.Scanner is small (the struct is roughly 128 bytes on 64-bit platforms, plus the buffer it lazily allocates). Don't bend over backwards to pool them; pool the underlying *os.File / net.Conn instead.
18. Concurrency, exactly¶
| Type | Safe for concurrent calls? |
|---|---|
| *bufio.Reader | No. All methods touch shared state. |
| *bufio.Writer | No. All methods touch shared state. |
| *bufio.Scanner | No. |
| *bufio.ReadWriter | No (it's just two unsafe values). |
The standard rule: one bufio.* value per goroutine. If you need to share a connection across goroutines, build a worker pattern — one goroutine owns the bufio reader and dispatches messages via channels.
The Reset method is not a synchronisation point. Calling Reset from one goroutine while another is mid-Read is a data race that the Go race detector will report.
19. bufio.Writer.WriteRune and the \xC0 trap¶
WriteRune(r) writes utf8.RuneLen(r) bytes for valid runes. For r < 0, r > utf8.MaxRune, or a surrogate half (U+D800 to U+DFFF), none of which can be encoded as UTF-8, it writes the UTF-8 encoding of utf8.RuneError (which is \xEF\xBF\xBD, the Unicode replacement character). It does not return an error for an invalid rune; the substitution is silent.
If you need strict validation, check with utf8.ValidRune first.
20. bufio.Reader.ReadRune and surrogate pairs¶
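ReadRune returns one rune per call. If the source contains an invalid UTF-8 sequence, ReadRune returns (utf8.RuneError, 1, nil): one byte consumed, the replacement rune yielded. This is the same behaviour as utf8.DecodeRune. The size of 1 is how you tell a decoding failure apart from a genuine U+FFFD in the input, which has size 3.
UnreadRune works after any ReadRune, including one that returned RuneError for an invalid byte; it steps back by the size ReadRune reported. It returns bufio.ErrInvalidUnreadRune only when the most recent operation on the reader was not a ReadRune.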
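A quick probe of the invalid-byte case, written as a sketch with a hypothetical helper name (probeInvalid):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// probeInvalid reads one rune from input that starts with the invalid byte
// 0xFF, unreads it, and then reads the same position again as a raw byte.
func probeInvalid() (r rune, size int, unreadErr error, again byte) {
	br := bufio.NewReader(strings.NewReader("\xffA"))
	r, size, _ = br.ReadRune()   // replacement rune, one byte consumed
	unreadErr = br.UnreadRune()  // steps back by that one byte
	again, _ = br.ReadByte()     // the original 0xFF is still there
	return r, size, unreadErr, again
}

func main() {
	r, size, unreadErr, again := probeInvalid()
	fmt.Printf("%U %d %v %#x\n", r, size, unreadErr, again)
}
```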
21. Scanner.Buffer ordering rule¶
Scanner.Buffer(buf, max) must be called before the first Scan. After the first Scan, calling Buffer panics with "Buffer called after Scan." The reason: the scanner has already started using its internal buffer, and reseating it mid-scan would lose buffered data.
The same rule applies to Scanner.Split. Configure your scanner fully, then start the loop.
22. The interaction of bufio.Writer and O_APPEND¶
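A file opened with os.O_APPEND makes every kernel write(2) position itself at end-of-file before writing, and POSIX requires the seek and the write to happen as one atomic step. On local filesystems this means concurrent appenders interleave at write(2) granularity, not byte granularity. Two caveats: the often-quoted PIPE_BUF limit (4 KiB on Linux) is formally a guarantee for pipes and FIFOs, not regular files, and network filesystems such as NFS do not honour atomic append.
A bufio.Writer over an O_APPEND file batches writes, and that changes where the write(2) boundaries fall. Flush hands the whole buffer to the file in one Write call, which is normally one syscall; if the kernel accepts only part of it, os.File issues further write(2) calls for the rest, and other appenders can land in between. More importantly, bufio flushes whenever its buffer fills, wherever that happens to be, so a record that straddles the fill point is split across two syscalls.
For multi-process log appending, call Flush after each complete record and keep records smaller than bufio.Writer.Size(); keeping both at or below 4 KiB is a conservative choice. Or skip bufio and write directly with O_APPEND, accepting one syscall per record.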
23. bufio.Writer does not implement Close¶
There is no Close method on *bufio.Writer. Code that does bw.Close() does not compile. The decision is intentional: bufio doesn't own the underlying writer; it only buffers. Closing the underlying is the caller's responsibility, after Flush.
If you compose a bufio.Writer into a struct that wraps a closeable, your struct's Close method is the right place to do Flush(); underlying.Close(). The stdlib wrappers stop one step short of that: gzip.Writer.Close and zip.Writer.Close flush their own pending data but leave the underlying writer open, and csv.Writer has only Flush.
24. The order of operations for a layered output stack¶
Close order, in defer (LIFO):
f, _ := os.Create("out.json.gz")
defer f.Close() // 4: closes file
bw := bufio.NewWriter(f)
defer bw.Flush() // 3: pushes bytes to f
gz := gzip.NewWriter(bw)
defer gz.Close() // 2: writes gzip trailer to bw
enc := json.NewEncoder(gz) // no Close needed
// ... encode ...
Reverse the encoder (1) → gzip (2) → bufio (3) → file (4) order and the file ends up missing the gzip trailer or the last few KiB of data. Get the order right, check every error, and the layered stack works.
The correct version with error checking:
func writeJSONGZ(path string, items []Item) (err error) {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	// The file is closed last; keep the first error seen.
	defer func() {
		if cerr := f.Close(); err == nil {
			err = cerr
		}
	}()
	bw := bufio.NewWriter(f)
	gz := gzip.NewWriter(bw)
	enc := json.NewEncoder(gz)
	for _, it := range items {
		if err = enc.Encode(it); err != nil {
			return err
		}
	}
	// The gzip trailer goes into bw first; then bw is flushed into f.
	if err = gz.Close(); err != nil {
		return err
	}
	return bw.Flush()
}
25. What to read next¶
- professional.md — large-scale production patterns.
- specification.md — every method and error in tables.
- optimize.md — measured guidance on buffer sizes.
- find-bug.md — drills targeting items in this file.
- The bufio package source is small and worth reading. Start at src/bufio/bufio.go.