8.12 The encoding Family — Find the Bug¶
Each section presents a short snippet that looks fine. There's a bug. The bug is one of the traps from junior.md, middle.md, or senior.md. Find it before reading the explanation.
Bug 1: Truncated base64 file¶
func encodeFile(src, dst string) error {
in, err := os.Open(src)
if err != nil { return err }
defer in.Close()
out, err := os.Create(dst)
if err != nil { return err }
defer out.Close()
enc := base64.NewEncoder(base64.StdEncoding, out)
if _, err := io.Copy(enc, in); err != nil {
return err
}
return nil
}
What's wrong?
Bug: enc.Close() is never called. base64.NewEncoder buffers up to 2 trailing source bytes; without Close they're lost, and the file is missing its final group plus padding.
Fix:
defer in.Close()
defer out.Close()
enc := base64.NewEncoder(base64.StdEncoding, out)
defer enc.Close()
if _, err := io.Copy(enc, in); err != nil {
return err
}
return enc.Close() // explicit close to surface the error
The defer-and-explicit-close pattern: defer for cleanup on panic/error path, explicit close for the happy path so the error is checked. (Defers also run in LIFO order — enc.Close will run before out.Close, which is what you want.)
Bug 2: CSV with European decimals¶
func sumColumn(r io.Reader, col int) (float64, error) {
cr := csv.NewReader(r)
var sum float64
for {
rec, err := cr.Read()
if err == io.EOF { break }
if err != nil { return 0, err }
v, err := strconv.ParseFloat(rec[col], 64)
if err != nil { return 0, err }
sum += v
}
return sum, nil
}
The input is a German bank export with ; as the field separator and , as the decimal point: "1.234,56;Schmidt". What goes wrong?
Bug 1: The default Comma is ,, so the parser splits at every comma — including the one inside 1.234,56.
Bug 2: strconv.ParseFloat doesn't understand , as the decimal point. Even if you fix the separator, the values won't parse.
Fix:
cr := csv.NewReader(r)
cr.Comma = ';'
// ...
v, err := strconv.ParseFloat(strings.Replace(rec[col], ",", ".", 1), 64)
Or use a localization library (golang.org/x/text/message) for proper locale-aware parsing.
Bug 3: XML round-trip changes the document¶
type Item struct {
XMLName xml.Name `xml:"item"`
ID string `xml:"id,attr"`
Note string `xml:",chardata"`
}
func roundtrip(src []byte) ([]byte, error) {
var i Item
if err := xml.Unmarshal(src, &i); err != nil {
return nil, err
}
return xml.Marshal(i)
}
Input: <item id="42">Hello & goodbye</item>.
What's the output?
Bug: Output is <item id="42">Hello & goodbye</item> — which is correct for the literal string but not for the encoded representation.
Wait, that's actually fine. Let me reconsider.
The bug is more subtle. The input has &; on Unmarshal, Note becomes the decoded string "Hello & goodbye". On Marshal, the encoder re-escapes the & to &. So the round-trip is byte-equivalent for this case.
Now try input <item id="42">Hello <![CDATA[<world>]]></item>.
Note becomes "Hello <world>" — the CDATA section is decoded. On marshal, the encoder produces <item id="42">Hello <world></item> — escaped, no CDATA. The character data round-trips, but the syntactic form (CDATA section) is lost.
Fix: there isn't a clean one with the structured marshaler. If you must preserve CDATA literally, use ,innerxml to capture the raw inner content:
type Item struct {
XMLName xml.Name `xml:"item"`
ID string `xml:"id,attr"`
Inner string `xml:",innerxml"` // raw bytes, no decode/encode
}
Inner will hold the literal Hello <![CDATA[<world>]]> bytes, ready to be re-emitted unchanged.
Bug 4: Gob across a refactor¶
// In service v1, package "github.com/example/old":
package old
type Login struct{ User string }
func init() { gob.Register(Login{}) }
// In service v2, package was moved:
package new
type Login struct{ User string }
func init() { gob.Register(Login{}) }
The v2 service can't decode v1 messages. Why?
Bug: gob.Register(Login{}) uses the type's package-qualified name as the registry key. v1 registers "github.com/example/old.Login" and v2 registers "github.com/example/new.Login". The names differ, so the v2 decoder doesn't recognize the v1 wire form.
Fix: use gob.RegisterName with a stable key in both versions:
Now both register the same name and the wire form is portable. If v1 has already shipped with the auto-generated name, you must register both names on v2:
Bug 5: ReuseRecord trap¶
func loadRows(r io.Reader) ([][]string, error) {
cr := csv.NewReader(r)
cr.ReuseRecord = true
var out [][]string
for {
rec, err := cr.Read()
if err == io.EOF { break }
if err != nil { return nil, err }
out = append(out, rec)
}
return out, nil
}
Caller observes that every entry in out contains the same data as the last record. Why?
Bug: With ReuseRecord = true, cr.Read() returns the same slice header (and the same backing array) on every call. Every out = append(out, rec) pushes a slice that aliases the same memory. After the loop, all entries point to the last record's data.
Fix: copy the record:
Or turn ReuseRecord off (the default) and accept the per-record allocation.
Bug 6: PEM block type¶
func loadCert(b []byte) (*x509.Certificate, error) {
block, _ := pem.Decode(b)
if block == nil {
return nil, errors.New("not PEM")
}
return x509.ParseCertificate(block.Bytes)
}
User uploads a private key file. What's the error message?
Bug: The function doesn't check block.Type. If the user uploads -----BEGIN RSA PRIVATE KEY-----, the function tries to parse the key bytes as an x509 certificate, returning an opaque error like x509: malformed certificate instead of the helpful "that's not a certificate."
Fix:
block, _ := pem.Decode(b)
if block == nil {
return nil, errors.New("not PEM")
}
if block.Type != "CERTIFICATE" {
return nil, fmt.Errorf("expected CERTIFICATE, got %q", block.Type)
}
return x509.ParseCertificate(block.Bytes)
Bug 7: Length-prefix DOS¶
func readMessage(r io.Reader) ([]byte, error) {
var hdr [4]byte
if _, err := io.ReadFull(r, hdr[:]); err != nil {
return nil, err
}
n := binary.BigEndian.Uint32(hdr[:])
buf := make([]byte, n)
if _, err := io.ReadFull(r, buf); err != nil {
return nil, err
}
return buf, nil
}
What goes wrong with a malicious peer?
Bug: The peer sends 0xFFFFFFFF (4 GiB) and the function allocates a 4 GiB buffer before realizing the peer can't actually deliver that many bytes. Memory blow-up.
Fix: cap the length before allocating:
const maxMessageSize = 16 << 20 // 16 MiB
n := binary.BigEndian.Uint32(hdr[:])
if n > maxMessageSize {
return nil, fmt.Errorf("message %d > max %d", n, maxMessageSize)
}
buf := make([]byte, n)
Bug 8: XML omitempty on time¶
type Audit struct {
Created time.Time `xml:"created,omitempty"`
Updated time.Time `xml:"updated,omitempty"`
}
a := Audit{Updated: time.Now()}
b, _ := xml.Marshal(a)
The user expected Created to be absent. Output?
Bug: XML's ,omitempty (like JSON's) doesn't omit time.Time{} because time.Time is a struct, not a scalar. The output:
Both fields are present.
Fix: use a pointer:
type Audit struct {
Created *time.Time `xml:"created,omitempty"`
Updated *time.Time `xml:"updated,omitempty"`
}
Now nil pointers are omitted.
Bug 9: Hex of a hash¶
What does the developer expect, and what do they get?
Bug: Two issues:
hex.Encodereturns the number of bytes written but the result variable is discarded. The caller has no way to access the encoded output.- The destination buffer is only 32 bytes, but hex of 32 input bytes needs 64 output bytes.
hex.Encodepanics with "out of range" or returns a number larger than the buffer (depending on which slice bounds check trips first).
Fix:
sum := sha256.Sum256([]byte("hello"))
fmt.Println(hex.EncodeToString(sum[:]))
// 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
hex.EncodeToString allocates the right-sized buffer for you.
Bug 10: Gob send-the-concrete¶
func send(enc *gob.Encoder, e Event) error {
return enc.Encode(e)
}
func recv(dec *gob.Decoder) (Event, error) {
var e Event
err := dec.Decode(&e)
return e, err
}
Event is an interface; Login implements it. Sending works, but recv returns gob: name not registered for interface: "". Why?
Bug: enc.Encode(e) passes the dynamic Login value, not an interface — gob sees a concrete type and writes it as a Login, without recording the dynamic type name. The decoder, expecting an interface, finds no type name and fails.
Fix: pass &e (a pointer to the interface):
Gob's rule: for the polymorphic interface path, encode a pointer to an interface value. The pointer-to-interface preserves the "this is an interface" type information; the bare interface value is dereferenced to its concrete type.
Bug 11: Base64 alphabet mix¶
func decodeJWT(token string) ([]byte, error) {
parts := strings.Split(token, ".")
if len(parts) != 3 {
return nil, errors.New("not a JWT")
}
return base64.URLEncoding.DecodeString(parts[1])
}
JWT decode returns *base64.CorruptInputError on a perfectly valid JWT. Why?
Bug: JWT segments use RawURLEncoding (no padding). The function uses URLEncoding (with padding). When the segment's bit count isn't a multiple of 6, URLEncoding expects = chars that aren't there, and decoding fails.
Fix:
Some JWT-handling code is more lenient — manually adds = padding to make the length a multiple of 4, then uses URLEncoding. That works too, but RawURLEncoding is cleaner.
Bug 12: XML with a typo'd tag¶
type Config struct {
Host string `xml:"host"`
Port int `xml:"prot"` // typo: should be "port"
}
src := []byte(`<Config><host>db</host><port>5432</port></Config>`)
var c Config
xml.Unmarshal(src, &c)
fmt.Printf("%+v\n", c)
What's the output?
Bug: The Go field tag says prot, but the XML says port. Unknown elements are silently dropped by encoding/xml. Output: {Host:db Port:0}. The typo is invisible.
Fix: there's no built-in strict mode for unknown elements (unlike JSON's DisallowUnknownFields). Options:
- Code review and tests that assert all XML tags decode correctly.
- Implement custom
UnmarshalXMLthat walks tokens explicitly and errors on unknown elements. - Use a third-party schema validator if you have an XSD.
Bug 13: Forgetting Flush on the CSV writer¶
func export(rows []Row, path string) error {
f, err := os.Create(path)
if err != nil { return err }
defer f.Close()
cw := csv.NewWriter(f)
cw.Write([]string{"id", "name"})
for _, r := range rows {
cw.Write([]string{strconv.Itoa(r.ID), r.Name})
}
return nil
}
File is empty after running. Why?
Bug: csv.Writer buffers internally. Without Flush, the buffered bytes are never written to f. f.Close (via defer) fires but doesn't flush the CSV writer.
Also, errors from Write are silently dropped — csv.Writer queues errors for Error() to surface. And the function returns nil even when writing fails.
Fix:
defer f.Close()
cw := csv.NewWriter(f)
cw.Write([]string{"id", "name"})
for _, r := range rows {
cw.Write([]string{strconv.Itoa(r.ID), r.Name})
}
cw.Flush()
if err := cw.Error(); err != nil {
return err
}
return f.Close()
The final return f.Close() matters too — write errors can surface late, when the OS flushes the underlying buffer.
Bug 14: Endianness¶
func writeUint16(w io.Writer, v uint16) error {
var buf [2]byte
binary.LittleEndian.PutUint16(buf[:], v)
_, err := w.Write(buf[:])
return err
}
func readUint16(r io.Reader) (uint16, error) {
var buf [2]byte
if _, err := io.ReadFull(r, buf[:]); err != nil {
return 0, err
}
return binary.BigEndian.Uint16(buf[:]), nil
}
Round-trip of 0x1234 returns 0x3412. Why?
Bug: Writer uses LittleEndian, reader uses BigEndian. Bytes get reversed.
Fix: pick one and stick with it. For new protocols going across the network, BigEndian is conventional. For interop with x86- native formats, LittleEndian.
This is the boring bug that catches everyone in code review at least once. Use a constant:
var byteOrder = binary.BigEndian
func writeUint16(w io.Writer, v uint16) error {
var buf [2]byte
byteOrder.PutUint16(buf[:], v)
_, err := w.Write(buf[:])
return err
}
Now changing it is a one-liner.
Bug 15: ASCII85 streaming¶
func encodeASCII85(src io.Reader, dst io.Writer) error {
enc := ascii85.NewEncoder(dst)
if _, err := io.Copy(enc, src); err != nil {
return err
}
return nil
}
Output is missing the trailing characters. Why?
Bug: Same family as base64 — ascii85.NewEncoder returns a WriteCloser that buffers up to 3 trailing source bytes until Close flushes them. Without Close, the last group is lost.
Fix:
enc := ascii85.NewEncoder(dst)
defer enc.Close()
if _, err := io.Copy(enc, src); err != nil {
return err
}
return enc.Close()
The pattern is identical to base64. Any encoder whose group size doesn't equal 1 needs a Close for the trailing group.
Bug 16: PEM trailing data¶
func parseSinglePEM(b []byte, expected string) (*pem.Block, error) {
block, _ := pem.Decode(b)
if block == nil {
return nil, errors.New("not PEM")
}
if block.Type != expected {
return nil, fmt.Errorf("got %q", block.Type)
}
return block, nil
}
The function is supposed to enforce "exactly one PEM block." A caller passes a file with two CERTIFICATE blocks; the function returns the first one without complaint. Why?
Bug: The rest return value from pem.Decode is discarded. The function doesn't check whether there's more PEM-shaped data after the first block.
Fix:
block, rest := pem.Decode(b)
if block == nil {
return nil, errors.New("not PEM")
}
if block.Type != expected {
return nil, fmt.Errorf("got %q, want %q", block.Type, expected)
}
if next, _ := pem.Decode(rest); next != nil {
return nil, errors.New("expected single PEM block, found more")
}
return block, nil
pem.Decode(rest) != nil means another block was found. If the trailing content is just whitespace or unrelated text, Decode returns nil and the check passes.
Bug 17: XML namespace mismatch¶
type Feed struct {
XMLName xml.Name `xml:"feed"`
Title string `xml:"title"`
}
src := []byte(`<feed xmlns="http://www.w3.org/2005/Atom"><title>Hi</title></feed>`)
var f Feed
xml.Unmarshal(src, &f)
fmt.Printf("%+v\n", f)
Title ends up empty. Why?
Bug: The XML declares the Atom namespace, so every element inside has the xml.Name{Space: "http://www.w3.org/2005/Atom", Local: "title"} tag. The struct field's tag is just "title", which matches xml.Name{Space: "", Local: "title"} — different namespace. The decoder sees the elements as not matching and leaves the field empty.
Fix:
type Feed struct {
XMLName xml.Name `xml:"http://www.w3.org/2005/Atom feed"`
Title string `xml:"http://www.w3.org/2005/Atom title"`
}
Or omit the namespace from struct fields and provide a default on the decoder:
dec := xml.NewDecoder(bytes.NewReader(src))
dec.DefaultSpace = "http://www.w3.org/2005/Atom"
dec.Decode(&f)
DefaultSpace says "if a field's tag has no namespace, treat it as this." Convenient for feeds in a single namespace.
Bug 18: Binary read into a slice-bearing struct¶
type Frame struct {
Magic uint32
Length uint32
Payload []byte
}
func read(r io.Reader, f *Frame) error {
return binary.Read(r, binary.BigEndian, f)
}
What happens?
Bug: binary.Read panics with "binary.Read: invalid type []uint8". Slices aren't fixed-size, so binary.Read rejects them.
Fix: read the fixed part with binary.Read, then read the variable part separately:
func read(r io.Reader, f *Frame) error {
if err := binary.Read(r, binary.BigEndian, &f.Magic); err != nil { return err }
if err := binary.Read(r, binary.BigEndian, &f.Length); err != nil { return err }
if f.Length > maxLength {
return fmt.Errorf("length %d > max %d", f.Length, maxLength)
}
f.Payload = make([]byte, f.Length)
_, err := io.ReadFull(r, f.Payload)
return err
}
Or skip binary.Read entirely and use the byte-slice helpers for the fixed header:
var hdr [8]byte
io.ReadFull(r, hdr[:])
f.Magic = binary.BigEndian.Uint32(hdr[0:4])
f.Length = binary.BigEndian.Uint32(hdr[4:8])
Bug 19: TextMarshaler on a value, UnmarshalText on a value¶
type Color int
func (c Color) MarshalText() ([]byte, error) { ... }
func (c Color) UnmarshalText(b []byte) error { *c = ...; return nil } // value receiver
Compile error?
Bug: *c = ... requires c to be addressable. A value receiver makes c a copy, and the assignment can't escape. The code doesn't compile.
Fix: pointer receiver for UnmarshalText:
Convention: Marshal* methods can be on value receivers (they read); Unmarshal* methods must be on pointer receivers (they write). The JSON, XML, and gob packages enforce this implicitly via the type of interface they look up.
Bug 20: Hex with whitespace¶
func decodeHex(s string) ([]byte, error) {
return hex.DecodeString(s)
}
decodeHex("de ad be ef") // error
Why?
Bug: hex.DecodeString doesn't tolerate whitespace — hex.InvalidByteError. The hex.NewDecoder(r) reader does.
Fix: strip whitespace first, or use the streaming decoder:
func decodeHex(s string) ([]byte, error) {
s = strings.Map(func(r rune) rune {
if unicode.IsSpace(r) { return -1 }
return r
}, s)
return hex.DecodeString(s)
}
Or:
The streaming decoder is whitespace-tolerant, the string decoder is not. This asymmetry catches everyone exactly once.