Strings in Go: Find the Bug

Bug 1: Wrong character count 🟢
package main

import "fmt"

func main() {
	s := "Hello, 世界"
	fmt.Printf("The string has %d characters\n", len(s))
}

What is the expected output? "The string has 9 characters"
What does it actually print? "The string has 13 characters"
Explanation and Fix
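One way to get the count the developer expected is sketched below. It assumes "characters" means Unicode code points (runes); the helper name `charCount` is hypothetical and not part of the original listing.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// charCount is a hypothetical helper: it returns the number of runes
// (Unicode code points) in s rather than the number of bytes.
func charCount(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	s := "Hello, 世界"
	// len(s) is the byte length; charCount(s) is the rune count.
	fmt.Printf("bytes: %d, runes: %d\n", len(s), charCount(s)) // bytes: 13, runes: 9
}
```

Note that a rune count can still differ from what a user perceives as one character when combining marks are involved.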
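**Bug:** `len(s)` counts bytes, not characters. "世界" is 6 bytes in UTF-8 (3 per character), not 2, so the 9-character string has a length of 13.

**Fix:** Count runes with `utf8.RuneCountInString(s)` from the `unicode/utf8` package.

**Lesson:** Use `utf8.RuneCountInString` or a `range` loop when you need a character (rune) count; `len` gives the byte length.

Bug 2: Garbled character access 🟢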
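The listing for this bug is not present in the text. The following is a reconstruction, not the original code: it assumes the same string as Bug 1 and a `%c` print of `s[7]`, which is consistent with the byte value 228 mentioned in the expected/actual description.

```go
package main

import "fmt"

func main() {
	s := "Hello, 世界" // assumption: same string as in Bug 1
	// Intended to print 世, but s[7] is a single byte (228),
	// the first of the three bytes that encode 世.
	fmt.Printf("%c\n", s[7])
}
```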
Expected: 世
Actual: A garbled byte character (228 = first byte of "世")
Explanation and Fix
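Both ways of reading a rune by position can be sketched as follows. The string is assumed to be the one from Bug 1, and `runeAt` is a hypothetical helper name introduced only for illustration.

```go
package main

import "fmt"

// runeAt is a hypothetical helper: it returns the n-th rune (0-based) of s
// by ranging over the string, which decodes UTF-8 one rune at a time.
// The second result is false if s has fewer than n+1 runes.
func runeAt(s string, n int) (rune, bool) {
	count := 0
	for _, r := range s {
		if count == n {
			return r, true
		}
		count++
	}
	return 0, false
}

func main() {
	s := "Hello, 世界"

	// Option 1: convert to []rune and index by rune position (copies the data).
	runes := []rune(s)
	fmt.Println(string(runes[7])) // 世

	// Option 2: range over the string and count runes (no copy).
	if r, ok := runeAt(s, 7); ok {
		fmt.Println(string(r)) // 世
	}
}
```

The `[]rune` conversion is simpler when many positions are needed; the `range` loop avoids the extra allocation for a single lookup.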
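**Bug:** `s[7]` returns the byte at index 7, not a rune. "世" starts at byte 7 but spans 3 bytes, so a single byte is only the first third of its encoding.

**Fix:** Convert the string to `[]rune` and index the rune slice, or iterate with `range` and count runes.

Bug 3: Attempted string mutation 🟢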
package main

import "fmt"

func capitalize(s string) string {
	s[0] = s[0] - 32 // make first letter uppercase
	return s
}

func main() {
	fmt.Println(capitalize("hello"))
}

Expected: "Hello"
Actual: Compile error: cannot assign to s[0] (value of type byte). The exact wording of the message varies by Go version.
Explanation and Fix
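Two possible fixes are sketched below: a byte-level version that only handles ASCII, and a rune-aware version. The name `capitalizeASCII` is hypothetical; `capitalize` keeps the name from the listing.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// capitalizeASCII copies s into a mutable byte slice and uppercases the
// first byte only if it is an ASCII lowercase letter.
func capitalizeASCII(s string) string {
	if s == "" {
		return s
	}
	b := []byte(s)
	if b[0] >= 'a' && b[0] <= 'z' {
		b[0] -= 'a' - 'A'
	}
	return string(b)
}

// capitalize decodes the first rune, so it also handles non-ASCII letters.
func capitalize(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	if r == utf8.RuneError {
		// Empty input, invalid leading bytes, or a literal U+FFFD:
		// return the string unchanged.
		return s
	}
	return string(unicode.ToUpper(r)) + s[size:]
}

func main() {
	fmt.Println(capitalizeASCII("hello")) // Hello
	fmt.Println(capitalize("élan"))       // Élan
}
```

Both versions build a new string; the original `s` is never modified.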
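**Bug:** Strings are immutable in Go, so you cannot assign to `s[i]`.

**Fix:** Copy the string into a `[]byte`, modify the copy, and convert it back. More robustly, decode the first rune and apply `unicode.ToUpper`, which also covers empty input and non-ASCII letters; subtracting 32 is only correct for ASCII lowercase letters.

Bug 4: Quadratic string building 🟡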
package main

import "fmt"

func buildNumbers(n int) string {
	result := ""
	for i := 0; i < n; i++ {
		result += fmt.Sprintf("%d,", i)
	}
	return result
}

func main() {
	fmt.Println(buildNumbers(5)) // "0,1,2,3,4,"
}

Behavior: The output is correct, but the function is extremely slow for large n.
Problem: O(n^2) bytes are copied in total, because each of the n concatenations reallocates the whole string.
Explanation and Fix
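A linear-time version can be sketched with `strings.Builder`. It uses `strconv.Itoa` instead of `fmt.Sprintf`, which is a substitution made here to avoid the formatting overhead, not something the original listing does.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// buildNumbers appends each number to a single growable buffer, so the
// total work is proportional to the length of the result.
func buildNumbers(n int) string {
	var sb strings.Builder
	for i := 0; i < n; i++ {
		sb.WriteString(strconv.Itoa(i))
		sb.WriteByte(',')
	}
	return sb.String()
}

func main() {
	fmt.Println(buildNumbers(5)) // 0,1,2,3,4,
}
```

If the final size can be estimated up front, calling `sb.Grow` before the loop may save a few reallocations.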
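**Bug:** Each `+=` creates a new string and copies all previous content. For n=10000, the intermediate strings add up to roughly 240 MB of temporary allocations, even though the final result is under 50 KB.

**Fix:** Append to a `strings.Builder`, which writes into a growable buffer at amortized constant cost per write.

Bug 5: strings.Split empty case 🟡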
package main

import (
	"fmt"
	"strings"
)

func firstField(s string) string {
	parts := strings.Split(s, ",")
	return parts[0]
}

func main() {
	fmt.Println(firstField("a,b,c")) // a
	fmt.Println(firstField(""))      // ??? panic?
}

Expected: firstField("") returns ""
Actual: Returns "" (no panic here, but the code has a conceptual issue)
Explanation and Fix
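Two defensive patterns for field access are sketched below. The helper names `splitPair` and `secondField` are hypothetical, and `strings.Cut` requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPair splits s around the first comma. ok is false when s contains
// no comma, in which case first is s and rest is empty.
func splitPair(s string) (first, rest string, ok bool) {
	return strings.Cut(s, ",")
}

// secondField returns the second comma-separated field, checking the
// slice length before indexing so that it never panics.
func secondField(s string) (string, bool) {
	parts := strings.Split(s, ",")
	if len(parts) < 2 {
		return "", false
	}
	return parts[1], true
}

func main() {
	first, rest, ok := splitPair("a,b,c")
	fmt.Printf("%q %q %v\n", first, rest, ok) // "a" "b,c" true

	field, ok := secondField("")
	fmt.Printf("%q %v\n", field, ok) // "" false
}
```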
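**Issue:** `strings.Split("", ",")` returns `[""]`, a slice with one empty string, so `parts[0]` is `""`. This is correct behavior, but it often surprises developers who expect an empty slice. The pattern that does fail is indexing past the first element (for example `parts[1]`) without checking `len(parts)`, which panics when the separator is absent.

**Fix:** Use `strings.Cut` when splitting into exactly two parts, or check the length of the slice before indexing.

Bug 6: Copying strings.Builder 🟡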
package main

import (
	"fmt"
	"strings"
)

func buildString() string {
	var sb strings.Builder
	sb.WriteString("Hello")
	sb2 := sb // copy the builder
	sb2.WriteString(", World!")
	return sb2.String()
}

func main() {
	fmt.Println(buildString())
}

Expected: "Hello, World!"
Actual: panic: strings: illegal use of non-zero Builder copied by value
Explanation and Fix
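The two safe alternatives to copying a used builder are sketched below: sharing it by pointer, or starting a second builder seeded with the first one's content. The helpers `appendWorld` and `forkString` are hypothetical names added for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// appendWorld receives the builder by pointer, so it writes to the
// caller's builder instead of a copy.
func appendWorld(sb *strings.Builder) {
	sb.WriteString(", World!")
}

// buildString keeps using a single builder throughout.
func buildString() string {
	var sb strings.Builder
	sb.WriteString("Hello")
	appendWorld(&sb)
	return sb.String()
}

// forkString shows how to branch: the fork is a fresh builder that
// starts with the current content of the first one.
func forkString() (original, fork string) {
	var sb strings.Builder
	sb.WriteString("Hello")

	var sb2 strings.Builder
	sb2.WriteString(sb.String())
	sb2.WriteString(", World!")

	sb.WriteString(", Go!")
	return sb.String(), sb2.String()
}

func main() {
	fmt.Println(buildString()) // Hello, World!
	fmt.Println(forkString())  // Hello, Go! Hello, World!
}
```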
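**Bug:** A `strings.Builder` must not be copied after first use. On the first write the builder stores a pointer to itself (`addr *Builder`); a copy carries the original's address, so the next write on the copy detects the mismatch and panics.

**Fix:** Keep writing to the original builder, or pass it around as a `*strings.Builder`. If you need to fork, start a new builder and write the first builder's `String()` into it.

Bug 7: Substring memory leak 🔴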
package main

import (
	"fmt"
	"strings"
)

func extractName(doc string) string {
	idx := strings.Index(doc, "name:")
	if idx < 0 {
		return ""
	}
	start := idx + 5
	end := strings.Index(doc[start:], "\n")
	if end < 0 {
		return doc[start:]
	}
	return doc[start : start+end]
}

func parseConfig(filename string) string {
	// Imagine reading a 50MB config file (this stand-in string is about 5.5MB)
	doc := strings.Repeat("padding...\n", 500000) + "name:Alice\n"
	return extractName(doc)
}

func main() {
	name := parseConfig("config.txt")
	fmt.Println(name) // Alice
	// But the whole document is still in memory!
}
Problem: The returned substring keeps the entire 50MB document alive.
Explanation and Fix
**Bug:** `doc[start : start+end]` is a substring that shares the backing array of the 50MB `doc` string. As long as the returned name is reachable, the 50MB doc cannot be garbage collected.

**Fix:** Use `strings.Clone` to create an independent copy. This allocates a small independent string for "Alice" and allows the 50MB doc to be collected.

**Added in:** Go 1.18 (`strings.Clone(s string) string`)

Bug 8: Invalid UTF-8 causing RuneError 🔴
package main

import "fmt"

func reverseString(s string) string {
	runes := []rune(s)
	for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
		runes[i], runes[j] = runes[j], runes[i]
	}
	return string(runes)
}

func main() {
	// This arrives from an external source with invalid UTF-8
	bad := "hello\xffworld"
	fmt.Println(reverseString(bad))
}
Problem: \xff is invalid UTF-8. Converting to []rune replaces it with utf8.RuneError (U+FFFD), silently corrupting the data.
Explanation and Fix
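One way to guard the reversal is to reject invalid input up front, as sketched below. Returning an error is a design choice assumed here, and `errInvalidUTF8` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// errInvalidUTF8 is a hypothetical sentinel error for this sketch.
var errInvalidUTF8 = errors.New("input is not valid UTF-8")

// reverseString reverses s rune by rune, refusing input that is not
// valid UTF-8 instead of silently substituting U+FFFD.
func reverseString(s string) (string, error) {
	if !utf8.ValidString(s) {
		return "", errInvalidUTF8
	}
	runes := []rune(s)
	for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
		runes[i], runes[j] = runes[j], runes[i]
	}
	return string(runes), nil
}

func main() {
	if _, err := reverseString("hello\xffworld"); err != nil {
		fmt.Println("rejected:", err)
	}
	out, _ := reverseString("Hello, 世界")
	fmt.Println(out) // 界世 ,olleH
}
```

If replacing or dropping bad bytes is acceptable for the application, `strings.ToValidUTF8` is an alternative to rejecting the input.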
**Bug:** When you convert invalid UTF-8 bytes to `[]rune`, each invalid byte becomes `utf8.RuneError` (U+FFFD). Converting back to a string then encodes RuneError as a valid 3-byte UTF-8 sequence (`EF BF BD`), changing the byte content.

**Fix:** Validate the input with `utf8.ValidString` before processing.

Bug 9: Case-insensitive comparison done wrong 🟡
package main

import (
	"fmt"
	"strings"
)

func main() {
	a := "Straße" // German word for "street"
	b := "STRASSE"
	// Attempt case-insensitive comparison
	if strings.ToLower(a) == strings.ToLower(b) {
		fmt.Println("equal")
	} else {
		fmt.Println("not equal")
	}
}

Expected: "equal" (these are the same word in different cases in German)
Actual: "not equal", because strings.ToLower("STRASSE") = "strasse", but strings.ToLower("Straße") = "straße"
Explanation and Fix
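**Bug:** `strings.ToLower` applies simple, one-to-one Unicode lowercasing. In German, "ß" uppercases to "SS" (two characters), so "Straße" and "STRASSE" are case-equivalent, but `ToLower` leaves "ß" unchanged and never produces "ss". `strings.EqualFold` also uses simple case folding, so it reports these two strings as different as well.

**Fix for simple (one-to-one) case differences:** Use `strings.EqualFold(a, b)`, which also avoids allocating lowercased copies.

**Fix for full Unicode folding:** Compare the strings after full case folding; one option is the `cases.Fold()` caser from `golang.org/x/text/cases`.

Bug 10: Off-by-one in string slicing 🟡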
package main

import (
	"fmt"
	"strings"
)

// extractBetween returns the content between start and end markers
func extractBetween(s, start, end string) string {
	i := strings.Index(s, start)
	if i < 0 {
		return ""
	}
	j := strings.Index(s[i:], end)
	if j < 0 {
		return ""
	}
	return s[i:j]
}

func main() {
	s := "Hello [World] !"
	fmt.Println(extractBetween(s, "[", "]")) // Expected: World
}

Expected: "World"
Actual: An empty string for this input, because both `i` and `j` are 6 and `s[6:6]` is empty. On inputs where the relative index `j` is smaller than `i`, the slice expression panics instead.
Explanation and Fix
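A corrected `extractBetween` can be sketched as follows. It sidesteps the absolute-versus-relative index problem by slicing off the prefix first, which is one of several equivalent ways to write the fix.

```go
package main

import (
	"fmt"
	"strings"
)

// extractBetween returns the content between the first start marker and
// the first end marker that follows it, or "" if either is missing.
func extractBetween(s, start, end string) string {
	i := strings.Index(s, start)
	if i < 0 {
		return ""
	}
	// rest begins right after the start marker.
	rest := s[i+len(start):]
	j := strings.Index(rest, end)
	if j < 0 {
		return ""
	}
	// j is relative to rest, so slice rest, not s.
	return rest[:j]
}

func main() {
	fmt.Println(extractBetween("Hello [World] !", "[", "]")) // World
}
```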
**Bug 1:** The start index `i` points at the `[` character itself. To get the content after `[`, start from `i + len(start)`.

**Bug 2:** `strings.Index(s[i:], end)` searches from `[` onwards and returns an index `j` relative to `s[i:]`, not to `s`. The absolute position is `i + j`, so `s[i:j]` mixes an absolute index with a relative one.

**Fix:** Slice off everything up to and including the start marker, search for the end marker in the remainder, and slice the remainder up to the match.

Bug 11: strings.Replace unexpected behavior 🟢
package main

import (
	"fmt"
	"strings"
)

func main() {
	s := "aababab"
	result := strings.Replace(s, "ab", "X", 0)
	fmt.Println(result) // Expected: "aXXX"? or "aababab"?
}

Expected by developer: Replace all "ab" occurrences
Actual: Returns "aababab" unchanged
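The effect of the last argument to `strings.Replace` can be seen in the short sketch below; `replaceAB` is a hypothetical wrapper added so that the different counts are easy to compare.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceAB replaces up to n occurrences of "ab" with "X".
// n == 0 replaces nothing; n < 0 replaces every occurrence.
func replaceAB(s string, n int) string {
	return strings.Replace(s, "ab", "X", n)
}

func main() {
	s := "aababab"
	fmt.Println(replaceAB(s, 0))  // aababab
	fmt.Println(replaceAB(s, 1))  // aXabab
	fmt.Println(replaceAB(s, -1)) // aXXX

	// ReplaceAll is equivalent to Replace with n = -1.
	fmt.Println(strings.ReplaceAll(s, "ab", "X")) // aXXX
}
```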