cgo Basics — Senior

1. Cgo's three costs

Senior cgo work is mostly negotiating three costs:

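Per-call overhead (on the order of 100 ns, varying with Go version and hardware): bad for hot loops.
Scheduler and GC interaction: a long blocking C call ties up an OS thread, and the garbage collector cannot see into C memory.
Build complexity: cross-compilation, static linking, supply chain.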

If your design pays each cost only where it adds value, cgo is a fine tool. If you sprinkle it across the codebase, build times and correctness debts compound.

2. Batching at the cgo boundary

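// One cgo call per element: 1000 elements cost roughly 100 µs in call overhead alone.
for _, x := range data {
    C.process_one(C.int(x))
}

// One cgo call for the whole slice; the loop runs in C.
// Assumes data is a non-empty []C.int (or []int32): a Go int is 64 bits on
// most platforms and must not be reinterpreted as a C int, and &data[0]
// panics on an empty slice.
C.process_batch((*C.int)(unsafe.Pointer(&data[0])), C.size_t(len(data)))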

The C function does the loop. The Go side hands over the buffer and waits. For workloads that already have a "process N items" API in the C library, this is the only reasonable shape.

3. Long-running C calls and `M`-thread starvation

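A goroutine blocked in a long C call holds an OS thread (M) for the duration of the call. With many such goroutines, the runtime starts extra threads to keep Go work flowing, but there is a limit: 10,000 threads by default, adjustable with runtime/debug.SetMaxThreads. The limit is independent of GOMAXPROCS, which caps only the threads executing Go code at the same time.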

Symptom: thread count and memory climb with the number of concurrent cgo calls and latency degrades; if the thread limit is reached, the runtime crashes the program.

Mitigations:

Bounded worker pool for cgo work, sized to expected concurrency.
Async C APIs where available (use a callback or completion queue).
Move work to a separate process for truly long-running C tasks.

4. `runtime.LockOSThread`, deeply

The "lock to OS thread" semantics:

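Until UnlockOSThread has been called as many times as LockOSThread (or the goroutine exits), this goroutine runs on one specific OS thread.
That OS thread runs no other goroutine while the lock is held. If the goroutine exits without unlocking, the runtime terminates the thread.
Goroutines started from this one do not inherit the lock; they run on any thread.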

When this is necessary:

OpenGL contexts (the context is bound to a thread; OpenGL calls must run there).
JNI (Java's JNIEnv* is per-thread).
Some signal-handling setups.
Libraries that keep state in thread-local storage (errno-style), when you make several related calls and the goroutine could be rescheduled onto a different thread between them.

For each, lock-on-entry / unlock-on-exit is the canonical pattern.
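When thread-bound state must outlive a single call (a GL context, for instance), one common arrangement is a dedicated goroutine that locks its thread once and executes submitted work. The sketch below is illustrative only; `ThreadWorker` is a hypothetical name, and the submitted functions stand in for calls into a thread-affine C library.

```go
package main

import (
	"fmt"
	"runtime"
)

// ThreadWorker runs every submitted function on the same locked OS thread.
type ThreadWorker struct{ jobs chan func() }

func NewThreadWorker() *ThreadWorker {
	w := &ThreadWorker{jobs: make(chan func())}
	go func() {
		runtime.LockOSThread() // lock on entry
		defer runtime.UnlockOSThread() // unlock on exit
		for job := range w.jobs {
			job()
		}
	}()
	return w
}

// Do runs fn on the worker's thread and waits for it to finish.
func (w *ThreadWorker) Do(fn func()) {
	done := make(chan struct{})
	w.jobs <- func() {
		defer close(done)
		fn()
	}
	<-done
}

// Close stops the worker; its goroutine unlocks the thread and exits.
func (w *ThreadWorker) Close() { close(w.jobs) }

func main() {
	w := NewThreadWorker()
	defer w.Close()

	total := 0
	for i := 1; i <= 3; i++ {
		i := i
		w.Do(func() { total += i }) // in real code: thread-affine C calls
	}
	fmt.Println(total) // prints 6
}
```

Because `Do` blocks until the job has finished, callers can read results written by the job without further synchronization.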

5. The pointer-passing rules, properly

Rule (paraphrased from the cmd/cgo docs):

Go code may pass a Go pointer to C provided the Go memory to which it points does not contain any Go pointers. C code must not store a Go pointer in Go memory, even temporarily. C code must not keep a copy of a Go pointer after the call returns.

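Why: the heap collector is non-moving in current Go, but goroutine stacks are copied when they grow, and the runtime reserves the right to move heap objects in the future. The GC also cannot see pointers held in C memory, so a Go pointer that C keeps can end up pointing at freed or relocated memory. Since Go 1.21, runtime.Pinner can pin a Go object so that C may hold its address until it is unpinned.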

Defenses:

For complex data, allocate via C.malloc, fill from Go, pass the C pointer, free when done.
For simple scalars, just pass the value (no pointers involved).
For slices of structs containing pointers, encode them into a byte buffer the C side parses.

The runtime checks pointer rules dynamically at the cgo boundary (you can disable with GODEBUG=cgocheck=0, but don't).

6. Cgo and the race detector

go build -race instruments Go memory accesses. C code is not instrumented. Race conditions that span C and Go are invisible to the detector.

In practice:

Synchronize cgo state carefully — Go mutexes work for Go callers; C code must do its own synchronization.
A C library that's not thread-safe needs serialization on the Go side (one goroutine at a time, or LockOSThread + single thread).

7. C++ in cgo

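Put C++ flags in a #cgo CXXFLAGS directive (or the CGO_CXXFLAGS environment variable) and wrap any C++ function you want to expose in extern "C":

// helper.cpp (same directory as the Go file, so go build compiles it)
extern "C" int multiply(int a, int b) { return a * b; }

// main.go
package main

// #cgo CXXFLAGS: -std=c++17
// #cgo LDFLAGS: -lstdc++
// int multiply(int a, int b);
import "C"

import "fmt"

func main() {
    n := C.multiply(3, 4)
    fmt.Println(n) // 12
}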

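C++ name mangling is the main hurdle; extern "C" removes it. Don't try to expose C++ classes directly to Go: wrap them in extern "C" functions that take an opaque pointer, and catch exceptions before they reach the boundary.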

8. Memory ownership patterns

Pattern	Owner and lifetime
`C.CString` result passed to a C call	Allocated with C `malloc`; the Go caller frees it (`defer C.free(unsafe.Pointer(p))`)
C function returns `char*`	Owner depends on the API; copy with `C.GoString` before the C side frees or reuses it
`C.malloc` from Go → fill → pass to C	Go caller frees with `C.free` unless the API takes ownership
Long-lived C struct accessed from Go	C owns; treat the Go-side handle as opaque
Go slice passed to C for a single call	Go owns; valid for call duration only

The error class to avoid: Go pointers stashed in C memory or kept alive past the call.

9. Errors across the boundary

C functions usually communicate errors via:

A negative return value plus an error code in errno (or from GetLastError on Windows).
A nullable output struct.
A string buffer the caller passes in.

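// libfoo_do_thing is assumed to write a NUL-terminated message into buf on failure.
buf := make([]byte, 256)
ret := C.libfoo_do_thing(arg, (*C.char)(unsafe.Pointer(&buf[0])), C.size_t(len(buf)))
if ret < 0 {
    // C.GoString copies bytes up to the first NUL.
    msg := C.GoString((*C.char)(unsafe.Pointer(&buf[0])))
    return fmt.Errorf("libfoo: %s", msg)
}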

Errors should be translated into Go errors at the boundary; the rest of the program shouldn't know about C return codes.

10. Building cgo as a static binary

go build -ldflags='-linkmode=external -extldflags="-static"' ./...

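Or build against musl (CC=musl-gcc with CGO_ENABLED=1) for a fully static binary. Statically linking glibc is possible but fragile, because name resolution (NSS) still loads shared objects at run time. Painful to set up; the result is portable across Linux distros.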

For most production deployments, prefer CGO_ENABLED=0 and a pure-Go binary unless the C library is essential.

11. Cgo overhead in microbenchmarks

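// noop.go: import "C" is not allowed in _test.go files, so wrap the call here.
// (The preamble is assumed to define: static void cheap_noop(void) {})
func cheapNoop() { C.cheap_noop() }

// noop_test.go
func BenchmarkCgoCall(b *testing.B) {
    for i := 0; i < b.N; i++ {
        cheapNoop()
    }
}

// Result: on the order of 100 ns/op; the exact figure depends on Go version and hardware.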

For comparison:

Pure Go function call (not inlined): ~1 ns.
Interface dispatch: ~2 ns.
Map lookup: ~30 ns.
Uncontended mutex lock/unlock: ~20 ns.

100 ns is 100× a Go call. Use that ratio when deciding whether cgo is worth it for a particular function.

12. Cgo and PGO

Profile-guided optimization mostly benefits Go-to-Go calls. Cgo calls remain opaque to the Go compiler — PGO doesn't reach inside C. If your hot path is dominated by C work, PGO gains will be modest.

13. Cgo and modules

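The C preamble's #include paths are resolved by the C preprocessor at build time. Headers must be on the compiler's default include path or in a directory passed with -I (typically through a #cgo CFLAGS directive). Cgo does not manage C dependencies, and go mod doesn't know about them; note that go mod vendor copies only directories that contain Go packages, so C sources kept in a subdirectory without Go files are not vendored.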

For reproducible builds, vendor the C source and provide it in the module:

mypkg/
  go.mod
  api.go
  libfoo/         # vendored C source
    foo.h
    foo.c

// #cgo CFLAGS: -I${SRCDIR}/libfoo
// #cgo LDFLAGS: -L${SRCDIR}/libfoo -lfoo
import "C"

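${SRCDIR} is replaced at build time with the absolute path of the directory containing the source file. Note that go build compiles only the C files in the package directory itself, so with this layout libfoo.a has to be built inside libfoo/ beforehand (for example by a Makefile or go generate step) for -lfoo to resolve.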

14. When to invest vs. pivot

Invest in cgo when:

You depend on a mature C library with no Go equivalent (image codecs, ML runtimes, OS bindings).
The C call boundary is wide (one call processes many items).
Build complexity is acceptable for your deployment.

Pivot away from cgo when:

The pure-Go alternative is within an acceptable performance margin.
Cross-compilation and static binaries matter.
The C library is a maintenance burden (supply chain, security updates).

Maintained Go alternatives have grown for many ecosystems: pure-Go SQLite implementations (such as modernc.org/sqlite), golang.org/x/crypto, etc.

15. Summary

Senior cgo work is about boundaries: making the call boundary efficient (batched), respecting pointer rules, locking OS threads when required, and isolating cgo behavior to a small, well-tested package. Build complexity and platform portability are real costs to weigh against the convenience of using a C library directly.