Parser & AST: Professional
0. The landscape of real tools
Before writing your own, know which production tools already do this and which techniques they use, because you will borrow them:
| Tool | Technique |
|---|---|
| gofmt / gofumpt | parse → go/printer round-trip (canonical formatting) |
| goimports | AST + import resolution to add/remove imports |
| gopls rename | go/types object identity across the whole package graph |
| golangci-lint | runs many analysis.Analyzers over a shared parse/type pass |
| go fix / gofmt -r | rule-based AST rewrites |
| eg (x/tools) | example-based template rewrites |
| dst-based codemods | comment-preserving structural edits |
The common thread: the serious ones resolve types, not just syntax, and they emit minimal diffs. Your tool should do the same.
1. The job: codemods at scale
At a professional level you stop writing toy linters and start writing codemods — programs that mechanically rewrite hundreds or thousands of files: renaming an API, migrating a deprecated call, injecting context parameters, restructuring imports across a monorepo. The AST is the substrate. The hard part is almost never "find the node." It is emitting code a human will accept in review — formatting preserved, comments intact, diffs minimal — and being correct across aliased imports, shadowed names, generics, and build-tagged files.
The mental shift from earlier tiers: a junior asks "how do I match this node?"; a professional asks "how do I match it without false positives, rewrite it without losing comments, and ship it without breaking the build?" Almost everything below is about those three "withouts."
The standard toolchain:
| Tool | Role |
|---|---|
| go/parser + go/ast | parse and match |
| go/types + go/packages | resolve identifiers to real definitions (kill false positives) |
| astutil.Apply | structural rewrite with a parent-aware cursor |
| go/format / go/printer | re-emit |
| golang.org/x/tools/go/analysis | the framework go vet and most linters use |
| dst (decorated syntax tree) | when comment/formatting fidelity is critical |
2. Load packages properly, not files
A real refactoring tool must resolve types. go/packages loads a package with full type information (by default it drives go list under the hood):
```go
import "golang.org/x/tools/go/packages"

cfg := &packages.Config{
	Mode: packages.NeedSyntax | packages.NeedTypes |
		packages.NeedTypesInfo | packages.NeedName | packages.NeedFiles,
}
pkgs, err := packages.Load(cfg, "./...")
if err != nil {
	log.Fatal(err) // the load itself failed (bad pattern, go list error)
}
if packages.PrintErrors(pkgs) > 0 {
	log.Fatal("packages contain parse or type errors")
}
```
Now for each pkg you have pkg.Syntax []*ast.File and pkg.TypesInfo, which maps *ast.Ident to the types.Object it refers to. This is how you distinguish "the real fmt.Println" from "a method named Println on some local type" or "fmt shadowed by a variable." Skipping this step is the single biggest source of broken codemods.
```go
// Is this selector REALLY the package-level net/http.Get, not some other .Get?
if sel, ok := call.Fun.(*ast.SelectorExpr); ok {
	if fn, ok := pkg.TypesInfo.Uses[sel.Sel].(*types.Func); ok &&
		fn.Pkg() != nil && fn.Pkg().Path() == "net/http" && fn.Name() == "Get" &&
		fn.Type().(*types.Signature).Recv() == nil { // no receiver: excludes methods such as http.Header.Get
		// safe to rewrite
	}
}
```
The TypesInfo maps are the workhorses:
| Map | Value | Use |
|---|---|---|
| Defs[*ast.Ident] | the object defined by this ident | find declarations |
| Uses[*ast.Ident] | the object this ident refers to | resolve references |
| Types[ast.Expr] | the type & value of an expression | "is this an io.Reader?" |
| Selections[*ast.SelectorExpr] | field/method selection details | distinguish field vs method |
With these you stop guessing from names and start asking the type system. This is the single dividing line between toy and production tooling.
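To make the difference between names and objects concrete, here is a minimal sketch that uses only the standard library (go/types directly, without go/packages, because the sample source has no imports). The source string, the classifyCalls helper, and the three labels are illustrative assumptions. The point is that three calls spelled Println resolve to three different kinds of object through Uses.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// src declares three different things that are all called as "Println".
const src = `package p

func Println(s string) {}

type T struct{}

func (T) Println(s string) {}

func f() {
	Println("a")
	T{}.Println("b")
	Println := func(s string) {}
	Println("c")
}
`

// classifyCalls type-checks src and reports, for each call in source order,
// what kind of object the callee identifier resolves to.
func classifyCalls(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return nil, err
	}
	info := &types.Info{Uses: make(map[*ast.Ident]types.Object)}
	// src has no imports, so a Config without an Importer is enough here.
	if _, err := (&types.Config{}).Check("p", fset, []*ast.File{file}, info); err != nil {
		return nil, err
	}
	var kinds []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		var id *ast.Ident
		switch fun := call.Fun.(type) {
		case *ast.Ident:
			id = fun
		case *ast.SelectorExpr:
			id = fun.Sel
		default:
			return true
		}
		switch obj := info.Uses[id].(type) {
		case *types.Func:
			if obj.Type().(*types.Signature).Recv() != nil {
				kinds = append(kinds, "method")
			} else {
				kinds = append(kinds, "func")
			}
		case *types.Var:
			kinds = append(kinds, "var")
		default:
			kinds = append(kinds, "other")
		}
		return true
	})
	return kinds, nil
}

func main() {
	kinds, err := classifyCalls(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(kinds)
}
```

A name-based matcher would treat all three calls as the same target; the object kind (and, for a *types.Func, its receiver) is what separates them.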
3. Rewriting with astutil.Apply
astutil.Apply gives you a *astutil.Cursor that knows the node's parent and slot, so you can replace, delete, or insert safely:
```go
astutil.Apply(file, func(c *astutil.Cursor) bool {
	call, ok := c.Node().(*ast.CallExpr)
	if !ok {
		return true
	}
	if isTargetCall(call, info) {
		c.Replace(buildReplacement(call))
	}
	return true
}, nil)
```
Returning false from the pre-func prunes the subtree. The post-func (the third argument to Apply) runs after a node's children have been visited, which is useful when a rewrite depends on already-transformed children. Never mutate the slice you're ranging over in ast.Inspect; use Apply's cursor instead.
The cursor's full vocabulary is what makes structural edits possible:
| Method | Effect |
|---|---|
| c.Node() | the current node |
| c.Parent() | its parent node |
| c.Name(), c.Index() | which field/slot it occupies |
| c.Replace(n) | swap this node for n |
| c.Delete() | remove from a slice-valued slot |
| c.InsertBefore(n) / c.InsertAfter(n) | insert siblings (slice slots only) |
Delete and Insert* only work where the node lives in a slice (e.g. BlockStmt.List, GenDecl.Specs) — calling them on a scalar field panics. That constraint is exactly why naive ast.Inspect deletion is unsafe: Inspect has no parent/slot concept at all.
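The slice-slot idea can be seen without the cursor. The sketch below is a standard-library-only illustration, not how astutil is implemented: it deletes statements by editing the parent block, which owns the slice, before the walk descends into it. The removeCalls and isCallTo helpers are hypothetical, matching is by name only to keep the sketch short, and statement lists inside case clauses are not handled.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

// isCallTo reports whether stmt is an expression statement of the form name(...).
// It matches by name only; a real codemod would use a go/types check instead.
func isCallTo(stmt ast.Stmt, name string) bool {
	expr, ok := stmt.(*ast.ExprStmt)
	if !ok {
		return false
	}
	call, ok := expr.X.(*ast.CallExpr)
	if !ok {
		return false
	}
	id, ok := call.Fun.(*ast.Ident)
	return ok && id.Name == name
}

// removeCalls parses src, drops every name(...) statement found directly in a
// block, and returns the re-printed source plus the number of removals.
func removeCalls(src, name string) (string, int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, parser.ParseComments)
	if err != nil {
		return "", 0, err
	}
	removed := 0
	ast.Inspect(file, func(n ast.Node) bool {
		block, ok := n.(*ast.BlockStmt)
		if !ok {
			return true
		}
		// Build a fresh list on the parent; Inspect walks it only after we return.
		kept := make([]ast.Stmt, 0, len(block.List))
		for _, stmt := range block.List {
			if isCallTo(stmt, name) {
				removed++
				continue
			}
			kept = append(kept, stmt)
		}
		block.List = kept
		return true
	})
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", 0, err
	}
	return buf.String(), removed, nil
}

func main() {
	src := "package p\n\nfunc f() {\n\tdebug(\"start\")\n\twork()\n\tdebug(\"end\")\n}\n"
	out, n, err := removeCalls(src, "debug")
	if err != nil {
		panic(err)
	}
	fmt.Printf("removed %d statements\n%s", n, out)
}
```

The cursor saves you from writing this parent-side bookkeeping for every container type, which is the practical argument for astutil.Apply.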
3.5 Matching with the inspector for speed
When you scan many files for a few node kinds, the golang.org/x/tools/go/ast/inspector package builds an index once and filters by type without re-walking:
```go
import "golang.org/x/tools/go/ast/inspector"

insp := inspector.New(pkg.Syntax)
insp.WithStack([]ast.Node{(*ast.CallExpr)(nil)},
	func(n ast.Node, push bool, stack []ast.Node) bool {
		if !push {
			return true // second visit, after the children: nothing to do
		}
		call := n.(*ast.CallExpr)
		_ = call
		// stack is the path from the *ast.File down to n; its last element is
		// n itself, so stack[len(stack)-2] is the parent.
		_ = stack
		return true
	})
```
WithStack even gives you the ancestor stack, so you get parent context for read-only analysis. The analysis framework's inspect.Analyzer is exactly this, shared across all analyzers in a run — which is why a hundred linters can share one traversal.
4. The fragility of AST rewrites
The brutal truth: go/ast was designed for reading, not for surgical editing. Several footguns recur.
4.1 Printing re-formats everything
go/printer ignores your original whitespace and applies gofmt rules. If your codebase is already gofmt-clean, the only lines that change are the ones you touched — good. But if you re-print a file that had non-canonical formatting, you get a giant diff. Always run codemods on gofmt-clean input, or your "rename one function" PR rewrites 400 lines.
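A cheap guard is to check each file for canonical formatting before touching it. The sketch below assumes a hypothetical isGofmtClean helper; it relies on go/format.Source producing the same output gofmt would for a complete file.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
)

// isGofmtClean reports whether src is already in canonical gofmt form.
func isGofmtClean(src []byte) (bool, error) {
	formatted, err := format.Source(src)
	if err != nil {
		return false, err // src does not parse
	}
	return bytes.Equal(src, formatted), nil
}

func main() {
	clean := []byte("package p\n\nfunc f() {}\n")
	messy := []byte("package p\nfunc f( ) { }\n")
	for _, src := range [][]byte{clean, messy} {
		ok, err := isGofmtClean(src)
		fmt.Println(ok, err)
	}
}
```

A codemod can refuse to run on files that fail this check (or reformat them in a separate, earlier commit), so the rewrite diff contains only the intended change.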
4.2 Comments float and get lost or misplaced
Comments in go/ast are not children of the nodes they describe. They live in file.Comments and are re-attached by the printer based on positions. When you move or replace a node, its position changes (or becomes NoPos), and the printer can:
- drop the comment entirely,
- attach it to the wrong node,
- duplicate it.
ast.CommentMap helps you read associations, but keeping them correct through a rewrite is genuinely hard. This is the #1 complaint about go/ast codemods.
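For the simplest case, deleting a declaration together with its comments, the standard library does offer a workable pattern, modelled on the CommentMap example in the go/ast documentation. The dropVar, declares, and countCommentGroups helpers and the sample source below are illustrative assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

const src = `package p

// keep documents a.
var a = 1

// drop documents b.
var b = 2

// tail documents c.
var c = 3
`

// declares reports whether decl is a var declaration that introduces name.
func declares(decl ast.Decl, name string) bool {
	gen, ok := decl.(*ast.GenDecl)
	if !ok || gen.Tok != token.VAR {
		return false
	}
	for _, spec := range gen.Specs {
		for _, id := range spec.(*ast.ValueSpec).Names {
			if id.Name == name {
				return true
			}
		}
	}
	return false
}

// dropVar removes the top-level var declaration of name and every comment
// group that the comment map had associated with the removed nodes.
func dropVar(src, name string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	cmap := ast.NewCommentMap(fset, file, file.Comments) // build before editing
	kept := make([]ast.Decl, 0, len(file.Decls))
	for _, decl := range file.Decls {
		if !declares(decl, name) {
			kept = append(kept, decl)
		}
	}
	file.Decls = kept
	// Keep only comments whose node is still reachable from file.
	file.Comments = cmap.Filter(file).Comments()
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// countCommentGroups re-parses src and counts its comment groups.
func countCommentGroups(src string) (int, error) {
	file, err := parser.ParseFile(token.NewFileSet(), "p.go", src, parser.ParseComments)
	if err != nil {
		return 0, err
	}
	return len(file.Comments), nil
}

func main() {
	out, err := dropVar(src, "b")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

This covers deletion only. Moving or replacing nodes while keeping their comments attached still requires position bookkeeping, which is where the trouble described above begins.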
4.3 Position-driven layout
The printer uses node positions to decide line breaks and blank lines. New nodes with token.NoPos collapse onto one line; nodes carrying stale positions from elsewhere produce bizarre spacing. You must either set sane positions or accept reflowing.
4.4 Synthesised nodes are easy to get subtly wrong
Hand-building an *ast.CallExpr is verbose and error-prone. Many teams instead build replacement snippets by parsing a template string with parser.ParseExpr, which parses a single expression. It is handy, but the returned tree carries positions relative to its own tiny source, which complicates grafting into a larger file.
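A minimal sketch of the template approach, assuming hypothetical parseCallTemplate and render helpers. It uses parser.ParseExprFrom (the variant that takes an explicit FileSet) so the same FileSet can be used for printing, and it prints the node's position to show that it refers to the template rather than to any file being rewritten.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

// parseCallTemplate parses tmpl as a single call expression. The FileSet is
// returned because the node's positions are only meaningful relative to it.
func parseCallTemplate(tmpl string) (*ast.CallExpr, *token.FileSet, error) {
	fset := token.NewFileSet()
	expr, err := parser.ParseExprFrom(fset, "template", tmpl, 0)
	if err != nil {
		return nil, nil, err
	}
	call, ok := expr.(*ast.CallExpr)
	if !ok {
		return nil, nil, fmt.Errorf("template %q is not a call expression", tmpl)
	}
	return call, fset, nil
}

// render prints n with the FileSet it was parsed under.
func render(fset *token.FileSet, n ast.Node) (string, error) {
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, n); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	call, fset, err := parseCallTemplate(`fmt.Errorf("%s: %w", name, err)`)
	if err != nil {
		panic(err)
	}
	out, err := render(fset, call)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// The position belongs to the template, not to the target file.
	fmt.Println(fset.Position(call.Pos()))
}
```

Grafting such a node into a file parsed under a different FileSet leaves it with positions that mean nothing there, which is why text-edit output or explicit position fixing is often preferred.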
4.5 A complete small codemod, end to end
Tying §2–§4 together: migrate every errors.New(fmt.Sprintf(...)) to fmt.Errorf(...) across a module. The skeleton of a production-grade run:
```go
func main() {
	cfg := &packages.Config{Mode: packages.NeedSyntax | packages.NeedTypes |
		packages.NeedTypesInfo | packages.NeedName | packages.NeedFiles}
	pkgs, err := packages.Load(cfg, "./...")
	if err != nil {
		log.Fatal(err)
	}
	for _, pkg := range pkgs {
		for _, file := range pkg.Syntax {
			if isGenerated(file) {
				continue // skip "DO NOT EDIT" files
			}
			changed := false
			astutil.Apply(file, func(c *astutil.Cursor) bool {
				call, ok := c.Node().(*ast.CallExpr)
				if !ok || !isErrorsNewOfSprintf(call, pkg.TypesInfo) {
					return true
				}
				c.Replace(buildErrorf(call)) // errors.New(fmt.Sprintf(a,b)) → fmt.Errorf(a,b)
				changed = true
				return true
			}, nil)
			if !changed {
				continue
			}
			// The rewrite may have removed the last use of "errors"; an unused
			// import breaks the build even though the file still parses.
			if !astutil.UsesImport(file, "errors") {
				astutil.DeleteImport(pkg.Fset, file, "errors")
			}
			// pkg.Syntax parallels CompiledGoFiles, not GoFiles, so take the
			// path from the FileSet instead of indexing pkg.GoFiles.
			path := pkg.Fset.File(file.Package).Name()
			var buf bytes.Buffer
			if err := format.Node(&buf, pkg.Fset, file); err != nil {
				log.Printf("%s: print failed: %v", path, err)
				continue
			}
			// Verify before writing.
			if _, err := parser.ParseFile(token.NewFileSet(), path, buf.Bytes(), 0); err != nil {
				log.Printf("%s: codemod produced invalid Go, skipping: %v", path, err)
				continue
			}
			if err := os.WriteFile(path, buf.Bytes(), 0o644); err != nil {
				log.Printf("%s: write failed: %v", path, err)
			}
		}
	}
}
```
The professional habits on display: load with types, skip generated files, only touch files you changed, print with format.Node, and re-parse before writing. The two helpers (isErrorsNewOfSprintf, buildErrorf) carry the actual logic; the harness around them is what keeps the change safe and reviewable.
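The skeleton leaves both helpers undefined. One possible shape for buildErrorf is sketched below, under the assumption that the type-aware matcher has already confirmed the callees through go/types. The function checks only the syntactic shape and, unlike the one-result call in the skeleton, also returns an ok flag. It reuses the inner call's argument nodes and parentheses so their positions stay sane, and it keeps whatever qualifier the file uses for fmt. The rewriteExpr wrapper exists only to exercise it on a single expression.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

// buildErrorf turns errors.New(fmt.Sprintf(args...)) into fmt.Errorf(args...).
// It checks only the syntactic shape; the caller is assumed to have confirmed
// via go/types that the callees really are errors.New and fmt.Sprintf.
func buildErrorf(outer *ast.CallExpr) (*ast.CallExpr, bool) {
	if len(outer.Args) != 1 {
		return nil, false
	}
	inner, ok := outer.Args[0].(*ast.CallExpr)
	if !ok {
		return nil, false
	}
	sel, ok := inner.Fun.(*ast.SelectorExpr)
	if !ok {
		return nil, false
	}
	return &ast.CallExpr{
		Fun: &ast.SelectorExpr{
			X:   sel.X, // keep the file's own qualifier, which may be an alias of fmt
			Sel: &ast.Ident{NamePos: sel.Sel.Pos(), Name: "Errorf"},
		},
		Lparen:   inner.Lparen,
		Args:     inner.Args,
		Ellipsis: inner.Ellipsis,
		Rparen:   inner.Rparen,
	}, true
}

// rewriteExpr parses one expression, applies buildErrorf, and prints the result.
func rewriteExpr(src string) (string, bool, error) {
	fset := token.NewFileSet()
	expr, err := parser.ParseExprFrom(fset, "expr.go", src, 0)
	if err != nil {
		return "", false, err
	}
	call, ok := expr.(*ast.CallExpr)
	if !ok {
		return "", false, nil
	}
	repl, ok := buildErrorf(call)
	if !ok {
		return "", false, nil
	}
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, repl); err != nil {
		return "", false, err
	}
	return buf.String(), true, nil
}

func main() {
	out, ok, err := rewriteExpr(`errors.New(fmt.Sprintf("open %s: %d", path, code))`)
	fmt.Println(out, ok, err)
}
```

Reusing existing nodes rather than synthesising new ones keeps most positions valid, which reduces the layout surprises described in §4.3.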
5. The dst library tradeoff
github.com/dave/dst ("decorated syntax tree") exists precisely to fix the comment/formatting fragility. In dst, comments and spacing are attached to the nodes themselves (as "decorations"), not floated by position. Move a node and its comments move with it.
```go
import "github.com/dave/dst/decorator"

f, err := decorator.Parse(src) // *dst.File, comments attached to nodes
if err != nil {
	log.Fatal(err)
}
// ... rewrite the dst tree; comments follow their nodes ...
err = decorator.Print(f) // prints to stdout, comments preserved
```
| | go/ast | dst |
|---|---|---|
| Comment fidelity through rewrites | poor (position-based) | strong (node-attached) |
| Std-lib / stable | yes | third-party |
| go/types integration | first-class (TypesInfo) | needs mapping back to ast |
| API familiarity | universal | extra learning curve |
The honest tradeoff: dst makes comment-preserving structural edits dramatically easier, but you lose direct go/types integration (you map dst↔ast to get type info) and add a dependency. Rule of thumb: pure-syntactic, comment-sensitive codemods → dst; type-aware analysis or analysis.Analyzer plugins → go/ast + go/types. Many production tools parse with dst for editing fidelity and run a separate go/types pass for resolution.
In dst, decorations live on the node itself: the Start and End slots (plus node-specific interior slots) hold comments, and Before and After record the surrounding line spacing. You write comments explicitly where you want them, and moving a node carries them along instead of leaving them stranded by position. That single design difference eliminates the most painful class of go/ast codemod bugs.
6. The analysis framework
For anything that ships (vet checks, golangci-lint plugins), use golang.org/x/tools/go/analysis. It standardises: receiving *analysis.Pass (with pass.Files, pass.TypesInfo, pass.Pkg), reporting diagnostics via pass.Reportf, declaring dependencies on other analyzers (inspect.Analyzer for a pre-built, fast AST traversal), and even suggested fixes (analysis.SuggestedFix with TextEdits) that tools can auto-apply.
```go
var Analyzer = &analysis.Analyzer{
	Name:     "noprintln",
	Doc:      "reports calls to fmt.Println", // analysis.Validate rejects an undocumented analyzer
	Requires: []*analysis.Analyzer{inspect.Analyzer},
	Run:      run,
}
```
Suggested fixes are text edits keyed by position, not AST mutations — sidestepping the print-reformatting problem entirely because only the targeted byte ranges change.
A minimal analyzer that reports and offers a fix:
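```go
// Imports assumed: go/ast, go/types, golang.org/x/tools/go/analysis,
// golang.org/x/tools/go/analysis/passes/inspect, golang.org/x/tools/go/ast/inspector.
func run(pass *analysis.Pass) (any, error) {
	insp := pass.ResultOf[inspect.Analyzer].(*inspector.Inspector)
	insp.Preorder([]ast.Node{(*ast.CallExpr)(nil)}, func(n ast.Node) {
		call := n.(*ast.CallExpr)
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return
		}
		// Resolve through the type checker, so aliased imports still match
		// and a local method named Println does not.
		fn, ok := pass.TypesInfo.Uses[sel.Sel].(*types.Func)
		if !ok || fn.Pkg() == nil || fn.Pkg().Path() != "fmt" || fn.Name() != "Println" {
			return
		}
		pass.Report(analysis.Diagnostic{
			Pos:     call.Pos(),
			Message: "avoid fmt.Println; use the logger",
			SuggestedFixes: []analysis.SuggestedFix{{
				Message: "replace fmt with log",
				// Only the package qualifier is replaced. This edit does not
				// add the log import or drop a now-unused fmt import.
				TextEdits: []analysis.TextEdit{{
					Pos: sel.X.Pos(), End: sel.X.End(), NewText: []byte("log"),
				}},
			}},
		})
	})
	return nil, nil
}
```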
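Note three professional touches: it uses the shared inspect.Analyzer for a fast indexed traversal, it resolves the call via pass.TypesInfo.Uses (not by name), and the fix is a byte-range TextEdit, so the diff touches only the three-character qualifier (fmt → log). The fix as written does not add the log import or remove a now-unused fmt import; a complete fix needs further TextEdits for the import block.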
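Why byte-range edits give minimal diffs is easiest to see by applying some by hand. The sketch below is not the analysis framework's own applier: the edit type and applyEdits function are hypothetical, and they work on plain byte offsets, whereas analysis.TextEdit carries token.Pos values that a driver first converts to offsets through the file's token.File.

```go
package main

import (
	"fmt"
	"sort"
)

// edit replaces src[start:end] with text; offsets are byte offsets into the
// original source.
type edit struct {
	start, end int
	text       string
}

// applyEdits applies non-overlapping edits and copies every other byte
// through untouched.
func applyEdits(src string, edits []edit) (string, error) {
	sorted := append([]edit(nil), edits...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].start < sorted[j].start })
	var out []byte
	pos := 0
	for _, e := range sorted {
		if e.start < pos || e.end < e.start || e.end > len(src) {
			return "", fmt.Errorf("edit [%d,%d) overlaps an earlier edit or is out of range", e.start, e.end)
		}
		out = append(out, src[pos:e.start]...)
		out = append(out, e.text...)
		pos = e.end
	}
	out = append(out, src[pos:]...)
	return string(out), nil
}

func main() {
	src := "fmt.Println(x)\nfmt.Println(y)\n"
	out, err := applyEdits(src, []edit{{15, 18, "log"}, {0, 3, "log"}})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Because nothing is re-printed, whitespace and comments outside the edited ranges cannot change, which is the property the printer-based path in §4 cannot guarantee.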
6.5 Testing codemods and analyzers
A codemod without tests is a liability: it runs over your whole repo. Two established patterns:
- Golden files. Keep testdata/in/foo.go and testdata/out/foo.go; the test runs the codemod on the input and diffs against the expected output. Regenerate goldens with a -update flag. This catches formatting and comment regressions, not just logic.
- analysistest for analyzers. golang.org/x/tools/go/analysis/analysistest runs an analyzer against testdata packages and checks // want "regexp" comments on the exact lines diagnostics should appear:
```go
func TestAnalyzer(t *testing.T) {
	dir := analysistest.TestData()
	analysistest.RunWithSuggestedFixes(t, dir, Analyzer, "a")
}
```
RunWithSuggestedFixes also applies the analyzer's SuggestedFix edits and compares against a .golden file — so you test both the diagnostic and the fix.
Always include adversarial fixtures: aliased imports, shadowed names, generated files, build-tagged files, and the no-op (idempotency) case. Those are exactly where name-based or position-based logic breaks.
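The idempotency fixture can be a generic check rather than a hand-written golden file. The sketch below assumes the codemod can be expressed as a source-in, source-out function; the transform type, checkIdempotent, and the two sample transforms are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
)

// transform is any codemod expressed as source in, source out.
type transform func(src []byte) ([]byte, error)

// checkIdempotent runs fn twice and fails if the second run changes anything.
func checkIdempotent(fn transform, src []byte) error {
	once, err := fn(src)
	if err != nil {
		return fmt.Errorf("first run: %w", err)
	}
	twice, err := fn(once)
	if err != nil {
		return fmt.Errorf("second run: %w", err)
	}
	if !bytes.Equal(once, twice) {
		return fmt.Errorf("not idempotent: second run changed the output")
	}
	return nil
}

// gofmtTransform is an idempotent sample transform.
func gofmtTransform(src []byte) ([]byte, error) {
	return format.Source(src)
}

// appendMarker is a deliberately non-idempotent sample transform.
func appendMarker(src []byte) ([]byte, error) {
	out := append([]byte(nil), src...)
	return append(out, "// touched\n"...), nil
}

func main() {
	src := []byte("package p\nfunc f( ) { }\n")
	fmt.Println(checkIdempotent(gofmtTransform, src))
	fmt.Println(checkIdempotent(appendMarker, src))
}
```

In a test suite this check would run over every fixture in testdata, adversarial ones included.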
7. Real footguns checklist
- Run on gofmt-clean files or accept noisy diffs.
- Resolve with go/types before rewriting; never match on names alone (imports get aliased and shadowed).
- Use astutil.Apply's cursor, never mutate slices mid-Inspect.
- Comments are positional in go/ast: verify them in output; consider dst if they matter.
- Generated files (// Code generated ... DO NOT EDIT.) should usually be skipped.
- Build tags / multiple files per package: load with go/packages, handle every file, not just the *.go files you happened to glob.
- parser.ParseExpr positions don't align with the target file: fix positions or use text-edit output.
- Idempotency: running the codemod twice should be a no-op. Test it.
- Always re-parse the output to confirm it is still syntactically valid before writing to disk; only a build proves that it compiles.
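The generated-file check from the list above can be sketched with the standard library alone. This version matches the conventional marker line by regular expression and only considers comments that appear before the package clause; the isGenerated and isGeneratedSource names are illustrative. Go 1.21 and later also ship ast.IsGenerated, which can replace the hand-written check.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"regexp"
)

// generatedRx matches the conventional marker line for generated Go files.
var generatedRx = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGenerated reports whether file carries the marker before its package clause.
// The file must have been parsed with parser.ParseComments.
func isGenerated(file *ast.File) bool {
	for _, group := range file.Comments {
		if group.Pos() > file.Package {
			break // only comments before the package clause count
		}
		for _, c := range group.List {
			if generatedRx.MatchString(c.Text) {
				return true
			}
		}
	}
	return false
}

// isGeneratedSource parses just enough of src to run the check.
func isGeneratedSource(src string) (bool, error) {
	mode := parser.ParseComments | parser.PackageClauseOnly
	file, err := parser.ParseFile(token.NewFileSet(), "x.go", src, mode)
	if err != nil {
		return false, err
	}
	return isGenerated(file), nil
}

func main() {
	gen, err := isGeneratedSource("// Code generated by stringer. DO NOT EDIT.\n\npackage p\n")
	fmt.Println(gen, err)
}
```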
7.5 Rolling a codemod across a large repo
The technical rewrite is half the job; landing it without breaking everyone is the other half.
- Idempotency is non-negotiable. Running the codemod twice must be a no-op. Test it explicitly: a non-idempotent codemod produces churn on every rerun and makes rebases hell. The usual cause is matching the output form as if it were input (e.g. rewriting fmt.Errorf again into itself).
- Split mechanical from semantic changes. Land the pure rename/migration in one PR with no behaviour change, so reviewers can trust the diff is mechanical. Bundle nothing else.
- Respect ownership and CI. A repo-wide change touches many teams' code. Generate per-package or per-owner PRs rather than one giant diff; run each package's tests.
- Handle partial failures gracefully. If file N fails to type-check or re-parse, skip it and report; don't abort the whole run or, worse, write half a file.
- Provide an escape comment. A //nolint-style opt-out or an allowlist lets teams defer adoption without blocking the rollout.
- Verify the build after, not just the parse. Re-parsing proves syntactic validity; only go build ./... (and tests) proves the migration is semantically correct. Run it before merging.
These are the differences between a script that "works on my three test files" and a tool that safely rewrites a million-line monorepo.
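7.6 When to use gofmt -r instead of writing code
Not every rewrite needs a Go program. gofmt -r 'pattern -> replacement' does syntactic, type-blind rewrites with wildcard placeholders. For example, gofmt -r 'a[b:len(a)] -> a[b:]' -w . (adapted from the gofmt documentation) drops a redundant upper slice bound wherever it appears.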
Lowercase single-letter identifiers in the pattern are wildcards. It's perfect for purely syntactic, type-independent transforms (simplifications, mechanical API shape changes) and it preserves formatting well. Reach for a full go/ast+go/types tool only when you need type resolution, scope awareness, or logic a pattern can't express. Knowing this saves you from writing a 200-line codemod for a one-line gofmt -r.
The decision rule:
| Situation | Tool |
|---|---|
| simple syntactic pattern, no types needed | gofmt -r |
| import add/remove | goimports / astutil |
| needs type resolution or scope | go/ast + go/types + astutil.Apply |
| comment-critical structural move | dst |
| ships as a reusable check | analysis.Analyzer |
8. Summary
Production AST work is codemods, and the difficulty is producing review-clean output, not finding nodes. Load packages with go/packages so you can resolve identifiers via go/types and avoid name-based false positives; rewrite with astutil.Apply's parent-aware cursor; re-emit with go/format. The core fragilities are that go/printer reformats and that comments are positional and easily lost — dst solves comment fidelity at the cost of std-lib integration. For shipped checks, the analysis framework with position-keyed SuggestedFix edits gives the cleanest diffs. Run on gofmt-clean input, skip generated files, ensure idempotency, and re-parse output before writing.
Further reading
- go/packages: https://pkg.go.dev/golang.org/x/tools/go/packages
- go/analysis: https://pkg.go.dev/golang.org/x/tools/go/analysis
- astutil: https://pkg.go.dev/golang.org/x/tools/go/ast/astutil
- go/types: https://pkg.go.dev/go/types
- dst: https://pkg.go.dev/github.com/dave/dst
- Writing analyzers (passes README): https://github.com/golang/tools/blob/master/go/analysis/passes/README.md