ants — Junior Level¶
Table of Contents¶
- Introduction
- Prerequisites
- Glossary
- Why a Goroutine Pool At All
- Installing ants
- Your First Pool
- Core Concepts
- Real-World Analogies
- Mental Models
- Pros & Cons
- Use Cases
- Code Examples
- The Lifecycle of a Pool
- Submit, Capacity, and Backpressure
- Waiting for Tasks
- Clean Code
- Error Handling
- Performance Tips
- Best Practices
- Edge Cases & Pitfalls
- Common Mistakes
- Common Misconceptions
- Tricky Points
- Test
- Tricky Questions
- Cheat Sheet
- Self-Assessment Checklist
- Summary
- What You Can Build
- Further Reading
- Related Topics
- Diagrams
Introduction¶
Focus: "Why would I ever use a pool of goroutines? Aren't goroutines already cheap? How do I create one with ants and submit work to it?"
In Go, the textbook answer to "how do I run this function concurrently?" is one line:
That line is so cheap — a goroutine starts at ~2 KB of stack and is scheduled by the Go runtime, not the OS — that for most programs you can keep going forever. A web server spins up a goroutine per request, a batch job spins up a goroutine per file, and the runtime cleans up after itself.
But "cheap" is not "free." If your program has to run millions of tasks per second, or if it has to keep a long-running fan-out under control, or if it has to enforce a strict ceiling like "we will never have more than 200 outbound HTTP requests in flight," then unbounded go f() starts to bite. The stack still has to be allocated. The g struct still has to be created. The GC still has to scan it. The scheduler still has to look at it. Memory still has to be charged against your container limit. Eventually you OOM under burst, or you hit your file-descriptor ceiling because every goroutine was holding an open connection.
A goroutine pool is the solution: a fixed (or dynamically tuned) set of long-lived worker goroutines that pull tasks from a shared queue. Instead of go f(), you call pool.Submit(f). The pool, not Go's runtime, decides where and when f runs.
ants (pronounced like the insect, but the GitHub user is panjf2000) is the most widely used goroutine pool library in the Go ecosystem. It is fast, well-maintained, idiomatic, and small enough that you can read the whole thing in an afternoon. This file is your first look at it. By the end you will:
- Understand the problem a pool solves and the cost it pays.
- Know how to import github.com/panjf2000/ants/v2 and call NewPool.
- Know what Submit does, what Release does, and what happens to a submitted task.
- Be able to coordinate "wait for all tasks" using sync.WaitGroup.
- Recognise the most common first-time mistakes — forgetting Release, capturing loop variables, ignoring the error return from Submit.
- Have an intuition for "what is the right capacity?" — not yet a formula, but a feel.
You do not need to understand the internals (worker stack, lock-free path, MultiPool, options like WithExpiryDuration). All of that comes later. This file is the first 80% of what most users ever need.
Prerequisites¶
- Required: Working Go installation, version 1.18 or newer (the v2 API uses generics in some helpers; pre-generics Go technically works for the non-generic types). Check with go version.
- Required: Comfort writing go doWork() and waiting for it with sync.WaitGroup. If "wait for ten goroutines to finish" is not yet automatic, read the 01-goroutines subsection first.
- Required: Familiarity with closures: func() { doSomething(x) } and what variables they capture. This bites every beginner who uses pools.
- Required: go mod init and go get — you need a Go module to import the library.
- Helpful: Awareness of channels and select, since ants internally uses a channel-based handoff between caller and worker.
- Helpful: Any prior experience with thread pools (Java's ExecutorService, Python's ThreadPoolExecutor, .NET's ThreadPool) — the mental model carries over.
If you can write a small program that spawns 100 goroutines, prints their index, and waits for all of them with sync.WaitGroup, you are ready.
Glossary¶
| Term | Definition |
|---|---|
| Goroutine pool | A reusable set of long-lived goroutines (workers) that pull tasks from a queue. The number of workers is bounded, so the concurrency level of your program is bounded. |
| Worker | A single long-lived goroutine inside the pool. It loops: receive task, run task, return self to the pool. Internally ants calls these goWorker. |
| Task | A func() (or, for PoolWithFunc, an interface{} argument bound to a fixed function). It is what Submit/Invoke schedules. |
| Capacity | The maximum number of workers a pool will ever have at the same time. Passed to NewPool(size). The literal upper bound on concurrent task execution. |
| Running | The current number of workers that are actively executing a task. Always <= Cap(). |
| Free | Cap() - Running(). The number of additional tasks the pool could pick up right now without queueing. |
| Submit(task) | Hand a func() to the pool. Returns nil on success or ErrPoolOverload / ErrPoolClosed. The default behaviour is blocking — Submit waits until a worker is free. |
| Release() | Signal that no more tasks will be submitted and tear down the pool. Workers exit, internal goroutines stop, memory is released. Idempotent in v2. |
| Tune(size) | Change the pool capacity at runtime. Safe to call concurrently with Submit. |
| PoolWithFunc | A specialised pool that runs the same function with different arguments. Slightly faster than Pool because the function pointer is stored once. Submitted with Invoke(arg). |
| MultiPool | A sharded Pool of Pools. Used when a single pool's lock becomes the bottleneck. Covered in senior.md. |
| Option | A functional-options value (ants.Option) that configures a pool. Passed as variadic to NewPool(size, opts...). |
| ErrPoolOverload | Returned by Submit when the pool is full and configured with WithNonblocking(true). In the default blocking mode, Submit waits instead of returning this error. |
| ErrPoolClosed | Returned by Submit after Release has been called. |
| Expiry / cleanup | Idle workers (workers that have been waiting for a task longer than WithExpiryDuration) are killed by a background janitor. Default: 1 second. |
Keep this table next to you for the rest of the file.
Why a Goroutine Pool At All¶
Before the API, the motivation. Two reasons people reach for ants:
Reason 1 — Bound the concurrency¶
Imagine a service that scans S3 for files and downloads each one. Naive code:
If keys has a million entries, you have spawned a million goroutines. Each one opens a TCP connection. Each one allocates ~2 KB of stack which grows on first read into ~8 KB. That is at least 8 GB of stack memory before any download data is buffered. You will be killed by the OOM killer or rate-limited by S3 before you even reach the first 100k.
A pool with capacity 200 caps this at "200 goroutines, 200 sockets, ~1.6 MB of stack." The 999_800 other tasks wait their turn. The program may be slower in wall-clock terms (depending on bandwidth), but it survives.
Reason 2 — Recycle the goroutine¶
Even if you only ever have ~10k goroutines alive, creating and destroying them every few microseconds is not free. The runtime allocates a g struct, a stack, and registers the goroutine with the scheduler. Then GC has to scan it. Under millions of short tasks per second, this becomes measurable. A pool keeps the same g alive forever and lets the worker function loop, so no per-task allocation happens.
ants benchmarks at the time of writing show ~2-6× lower memory and modestly higher throughput vs unbounded go f() for sustained short-task workloads. That is the second pitch.
When you should not reach for a pool¶
- Your program spawns at most a few hundred goroutines total — pool is pure overhead.
- Your tasks are long-running (seconds, minutes) — the goroutine creation cost is irrelevant; you should think about cancellation and observability instead.
- You actually want structured cancellation more than concurrency limits — errgroup or golang.org/x/sync/semaphore may be a better fit.
A pool is a tool for high-throughput, short-task, bounded-concurrency workloads. If you do not have all three properties, you may not need a pool at all.
Installing ants¶
ants is a normal Go module. In a project with a go.mod file:
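The install command uses the same module path that appears throughout this file:

```sh
go get github.com/panjf2000/ants/v2
```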
Then in your code:
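The import path keeps the /v2 suffix:

```go
import "github.com/panjf2000/ants/v2"
```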
Two notes that catch beginners:
- The module path must end in /v2. If you write import "github.com/panjf2000/ants" you will pull the v1 API, which is missing functional options and is not what the rest of this subsection covers.
- The package name inside Go code is ants, not antsv2. So ants.NewPool, ants.Submit, ants.WithNonblocking, regardless of which version you imported.
Verify your install:
package main
import (
"fmt"
"github.com/panjf2000/ants/v2"
)
func main() {
p, err := ants.NewPool(10)
if err != nil {
panic(err)
}
defer p.Release()
fmt.Println("pool capacity:", p.Cap())
}
Running this should print:
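```
pool capacity: 10
```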
If you instead get a cannot find package error, your go.mod is wrong — run go mod tidy and try again.
Your First Pool¶
The simplest useful program:
package main
import (
"fmt"
"sync"
"time"
"github.com/panjf2000/ants/v2"
)
func main() {
pool, err := ants.NewPool(5)
if err != nil {
panic(err)
}
defer pool.Release()
var wg sync.WaitGroup
for i := 1; i <= 20; i++ {
wg.Add(1)
i := i
_ = pool.Submit(func() {
defer wg.Done()
fmt.Printf("task %d running\n", i)
time.Sleep(100 * time.Millisecond)
})
}
wg.Wait()
fmt.Println("done")
}
Run it. Twenty tasks, capacity 5 — so 5 tasks run in parallel, the rest wait. After ~400 ms the program prints done.
Let's break it down carefully because every line is a teachable moment.
Line by line¶
NewPool takes a positive integer (the capacity) and returns (*Pool, error). The pool is created in a "ready" state — its background janitor goroutine starts immediately, but no worker goroutines exist yet. Workers are spawned lazily on first Submit. Capacity 5 means the pool will never exceed 5 concurrent workers.
Why does NewPool return an error? Because the size must be positive. ants.NewPool(0) returns nil, ants.ErrInvalidPoolSize. (Negative sizes are also rejected.) There is one special value: ants.NewPool(-1) — yes, negative one specifically — creates an unlimited pool (covered in middle.md). We will ignore that for now.
defer pool.Release() is the most important line in the program. Release shuts down the pool. If you forget it, the workers, the janitor, and any goroutines blocked in Submit all leak. Even though the pool itself will eventually be garbage collected if nothing references it, the workers are referenced by internal slices and will not be cleaned up by the GC. Always defer Release.
Submit takes a func() and queues it for execution by a worker. It returns error:
- nil — the task was accepted. It may or may not have started running yet.
- ants.ErrPoolOverload — the pool is full and you configured non-blocking mode. With the default options, Submit blocks instead of returning this error.
- ants.ErrPoolClosed — Release has already been called.
For a junior program, ignoring the error is acceptable. For production, never ignore it — covered later.
The classic shadow. We re-declare i inside the loop body so that each closure captures a fresh variable, not the shared loop variable. Without it (in Go ≤ 1.21), every closure would read the same i, so tasks would print whatever value i held when they happened to run — duplicates and skipped values instead of 1..20. In Go 1.22+ the language was changed so the loop variable is per-iteration by default, but the explicit i := i is still recommended for clarity and for code that must compile under older versions.
Submit returns before the task finishes. Submit returning nil only means "a worker has been assigned." If we did not wait, main might exit before any task got to print. The WaitGroup is the standard cross-goroutine "wait for them all" primitive. We covered it in the goroutines section.
What does the output look like?¶
task 1 running
task 2 running
task 3 running
task 4 running
task 5 running
task 6 running
...
task 20 running
done
Order will vary — the Go scheduler can run any of the five workers in any order, and Submit to a busy pool will queue them in whatever order workers happen to become free. The only guarantees are:
- Exactly 20 lines print.
- No more than 5 are "running" at any instant.
- done is last.
Don't trust the order of printed lines. Ever. Treat goroutine output as a set, not a sequence.
Core Concepts¶
Concept 1 — The pool is a bounded concurrency primitive¶
Cap() is a contract. The pool will not exceed it. If you Submit more tasks than the pool can currently service, the rest are either:
- (default) Blocked — Submit does not return until a worker is free.
- (with WithNonblocking(true)) Rejected — Submit returns ErrPoolOverload.
- (with WithMaxBlockingTasks(N)) Blocked, up to N callers — caller N+1 is rejected.
That's it. There is no internal "queue of pending tasks" in v2 by default. The queue, conceptually, is the goroutines that are calling Submit and waiting. This is different from Java's ExecutorService, where a separate BlockingQueue decouples submission from execution. In ants, callers are the queue.
Concept 2 — Workers live longer than tasks¶
A worker is a long-lived goroutine. When a task finishes, the worker does not exit. It returns itself to the pool's stack and waits for the next task. Only one of two things ever kills a worker:
- Idle expiry — the worker has been waiting for a task longer than WithExpiryDuration (default 1 second). The janitor goroutine wakes it with a nil task, and the worker breaks its loop.
- Release — the entire pool is torn down. All workers receive a "stop" signal.
A worker is not killed simply because a task panicked. The panic handler catches it (covered in middle.md), the worker logs the panic, and then keeps going. That is intentional — if a single bad task could kill a worker, every panic would shrink the pool by one.
Concept 3 — Submit is not always immediate¶
If you call Submit and the pool is full, the calling goroutine blocks until a worker is free. This is your back-pressure. Your producer cannot outrun your consumer — at most, your producer can have one task per goroutine waiting in Submit. If you do not want this, use WithNonblocking(true) and handle ErrPoolOverload.
This is a profound difference from go f(). go f() always returns immediately. pool.Submit(f) is conditional.
Concept 4 — The pool is goroutine-safe¶
You can call Submit from many goroutines at once. You can call Tune from one goroutine while other goroutines are calling Submit. You can call Running() and Free() at any time. The only operation that requires care is Release: after Release returns, calling any other method on the pool either returns ErrPoolClosed or panics in older versions. Do not race Release against Submit — once you decide to release, your producers should already have stopped.
Concept 5 — Workers run on the same OS threads as everything else¶
A pool is not a thread pool. It is a goroutine pool. The workers are still scheduled by Go's GMP scheduler onto the same GOMAXPROCS OS threads as every other goroutine in your program. You are not getting CPU affinity. You are not getting kernel-level isolation. What you are getting is:
- A predictable upper bound on goroutine count for this work.
- A sync.Pool cache of goWorker structs, so the per-task allocation cost is near zero.
- Bounded scheduler queue depth — far fewer g's in the run queues at any moment.
Real-World Analogies¶
The barber shop¶
Imagine a barbershop with five chairs and five barbers. Customers arrive and either:
- Sit down immediately if a chair is free (worker is idle).
- Wait in line if all chairs are taken (caller blocks in Submit).
When a customer leaves, the next person in line takes the chair. The barbers don't go home between customers — they wait in the chair. That's the worker. The chair is the slot. The line is the queue of blocked submitters.
When the shop closes (Release), the barbers go home, the line is told there's no more service, and the lights go out.
This is exactly the model ants implements, minus the haircut.
The conveyor belt¶
Items arrive on a conveyor belt one by one. There are a fixed number of operators (workers). Each operator picks up one item, processes it, puts it down, and reaches for the next. If items arrive faster than they can be processed, the belt fills up behind them — that's the queue of blocked submitters.
You can speed up the line in two ways:
- Add more operators (Tune the pool up).
- Make each item faster to process (optimise the task).
You cannot make items appear faster than the operators can handle them without the belt overflowing. That's ErrPoolOverload in non-blocking mode.
The thread pool you already know¶
If you've used Java's ExecutorService.newFixedThreadPool(5) or Python's concurrent.futures.ThreadPoolExecutor(max_workers=5), the mental model is identical — minus the threads. Replace "thread" with "goroutine" and you're done. The main API differences:
- Java has explicit Future return values. ants does not — you bring your own (channel, WaitGroup, etc.).
- Java has a separate BlockingQueue for tasks. ants has the submitters themselves as the queue.
- Java's shutdownNow() interrupts in-flight tasks. ants.Release does not — running tasks complete normally; only the workers' next loop iteration sees the signal.
Mental Models¶
Model 1 — Pool = capped semaphore + recyclable executor¶
Conceptually, Submit is:
This is literally what a counting semaphore + worker would do. ants is a more efficient implementation of that abstract pattern.
Model 2 — Pool = LIFO stack of idle workers¶
Internally ants.Pool is a stack (LIFO) of idle workers plus a counter of "running" workers. Submit:
- Increment running counter (or block if at capacity).
- Pop a worker from the idle stack. If the stack is empty, allocate a new one (up to capacity).
- Send the task to the worker via its private channel.
When the task finishes:
- The worker puts itself back on the idle stack.
- Decrement running counter.
- If a blocked caller is waiting, wake them.
We will see the actual code in senior.md. For now, the picture is enough.
Model 3 — The pool is the rate limiter¶
If your pool capacity is N, your effective task-execution rate is min(N / avg_task_duration, producer_rate). The pool forces this. You do not have to add a rate limiter on top. Conversely, if you need a different rate (e.g. "10 ops/sec, but with N workers"), the pool is not your rate limiter — you need golang.org/x/time/rate.
Pros & Cons¶
Pros¶
- Cap on concurrency. No more "infinite goroutines" surprises. Easy to reason about peak memory.
- Recycled workers. Per-task overhead drops to "send on channel + run closure." No g struct allocation, no scheduler enqueue/dequeue churn.
- Built-in back-pressure. Submit blocks, so a slow consumer naturally slows down a fast producer. No bespoke channel plumbing needed.
- Mature. 13k+ stars, used in production, fuzzed, benchmarked, released regularly.
- Small. The entire library is ~1500 lines of Go. You can read it.
Cons¶
- It is not free. You add a dependency and ~1.5 KB of stack per idle worker.
- It introduces a fairness consideration. If your tasks vary wildly in cost, a long task blocks a worker indefinitely.
- Submit is not cancellation-aware. There is no ctx context.Context parameter. If you want cancellation, you bring your own select { case <-ctx.Done(): } inside the task.
- It is not a queue. Tasks do not "stack up" anywhere visible. If you need a queue with introspection ("how many tasks pending?"), you must build one in front of the pool.
- It is not a scheduler. Tasks run in arbitrary order. No priorities, no deadlines.
Use Cases¶
A pool shines when:
- Web/RPC server fan-out. One request triggers N parallel downstream calls. Without a pool, a slow client can DoS your server. With a pool of, say, 1000 downstream workers, you are guaranteed never to exceed 1000 in-flight outbound calls regardless of request rate.
- Batch processing. Crawling files, transforming images, scanning logs. The total work is huge; you want a bounded sliding window.
- Connection pooling for non-pooled libraries. If you talk to a service that doesn't have its own connection pool, you can wrap each call in a Submit and use the pool as your concurrency cap.
- Event bus consumers. A pool per topic, sized to that topic's tolerable concurrency, gives you topic-level isolation.
A pool is overkill when:
- You have a fixed, small number of long-running tasks. Use plain goroutines.
- Your concurrency is naturally bounded by something else (e.g. a sync.Mutex already serialises the work).
- You need unordered but prioritised execution. Pools don't prioritise — tasks run in whatever order workers pick them up, with no notion of priority or cost.
Code Examples¶
Example 1 — Bounded fan-out¶
Run 1000 jobs, but only 50 at a time.
package main
import (
"fmt"
"sync"
"time"
"github.com/panjf2000/ants/v2"
)
func main() {
pool, _ := ants.NewPool(50)
defer pool.Release()
var wg sync.WaitGroup
start := time.Now()
for i := 0; i < 1000; i++ {
wg.Add(1)
i := i
_ = pool.Submit(func() {
defer wg.Done()
time.Sleep(10 * time.Millisecond)
_ = i
})
}
wg.Wait()
fmt.Println("elapsed:", time.Since(start))
}
Each task sleeps 10 ms. With 50 workers and 1000 tasks, total time should be roughly 1000/50 * 10ms = 200 ms. Add a little overhead and you'll see ~210–220 ms. If you remove the pool and just go func(), total time will be closer to 10 ms — but you'll have 1000 simultaneous goroutines, which is fine for this trivial example but disastrous at scale.
Example 2 — Pool stats¶
You can ask the pool how it's doing.
package main
import (
"fmt"
"sync"
"time"
"github.com/panjf2000/ants/v2"
)
func main() {
pool, _ := ants.NewPool(10)
defer pool.Release()
go func() {
ticker := time.NewTicker(50 * time.Millisecond)
defer ticker.Stop()
for range ticker.C {
fmt.Printf("running=%d free=%d cap=%d\n",
pool.Running(), pool.Free(), pool.Cap())
}
}()
var wg sync.WaitGroup
for i := 0; i < 50; i++ {
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
time.Sleep(200 * time.Millisecond)
})
}
wg.Wait()
}
The output settles into a rhythm like:
running=10 free=0 cap=10
running=10 free=0 cap=10
running=10 free=0 cap=10
running=10 free=0 cap=10
running=10 free=0 cap=10
running=0 free=10 cap=10
For ~1 second you see 10 of 10 running, then the last batch drains and running drops. This is the simplest observability you get for free.
Example 3 — Submitting from many producers¶
Submit is goroutine-safe. You can have many producers.
package main
import (
"fmt"
"sync"
"sync/atomic"
"github.com/panjf2000/ants/v2"
)
func main() {
pool, _ := ants.NewPool(100)
defer pool.Release()
var done int64
var wg sync.WaitGroup
// 8 producers, each submitting 1000 tasks
for p := 0; p < 8; p++ {
wg.Add(1)
go func() {
defer wg.Done()
for j := 0; j < 1000; j++ {
_ = pool.Submit(func() {
atomic.AddInt64(&done, 1)
})
}
}()
}
wg.Wait()
// All submits returned; some tasks may still be running.
// Drain by polling Running.
for pool.Running() > 0 {
}
fmt.Println("done:", atomic.LoadInt64(&done))
}
Output:
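```
done: 8000
```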
A few notes:
- We use atomic.AddInt64 because done is touched from many workers. Reaching for sync.Mutex is also fine. Reaching for naked done++ is a data race.
- wg.Wait() only waits for producers, not for tasks. After it returns, tasks are still draining. Adding the producer's wg.Done to the task itself would couple producer and consumer; the cleaner pattern is a second WaitGroup for tasks (shown in Example 1).
- The busy-loop for pool.Running() > 0 {} is a code smell. It works for a demo. In real code, count tasks with a WaitGroup instead.
Example 4 — Forgetting Release and what happens¶
Here is the bug we will see ten times in our lives:
package main
import (
"fmt"
"runtime"
"sync"
"time"
"github.com/panjf2000/ants/v2"
)
func main() {
for i := 0; i < 5; i++ {
pool, _ := ants.NewPool(100)
// no defer pool.Release()
var wg sync.WaitGroup
for j := 0; j < 100; j++ {
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
time.Sleep(time.Millisecond)
})
}
wg.Wait()
}
time.Sleep(2 * time.Second)
fmt.Println("goroutines after 5 unreleased pools:", runtime.NumGoroutine())
}
Each pool spawned up to ~100 workers plus a janitor, and none were released. After 5 iterations that is roughly 5 * (100 + 1) + main + runtime housekeeping ≈ 510 goroutines lingering. Idle workers do exit on their own after WithExpiryDuration (default 1 second) — that is why the program sleeps 2 seconds before counting — but the 5 janitor goroutines remain until Release is called, and in some older versions of ants the workers also stay, because the janitor has to be released first.
The takeaway is unambiguous: defer pool.Release() immediately after NewPool. No exceptions.
Example 5 — A small worker farm with results¶
So far our tasks have only had side effects. To get results back, you need a channel:
package main
import (
"fmt"
"sync"
"github.com/panjf2000/ants/v2"
)
type job struct {
in int
out int
}
func main() {
pool, _ := ants.NewPool(8)
defer pool.Release()
jobs := make([]*job, 100)
for i := range jobs {
jobs[i] = &job{in: i}
}
var wg sync.WaitGroup
for _, j := range jobs {
j := j
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
j.out = j.in * j.in
})
}
wg.Wait()
sum := 0
for _, j := range jobs {
sum += j.out
}
fmt.Println("sum of squares 0..99:", sum)
}
Output:
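```
sum of squares 0..99: 328350
```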
Each task mutates a private field on a private struct. No locks needed because each worker has exclusive ownership of its *job. After wg.Wait() the main goroutine has the data race-free view (the WaitGroup provides the happens-before).
Example 6 — Returning errors from tasks¶
A task is func() — no error return. The convention is to capture errors into a slice (per-index), or into a channel, or into an errgroup-like wrapper (shown in professional.md).
package main
import (
"errors"
"fmt"
"sync"
"github.com/panjf2000/ants/v2"
)
func main() {
pool, _ := ants.NewPool(4)
defer pool.Release()
type result struct {
i int
err error
}
results := make([]result, 10)
var wg sync.WaitGroup
for i := 0; i < 10; i++ {
i := i
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
if i%3 == 0 {
results[i] = result{i, errors.New("divisible by 3")}
return
}
results[i] = result{i, nil}
})
}
wg.Wait()
for _, r := range results {
fmt.Printf("i=%d err=%v\n", r.i, r.err)
}
}
Each task writes to a unique slot. No locking required, no race. This is the simplest pattern that produces per-task errors.
Example 7 — Submitting the same closure many times¶
A subtle gotcha. This:
…prints 10 ten times in Go ≤ 1.21 because all closures share the same i variable. In Go 1.22+ it prints 0..9 in some order. Either way, be explicit:
This is exactly the captured-loop-variable bug from 01-goroutines/junior.md. It does not go away with pools — the closure is still the closure.
The Lifecycle of a Pool¶
The mental movie:
+-----------+
NewPool(n) -----> | created |
+-----------+
|
| first Submit
v
+-----------+ Submit (room available)
| active |<-------------------------------+
+-----------+ |
| |
| Submit (room full, blocking mode) |
v |
+-----------+ worker frees up |
| caller |-------------------------------->
| blocked |
+-----------+
|
| Release()
v
+-----------+
| closed | Submit -> ErrPoolClosed
+-----------+ Release is no-op (idempotent in v2)
Three states: created, active, closed. There is no "draining" state — Release does not wait for in-flight tasks. If you need that, use a WaitGroup.
What Release actually does¶
In v2, Release performs roughly the following:
- Set an internal closed flag with atomic.CompareAndSwap. If the flag was already set, return immediately (idempotency).
- Iterate the idle worker stack. For each worker, send a nil task. The worker's loop interprets nil as "stop."
- Wake any callers blocked in Submit. They observe the closed flag and return ErrPoolClosed.
- Signal the janitor goroutine to stop and <- its exit channel.
Note what it does not do:
- It does not wait for running tasks to finish. A worker that is mid-task will complete its task, then check the closed flag on its next loop iteration, then exit. Your task is not interrupted.
- It does not cancel a context.Context passed to the task — there is no such context.
- It does not nil out the pool variable. You should not call Submit after Release, but the pool object stays valid until the GC collects it.
ReleaseTimeout (v2.7+)¶
Some versions of ants expose ReleaseTimeout(d time.Duration) error. This is Release plus a wait for in-flight tasks, with a timeout. Useful for graceful shutdown. Covered in middle.md and professional.md.
Submit, Capacity, and Backpressure¶
What capacity actually means¶
NewPool(N) says "no more than N workers will ever exist at the same time." It does not say:
- "I will pre-allocate N workers." Workers are spawned on demand, up to N.
- "I will queue at most N pending tasks." There is no internal task queue. Pending tasks are the callers blocked in Submit.
- "I will use N OS threads." OS threads come from Go's GOMAXPROCS, which is unrelated to pool capacity.
What happens at the boundary¶
Suppose Cap=5. Five tasks come in, all Submit return immediately, five workers start. A sixth task arrives:
- Default (blocking). Submit(sixth) blocks. The caller's goroutine sleeps inside Submit until one of the five workers finishes its current task. Then Submit returns nil.
- WithNonblocking(true). Submit(sixth) returns ErrPoolOverload. The caller decides what to do — drop, retry, log.
- WithMaxBlockingTasks(M). Up to M callers may block. Caller M+1 gets ErrPoolOverload. Default M is 0, which means unlimited blocking callers in blocking mode.
This is the entire surface of pool backpressure. There are no priority queues, no work-stealing across pools, no dropping policies. If you need something more elaborate, you build it on top.
What is Free?¶
Free() == Cap() - Running(). It is the number of additional tasks the pool could pick up right now without anybody blocking. It is not "how much room in the queue" — there is no queue.
When does Submit return?¶
Submit returns when:
- A worker has accepted the task and the task's func() has been sent to the worker's input channel. The task may or may not have started running yet — the worker might still be finishing its previous task, with this one queued in its (one-deep) channel.
- The pool is closed (returns ErrPoolClosed).
- The pool is full and the caller is configured to be rejected (returns ErrPoolOverload).
In particular, Submit does not wait for the task to finish. That is the caller's job.
Waiting for Tasks¶
The canonical pattern is sync.WaitGroup:
var wg sync.WaitGroup
for _, task := range tasks {
wg.Add(1)
task := task
_ = pool.Submit(func() {
defer wg.Done()
runTask(task)
})
}
wg.Wait()
Things to remember:
- wg.Add must happen before Submit, not inside the closure. If Add is inside the closure, wg.Wait may run before any Add happens and return immediately.
- wg.Done should be defer-red. Even if the task panics, the panic handler will recover, but Done would not run without defer, and you would deadlock.
- Do not call wg.Wait from inside a task. That would deadlock if the pool is full of tasks all waiting for each other.
Alternative: channel of results¶
If you also need to collect results, replace the WaitGroup with a channel:
results := make(chan int, len(tasks))
for _, t := range tasks {
t := t
_ = pool.Submit(func() {
results <- compute(t)
})
}
for i := 0; i < len(tasks); i++ {
r := <-results
_ = r
}
close(results)
You don't need a WaitGroup here because counting the receives is equivalent to counting the sends.
Alternative: errgroup¶
golang.org/x/sync/errgroup does both — it counts tasks and propagates the first error. Combining it with ants is shown in professional.md.
Clean Code¶
Always defer Release¶
If you need a pool that outlives a function (e.g. attached to a Server), Release belongs in the Server.Close() method, not in New. Either way, somebody defers it.
Always check NewPool's error¶
NewPool only fails when you pass an invalid size. If your size is a compile-time positive constant, the error is impossible — but checking it costs nothing and protects you when the size becomes a variable.
pool, err := ants.NewPool(cfg.PoolSize)
if err != nil {
return fmt.Errorf("invalid pool size %d: %w", cfg.PoolSize, err)
}
Always handle Submit's error¶
The two errors Submit can return are real:
- ErrPoolOverload — only in non-blocking mode, but if you forget to handle it, your task silently never runs.
- ErrPoolClosed — happens during shutdown. Forgetting to handle it means tasks silently disappear during shutdown.
Pattern:
if err := pool.Submit(task); err != nil {
// Decide: log, retry, fall back to inline execution
log.Printf("submit failed: %v", err)
}
Name your pools¶
If you have more than one pool in a program, give them context. A struct field or a package-level variable with a clear name is better than globalPool.
That way error messages and logs are self-explaining.
Don't expose the pool¶
The pool is an implementation detail. Wrap it.
type Worker struct {
pool *ants.Pool
}
func (w *Worker) Submit(task func()) error {
return w.pool.Submit(task)
}
This lets you swap ants for a different implementation, or add observability, without rewriting callers.
Error Handling¶
ants errors are simple sentinel values, exported from the package:
- ants.ErrInvalidPoolSize — returned by NewPool(0) or a negative size other than -1.
- ants.ErrInvalidPoolExpiry — returned by NewPool(N, WithExpiryDuration(0)).
- ants.ErrPoolOverload — returned by Submit/Invoke when the pool is full and non-blocking, or when MaxBlockingTasks is hit.
- ants.ErrPoolClosed — returned by Submit/Invoke after Release.
Compare with errors.Is for safety in newer versions:
if err := pool.Submit(task); err != nil {
if errors.Is(err, ants.ErrPoolClosed) {
return // shutting down
}
if errors.Is(err, ants.ErrPoolOverload) {
// fall back: run inline, drop, retry
}
}
In v2 they are all error-typed package-level vars, so == comparison also works.
Panics inside tasks¶
By default, a panic inside a task is caught by the worker, logged with log.Printf to stderr, and the worker continues running. This is good news for stability but bad news for visibility — your panic is buried in stderr. The WithPanicHandler option lets you install a callback. Covered in middle.md.
For now, the only thing to remember: a panic in a task does not crash your program. That is different from go f(), which crashes on panic.
Performance Tips¶
These are junior-level tips. Senior performance is covered later.
Tip 1 — Size the pool for the bottleneck¶
If your task is "make an HTTP call to a backend that can handle 100 concurrent requests," size your pool at ~100. Beyond that, you'll just see more queueing time and no extra throughput. Sometimes 1.2× is healthy to absorb jitter.
Tip 2 — Don't size by CPU count for I/O work¶
CPU count is the right answer for CPU-bound tasks. For I/O-bound tasks, it's irrelevant. You want enough goroutines to keep the I/O channel saturated. That can mean hundreds or thousands.
Tip 3 — One pool per workload, not one giant pool¶
If you have two workloads — say "image resize" (CPU-bound) and "S3 upload" (I/O-bound) — they have very different correct sizes. Two pools is fine. Sharing one pool means the slow workload starves the fast one.
Tip 4 — Don't measure throughput with time.Sleep¶
It's tempting to write:
…and call that "1000 tasks/sec." It isn't a benchmark. time.Sleep doesn't measure pool overhead — it measures the OS timer. Use time.Now/time.Since around real work to learn anything useful.
Tip 5 — Pre-warm if you know you'll burst¶
If you expect a sudden burst, you can pre-fill the pool by submitting trivial tasks until Running() reaches Cap(). The workers stay alive (within the expiry window) and the burst doesn't pay the worker-creation cost. This is rarely needed but worth knowing.
Best Practices¶
A short, opinionated list:
- Defer Release next to NewPool. Not in some other file. Not in some other function. Right there.
- Always check Submit's error, even if you think you're in blocking mode.
- Don't share a pool across unrelated workloads. One pool per logical concern.
- Don't put your pool in package-level globals unless it really is global. Tests get harder.
- Don't block inside a task on a channel that nobody else will write to. You'll deadlock the worker.
- Don't call Submit from inside a task if the pool might be full — you can deadlock.
- Don't expect order. Tasks run in whatever order workers happen to pick them up.
- Log panics. Install a WithPanicHandler. The default stderr log is not enough in production.
- Document your pool's size in comments. "Why 200?" should have an answer in the code.
- Re-tune for production. The size that's perfect on your laptop is almost never right in production.
Edge Cases & Pitfalls¶
Pitfall 1 — Capacity of 0¶
ants.NewPool(0) returns nil, ErrInvalidPoolSize. Always check the error.
Pitfall 2 — Capacity of -1 = unlimited¶
ants.NewPool(-1) is a valid call and creates a pool with no upper bound. It will spawn as many workers as you submit tasks. This is rarely what you want — it's effectively go f() with a slight overhead. Use a real positive size.
Pitfall 3 — Tasks submitted after Release silently disappear, only if you ignore the error¶
If Release runs first, Submit returns ErrPoolClosed. If you ignore the error, your task vanishes. Always check.
Pitfall 4 — A task panic does not propagate¶
The pool catches the panic. The caller of Submit does not see it. The WaitGroup's Done will not run unless you defer it. If you forgot the defer, your wg.Wait will hang forever. Always defer wg.Done().
Pitfall 5 — Capacity of 1 makes the pool a serializer¶
NewPool(1) is a valid construct — every task runs strictly after the previous one. It's a glorified sync.Mutex. Don't write it by accident; if you really want a serializer, use a sync.Mutex or a single-buffer channel.
Pitfall 6 — Holding a reference to the pool keeps it alive¶
Even after Release, if some other goroutine still has a reference and tries Submit, you get ErrPoolClosed — but the pool struct itself stays in memory until everyone drops the reference. Don't leak references in long-running structs.
Pitfall 7 — Submit's blocking is unbounded by default¶
If your producers run forever and your consumer is slow, you can have arbitrarily many goroutines blocked in Submit. Each one is a goroutine. If you really need to cap "callers blocked in submit," use WithMaxBlockingTasks(N).
Pitfall 8 — The pool is not safe to copy¶
*ants.Pool is a pointer; pass the pointer around. Copying the struct by value (p := *pool) breaks the internal mutex and slices.
Pitfall 9 — runtime.GOMAXPROCS still bounds you¶
Even with a pool of 10000, if GOMAXPROCS=2, only 2 CPU-bound goroutines run at once. The other 9998 will be scheduled in. For CPU-bound work, sizing past GOMAXPROCS * k (with k around 1–2) is wasted memory.
Pitfall 10 — Pool capacity != throughput¶
A pool of 100 doesn't mean "100 tasks/sec." It means "up to 100 concurrent in-flight tasks." Throughput is capacity / avg_duration. If you don't know avg_duration, you don't know throughput. Measure.
Common Mistakes¶
These are the most common errors in junior ants code reviews.
Mistake 1 — No defer Release¶
Fix: defer pool.Release() directly after NewPool.
Mistake 2 — Captured loop variable¶
Fix: i := i inside the loop, or upgrade to Go 1.22+.
Mistake 3 — wg.Done not deferred¶
If doWork panics, Done does not run, and Wait hangs forever. Use defer wg.Done().
Mistake 4 — wg.Add inside the goroutine¶
for ... {
_ = pool.Submit(func() {
wg.Add(1)
defer wg.Done()
...
})
}
wg.Wait() // may return immediately
Add runs only when a worker executes the closure — possibly after the loop has finished and Wait has already observed a zero counter and returned. Always wg.Add(1) before Submit.
Mistake 5 — Ignoring Submit's error¶
If the pool is closed during shutdown, your task is silently dropped. Always check.
Mistake 6 — Calling Release inside the task¶
Release signals all workers and waits. The worker calling Release is one of the workers waiting to be signalled. Deadlock.
Mistake 7 — Using time.Sleep for synchronisation¶
Use a channel or a WaitGroup. Time-based synchronisation is wrong even when it appears to work.
Mistake 8 — Submitting non-closures¶
_ = pool.Submit(doWork) // works only if doWork's signature is func()
_ = pool.Submit(doWork(1, 2)) // wrong — this is calling doWork right now
Submit takes func(). Bind your arguments via a closure: _ = pool.Submit(func() { doWork(1, 2) }).
Mistake 9 — Treating Submit as fire-and-forget¶
Submit returning means "accepted." It doesn't mean "started" or "finished." If your main() exits right after, the task may never run.
Mistake 10 — One huge pool for everything¶
A single global pool for HTTP requests, file I/O, and CPU work means each workload can starve the others. Three smaller pools, sized per workload, are almost always better.
Common Misconceptions¶
Misconception 1 — "It's faster than go f() for everything."¶
It is faster for sustained, high-rate, short-duration tasks. For a single one-off task, plain go f() is faster and simpler. Benchmark before you adopt.
Misconception 2 — "It guarantees task order."¶
No. Submitted tasks are picked up by whichever worker is free, in whatever order Go's scheduler decides.
Misconception 3 — "It pre-allocates workers."¶
No. Workers are created on demand. A freshly created NewPool(1000) has zero workers, one janitor goroutine, and no allocated stacks for tasks.
Misconception 4 — "It cancels tasks on Release."¶
No. Release only signals idle workers to exit. In-flight tasks complete normally. If you need cancellation, plumb a context.Context into each task yourself.
Misconception 5 — "Submit always returns immediately."¶
Only in non-blocking mode. In default blocking mode, Submit waits until a worker is free.
Misconception 6 — "A panic kills the pool."¶
It does not. The default goWorker has a recover() and continues. Only the panic's task dies.
Misconception 7 — "The pool gives me a Future."¶
It does not. Submit returns error, not a result. You bring your own result-collection mechanism (channel, slice, WaitGroup).
Misconception 8 — "I should make the pool as big as possible."¶
A pool the size of a billion is identical to no pool at all. The point is the bound. Bigger pool == weaker bound. Choose a bound that protects what you care about (memory, sockets, downstream rate).
Tricky Points¶
Tricky 1 — Capacity changes are immediate but tasks aren't¶
Tune(N) updates the capacity now, but already-running tasks don't stop. If you shrink the pool from 100 to 10 while 100 tasks are running, those 100 tasks run to completion. New submits block until Running drops below 10.
Tricky 2 — Free can be negative briefly¶
Free() is Cap() - Running(). Immediately after a Tune down, Running can exceed the new Cap until in-flight tasks drain, so Free() can briefly be negative. Don't rely on Free() >= 0; when you want "a worker is available right now," check pool.Free() > 0.
Tricky 3 — A goroutine blocked in Submit is real¶
It is a real g struct, with stack, scheduled by the runtime. Many blocked submitters consume memory. If your worry was "I want to bound goroutines," and you blocked-submit a million tasks, you have a million goroutines blocked in Submit plus N workers. That's not what you wanted.
Tricky 4 — Pool inside pool¶
You can Submit a task that itself Submits another task to the same pool. This is fine if the inner submit eventually succeeds without blocking. But if the outer task fills the pool with outer tasks first, every inner Submit blocks behind itself — classic deadlock. Inner submits to the same pool are dangerous. Use a different pool, or Submit to the same pool only from outside any of its tasks.
Tricky 5 — Workers can run on any OS thread¶
A worker is a goroutine, not a thread. The scheduler may move it between threads. If your task uses OS-specific TLS (thread-local storage), you can't rely on it staying on one OS thread. Use runtime.LockOSThread inside the task if needed — covered in senior.md.
Test¶
Q1¶
What does ants.NewPool(0) return?
A. It returns nil, ants.ErrInvalidPoolSize. The error explains that the size must be positive (or exactly -1 for unlimited).
Q2¶
What happens if you Submit to a full pool in default (blocking) mode?
A. The calling goroutine blocks inside Submit until a worker becomes free. Submit does not return until then.
Q3¶
What does defer pool.Release() actually do?
A. It schedules Release to be called when the enclosing function returns. Release signals all idle workers to stop, sets the closed flag, and stops the janitor goroutine. In-flight tasks are not interrupted.
Q4¶
You write for i := 0; i < 10; i++ { _ = pool.Submit(func() { fmt.Println(i) }) } in Go 1.20. What might it print?
A. Up to ten copies of 10. In Go 1.20 the loop variable i is shared by all closures, and by the time any worker runs the closure, i has reached 10. Fix with i := i inside the loop.
Q5¶
What is the difference between Cap, Running, and Free?
A. Cap is the upper limit. Running is how many workers are currently executing tasks. Free is Cap - Running, the count of workers that could pick up a task right now without blocking. Pending blocked submitters do not show up in Running.
Q6¶
Is Submit thread-safe?
A. Yes. You can call Submit from many goroutines concurrently. The pool's internal data is protected by mutex / atomic operations.
Q7¶
What happens if a task panics?
A. The pool's goWorker has a recover() in its loop. The panic is caught, the worker continues. By default the panic is logged with log.Printf. You can install a custom handler with WithPanicHandler.
Q8¶
What is ants.PoolWithFunc?
A. A specialised pool that runs the same function with different arguments. You provide the function once at creation time and submit only the arguments via Invoke(arg). Covered in middle.md.
Q9¶
After pool.Release(), what does pool.Submit(t) return?
A. ants.ErrPoolClosed. Always check the error from Submit.
Q10¶
How do you wait for all submitted tasks to finish?
A. Use a sync.WaitGroup: Add(1) before Submit, defer wg.Done() inside the task closure, wg.Wait() after submitting all.
Q11¶
Why is wg.Add(1) outside the closure, not inside?
A. Submit returns before the closure runs. If wg.Add(1) were inside the closure, wg.Wait() could run before any Add completed and observe a counter of zero, returning immediately.
Q12¶
What is the right pool size for HTTP downloads to a server that allows at most 50 concurrent requests?
A. Around 50. A little under the limit is safer (say 40–45); more does not help and risks exceeding the server's limit.
Q13¶
Does Submit block forever if the pool is closed mid-block?
A. No. Release wakes all blocked submitters, and they return ErrPoolClosed.
Q14¶
What is runtime.NumGoroutine likely to show right after NewPool(100) and before any Submit?
A. A small number — your main, the runtime, plus the pool's janitor goroutine (one). The 100 workers are not pre-allocated.
Q15¶
Is the pool safe to use after defer pool.Release() in main()?
A. Yes, until main returns. The defer fires only on return. While the function is alive, the pool is fully usable.
Tricky Questions¶
TQ1¶
Q. I have ants.NewPool(10). I Submit 100 tasks in a tight loop. How many of my goroutines are alive at the peak?
A. It depends. If each task is instant, Submit rarely blocks; you have up to 10 workers + the submitting goroutine + the janitor ≈ 12. If each task is slow, note that a single tight loop has at most one goroutine blocked in Submit at a time — so the peak is still ≈ 12. You only approach ~100 goroutines if ~100 separate goroutines submit concurrently, because each blocked submitter is then a live goroutine. The exact peak depends on task duration, submit rate, and how many goroutines are submitting.
TQ2¶
Q. I created two pools: poolA of cap 5, poolB of cap 5. Each task in poolA does poolB.Submit. Each task in poolB does poolA.Submit. What happens?
A. Possible deadlock. If poolA fills up with tasks that are all trying to submit to poolB, and poolB fills up with tasks that are all trying to submit to poolA, no worker can finish and no worker can take new work. The general rule: do not have cyclic submit dependencies between pools.
TQ3¶
Q. Why does my pool of capacity 100 use barely any CPU even though I'm submitting 100 CPU-bound tasks per second?
A. Almost certainly GOMAXPROCS is small (e.g. 1 or 2 in a constrained container). Your tasks compete for the OS threads, not for pool slots. Check runtime.GOMAXPROCS(0).
TQ4¶
Q. Why does pool.Free() sometimes show a value greater than the number of tasks I'm sure are queued?
A. Because pending tasks are not queued inside the pool — they are queued as blocked Submit callers. Free measures only the workers, not the queue.
TQ5¶
Q. I expected Submit to be lock-free. Why does runtime/pprof show contention on it?
A. In v2 there is a lock-free fast path (when a worker is available on the idle stack), but the slow path (pool is full, allocate a new worker, or wait) uses a mutex. Under contention you'll see the mutex in profiles. Covered in senior.md.
TQ6¶
Q. Can I Submit from inside a task to the same pool, never deadlocking?
A. Only if you can guarantee the pool will not be full at the time of the inner submit. In general, do not do this. Use a different pool or a different mechanism.
TQ7¶
Q. Why does my program leak goroutines after I Release the pool?
A. Likely you have callers blocked in Submit that raced Release. Or you forgot that Release does not interrupt running tasks, and one task is itself blocked forever (on a channel nobody will ever send on, on a nil channel, etc.). Inspect goroutine dumps.
TQ8¶
Q. Should I be using ants.NewPool(-1) for max throughput?
A. Almost never. -1 means unlimited workers — at that point you've thrown away the pool's main benefit (bounded concurrency). If you want unlimited, use go f() directly.
TQ9¶
Q. I call Tune(5) on a pool with 50 running tasks. What happens immediately?
A. The cap is updated to 5. Running tasks continue to completion. Once they finish, no new workers are spawned beyond 5. New submits block while Running is at or above 5 and proceed as slots free up. This is intentional — the cap is a future bound, not a retroactive cancellation.
TQ10¶
Q. Is defer pool.Release() enough to prevent leaks?
A. It prevents pool leaks. It does not prevent task leaks — if your tasks block forever on something other than the pool, they will hang. Release does not interrupt them. You also need to ensure your tasks themselves are cancellable, usually with a context.Context.
Cheat Sheet¶
// Create a pool of size N.
pool, err := ants.NewPool(N)
if err != nil { /* size invalid */ }
defer pool.Release()
// Submit work. Blocks if the pool is full (default behaviour).
err = pool.Submit(func() {
// do something
})
// Inspect.
pool.Cap() // upper bound on workers
pool.Running() // active workers
pool.Free() // Cap - Running
pool.IsClosed() // true after Release
// Resize.
pool.Tune(newSize)
// Wait for all submitted tasks (you bring this).
var wg sync.WaitGroup
wg.Add(1)
_ = pool.Submit(func() { defer wg.Done(); doIt() })
wg.Wait()
// Errors.
errors.Is(err, ants.ErrPoolClosed)
errors.Is(err, ants.ErrPoolOverload)
errors.Is(err, ants.ErrInvalidPoolSize)
Self-Assessment Checklist¶
You are done with this file when you can:
- Explain why a goroutine pool exists, in two sentences, without reading.
- Write a program that creates a pool of size 10, submits 100 tasks, waits for all of them, and prints Done.
- Explain what happens when you Submit to a full pool (both blocking and non-blocking modes).
- Identify the captured-loop-variable bug in a 5-line snippet.
- Decide whether a given workload should use a pool or plain goroutines, and defend the choice.
- Read the source of Pool.Submit and goWorker.run without panicking. (Skim only — internals are in senior.md.)
- Distinguish between Cap, Running, and Free.
- State the difference between Release and "wait for in-flight tasks."
- List five common mistakes from this file from memory.
Summary¶
ants is a goroutine pool — a fixed-size set of long-lived worker goroutines that you submit work to. The API surface for junior usage is tiny:
- ants.NewPool(N) to create.
- pool.Submit(func()) to enqueue.
- defer pool.Release() to clean up.
- pool.Running(), pool.Free(), pool.Cap() to introspect.
Use a pool when you have many short tasks and you need to cap concurrency. Don't use a pool when you have a handful of long tasks or when the bound doesn't matter — plain goroutines are simpler and sometimes faster. Always defer Release. Always check Submit's error. Always shadow the loop variable.
The internals — worker stack, lock-free path, expiry janitor, MultiPool sharding — are covered in senior.md. The richer API — PoolWithFunc, functional options, panic handler — is in middle.md. Production patterns are in professional.md.
What You Can Build¶
After this file, you should be able to build:
- A bounded URL fetcher: submit one task per URL, pool of size 50, store results in a slice.
- A directory scanner: walk a tree, submit one task per file (e.g. compute MD5), aggregate results.
- A bulk inserter: read CSV rows, submit one task per row with rate-limited DB inserts.
- A simple worker farm with a stats endpoint exposing Running/Free/Cap over HTTP.
Production-grade versions (with cancellation, error propagation, and observability) are in professional.md.
Further Reading¶
- The README and Go-Doc of github.com/panjf2000/ants.
- Andy Pan's original blog post introducing the design (linked from the GitHub README).
- The ants_test.go file in the repo — many small examples, including benchmarks.
- The Go blog post "Pipelines and cancellation" — for the broader picture of bounded concurrency in Go.
- golang.org/x/sync/semaphore — a lighter alternative when all you need is "cap N concurrent operations."
Related Topics¶
- 01-goroutines — the go keyword, the foundation. Reading that section's junior.md is a prerequisite.
- 02-sync-primitives — sync.WaitGroup, sync.Pool, the building blocks ants itself uses.
- 15-concurrency-anti-patterns — including "unbounded goroutines," the failure mode ants directly addresses.
- 16-goroutine-pools-stdlib — building a small pool from chan func() and a few WaitGroups. Read it; you'll appreciate ants more.
- 18-errgroup — errgroup.Group provides "wait for all, propagate first error." It composes well with ants.
- 19-semaphore — golang.org/x/sync/semaphore provides "N concurrent operations" without a pool of workers.
Diagrams¶
Diagram 1 — Submit on a non-full pool¶
caller pool worker
| | |
| Submit(f) --------->| |
| | pop idle worker ------> | (woken)
| | |
| <----- nil err -----| | run f()
| | |
| ... | | f() done
| | <- push self idle -------|
| | |
Diagram 2 — Submit on a full pool (blocking)¶
caller pool worker_5
| | | running task X
| Submit(g) --------->| |
| | no idle worker |
| <----------- | (waiting for one) |
| (blocked) | |
| | | task X done
| | <- push self idle -------|
| | pop worker -----------> |
| <---- nil err ------| | run g()
Diagram 3 — Lifecycle¶
+--------+ Submit (room) +--------+
NewPool --> | active |------------------>| active |
+--------+ Submit (full) +--------+
| (blocked) |
| Release() | Release()
v v
+--------+ +--------+
| closed | | closed |
+--------+ +--------+
Submit Submit
\ /
+--> ErrPoolClosed <-----+
Diagram 4 — Worker stack¶
top of stack -> [worker_3]
[worker_7]
[worker_2]
[worker_5]
bot of stack -> [worker_1]
Submit pops top; finishing worker pushes onto top.
LIFO chosen so that the most recently active worker is reused first —
its cache lines are still hot.
Diagram 5 — Pool sized vs unsized¶
go f() naive : goroutines = O(submitted tasks)
ants.NewPool(N) : goroutines <= N + blocked_submitters + janitor
ants.NewPool(-1) : goroutines = O(submitted tasks) (same as naive)
Pick N to be the smallest number that gives you acceptable throughput. Anything bigger is wasted bound.
Coding Patterns¶
This section gathers the patterns that come up most often when you start building real things with ants. Each one is intentionally small. Combine them as needed.
Pattern 1 — Bounded worker farm with results¶
The most common shape. A producer enqueues, a fixed-size pool processes, results land in a channel that the main goroutine drains.
package main
import (
"fmt"
"sync"
"github.com/panjf2000/ants/v2"
)
type input struct{ id, value int }
type output struct{ id, square int }
func main() {
pool, _ := ants.NewPool(8)
defer pool.Release()
inputs := make([]input, 1000)
for i := range inputs {
inputs[i] = input{id: i, value: i}
}
out := make(chan output, len(inputs))
var wg sync.WaitGroup
for _, in := range inputs {
in := in
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
out <- output{id: in.id, square: in.value * in.value}
})
}
go func() {
wg.Wait()
close(out)
}()
sum := 0
for o := range out {
sum += o.square
}
fmt.Println("sum:", sum)
}
Three rules:
- Buffered out channel sized to the number of inputs — guarantees no worker blocks on send.
- close(out) happens in a separate goroutine that waits for the WaitGroup. Doing it inline would deadlock because the main goroutine is the drainer.
- Inputs are by value so the closure captures a copy. Pointers would also work — but then the producer must not mutate them after Submit.
Pattern 2 — Per-task error bag¶
You want one error per task, not one global error.
type task struct {
id int
err error
}
tasks := make([]task, len(work))
var wg sync.WaitGroup
for i, w := range work {
tasks[i].id = i
i, w := i, w
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
tasks[i].err = doWork(w)
})
}
wg.Wait()
for _, t := range tasks {
if t.err != nil {
log.Printf("task %d failed: %v", t.id, t.err)
}
}
Each task writes to a unique slot. No locks needed; the WaitGroup provides happens-before for the final read.
Pattern 3 — First-error short-circuit¶
You want to stop as soon as one task errors. This is what errgroup exists for; here is the manual version.
var (
firstErr error
once sync.Once
cancel = make(chan struct{})
)
setErr := func(err error) {
once.Do(func() {
firstErr = err
close(cancel)
})
}
var wg sync.WaitGroup
for _, w := range work {
w := w
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
select {
case <-cancel:
return
default:
}
if err := doWork(w); err != nil {
setErr(err)
}
})
}
wg.Wait()
if firstErr != nil {
log.Fatal(firstErr)
}
sync.Once ensures only the first error is captured. The cancel channel lets later tasks short-circuit. This pattern is fine for ~10 tasks but ugly at scale — errgroup (see professional.md) is the right answer in production.
Pattern 4 — Fan-out from a stream¶
Your inputs come from a channel, not a slice. The pool is a sink.
package main
import (
"sync"
"github.com/panjf2000/ants/v2"
)
func consume(in <-chan int, pool *ants.Pool) {
var wg sync.WaitGroup
for x := range in {
x := x
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
process(x)
})
}
wg.Wait()
}
func process(x int) { _ = x }
This is the worker farm in stream form. Note how the producer (whatever is feeding in) naturally back-pressures on the consumer's pool: if the pool is full, Submit blocks, the for x := range in loop pauses, the upstream channel fills up, the upstream sender blocks. The whole pipeline self-throttles.
Pattern 5 — Sequential per-key concurrency¶
Sometimes you want concurrent across keys but serial within a key. Trivial with a sync.Map of pools:
type Sharder struct {
pools sync.Map // map[string]*ants.Pool
}
func (s *Sharder) Submit(key string, task func()) error {
p, ok := s.pools.Load(key)
if !ok {
// Calling LoadOrStore unconditionally would create (and leak) a fresh pool
// on every call for existing keys; a racing loser's pool still leaks here.
p, _ = s.pools.LoadOrStore(key, mustPool(1)) // cap 1 ⇒ serial within a key
}
return p.(*ants.Pool).Submit(task)
}
func mustPool(cap int) *ants.Pool {
p, err := ants.NewPool(cap)
if err != nil {
panic(err) // "must" helper: an invalid size is a programming error
}
return p
}
Each key gets a pool of capacity 1, so within-key order is preserved. Across keys, you have full concurrency. The downside: many tiny pools, and you need a janitor to release pools for stale keys. This is a sketch — for real production use, MultiPool (see senior.md) is the right primitive.
Pattern 6 — Pre-warming¶
If you know a burst is coming, pre-spawn workers:
pool, _ := ants.NewPool(200)
var wg sync.WaitGroup
gate := make(chan struct{})
for i := 0; i < 200; i++ {
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
<-gate // hold each worker alive until all 200 have been spawned
})
}
close(gate) // release them; the workers park as idle
wg.Wait()
// Now Running()==0 but the pool holds 200 idle, already-spawned workers
// (until the expiry janitor reaps them). Instant no-op tasks would not
// work here: a worker that finishes before the next Submit gets reused,
// so only a handful of workers would ever be created.
// Burst:
for i := 0; i < 200; i++ {
_ = pool.Submit(realWork)
}
After the pre-warm completes, the pool's workers have already been spawned and sit idle (until the expiry janitor reaps them). The burst pays only the channel-send hand-off cost, not the worker-creation cost.
Rarely necessary, but useful at startup of a service that anticipates a known spike (e.g., a job scheduler fires at midnight).
Pattern 7 — Timeout on Submit¶
Wrap Submit in a select with time.After:
func submitWithTimeout(p *ants.Pool, task func(), d time.Duration) error {
done := make(chan error, 1)
go func() { done <- p.Submit(task) }()
select {
case err := <-done:
return err
case <-time.After(d):
return errors.New("submit timeout")
}
}
This is genuinely useful when you want a guarantee like "if my pool is full for more than 100 ms, return an error rather than block." Two caveats: the wrapper spawns a goroutine per call, and on timeout that goroutine stays blocked in Submit — the task may still be accepted and run later. The timeout abandons the submit; it does not cancel it (the buffered done channel at least lets the goroutine exit once Submit returns). The cleaner alternative is WithNonblocking(true) + WithMaxBlockingTasks(N) from middle.md, plus a retry loop.
Pattern 8 — Submit a typed argument cheaply¶
The Submit(func()) signature forces you to wrap your argument in a closure. Each Submit is an allocation. If you have a tight loop, this matters. Two options:
- Use PoolWithFunc (next file). The argument is interface{} and there is no per-call closure allocation.
- Use a sync.Pool of argument structs to recycle the argument allocations:
type ctx struct {
x int
}
var ctxPool = sync.Pool{New: func() any { return new(ctx) }}
for i := 0; i < 1_000_000; i++ {
c := ctxPool.Get().(*ctx)
c.x = i
_ = pool.Submit(func() {
defer ctxPool.Put(c)
work(c.x)
})
}
Both approaches cut allocation cost. PoolWithFunc is simpler and avoids the per-call closure entirely; the sync.Pool approach recycles the argument structs on a regular Pool (each Submit still allocates its closure).
Product Use / Feature¶
A few sketches of where you might use ants in real product code at junior level.
Use case A — URL prefetcher¶
A web crawler wants to fetch a list of URLs as fast as possible without overwhelming the target server.
type Crawler struct {
pool *ants.Pool
client *http.Client
}
func NewCrawler(concurrency int) *Crawler {
p, _ := ants.NewPool(concurrency)
return &Crawler{pool: p, client: &http.Client{Timeout: 10 * time.Second}}
}
func (c *Crawler) Close() { c.pool.Release() }
func (c *Crawler) FetchAll(urls []string) []*http.Response {
out := make([]*http.Response, len(urls))
var wg sync.WaitGroup
for i, u := range urls {
i, u := i, u
wg.Add(1)
_ = c.pool.Submit(func() {
defer wg.Done()
resp, err := c.client.Get(u)
if err != nil {
return
}
out[i] = resp
})
}
wg.Wait()
return out
}
A production version would check errors, close response bodies, rate limit, and respect robots.txt, but the shape is the same.
Use case B — Bulk file hasher¶
Given a directory tree, compute MD5 of every file. Naive code spawns one goroutine per file, which OOMs at scale.
type Hasher struct {
pool *ants.Pool
}
func (h *Hasher) HashAll(paths []string) map[string]string {
hashes := make(map[string]string)
var mu sync.Mutex
var wg sync.WaitGroup
for _, p := range paths {
p := p
wg.Add(1)
_ = h.pool.Submit(func() {
defer wg.Done()
sum := md5sum(p)
mu.Lock()
hashes[p] = sum
mu.Unlock()
})
}
wg.Wait()
return hashes
}
A pool of 32 workers is typically enough to saturate an SSD. Plain go f() would open thousands of files at once and quickly OOM or hit the FD limit.
Use case C — Background notification dispatcher¶
An HTTP server wants to send notifications asynchronously without blocking the request thread.
type Dispatcher struct {
pool *ants.Pool
}
func (d *Dispatcher) Notify(user string, msg string) {
_ = d.pool.Submit(func() {
_ = sendPushNotification(user, msg)
})
}
The request handler returns instantly; the actual send happens later on a pool worker. If the pool is full, Submit blocks (back-pressure on the API), or — better — use non-blocking mode and either drop or enqueue to a real queue (Redis, Kafka).
Use case D — Connection-bounded scraping¶
A scraper that uses a third-party API with a hard 100-concurrent-request limit.
pool, _ := ants.NewPool(100)
defer pool.Release()
for _, q := range queries {
q := q
_ = pool.Submit(func() { scrape(q) })
}
The pool is the concurrency limiter: as long as nobody else hits the API from this process, you cannot exceed 100 concurrent requests. (It bounds in-flight requests, not requests per second.)
Security Considerations¶
A junior section is the wrong place to go deep on security, but a few notes:
S1 — A panicking task does not crash your program¶
This is mostly good. But it also means you cannot use a deliberate panic inside a pooled task to abort the program — the pool's recover swallows it. If you want the program to crash on certain conditions, call os.Exit or log and return instead of panicking.
S2 — Untrusted task functions are a bad idea¶
If you allow callers to submit arbitrary func() values to your pool (think: scripting, plugin systems), they can mutate any package-level state, leak goroutines, or block forever. ants does not sandbox tasks. Either trust your callers or wrap them in a watchdog.
S3 — Resource exhaustion via Submit¶
If your pool is in blocking mode and your tasks come from an untrusted source (e.g., an HTTP handler), an attacker can fill the pool with slow tasks to deny service to other callers. Mitigations:
- Use WithNonblocking(true) and reject overload.
- Set per-task timeouts so no task can hold a worker forever.
- Run a separate pool per tenant / per source.
S4 — Logging panic content¶
The default panic handler logs the panic value via log.Printf. If your tasks can panic with attacker-controlled data, that data ends up in logs. This is the standard logging hygiene issue, not unique to ants — sanitise.
Tricky Points (extended)¶
Tricky 6 — Running immediately after Submit may be 0¶
Submit returns after handing the task off, but the worker has not necessarily entered its function body yet. pool.Running() includes the worker only after it has begun executing. There is a tiny window in which Submit returned but Running did not yet increment.
In practice this is irrelevant for any reasonable program — by the time you read Running for monitoring, dozens of these races have averaged out — but it is worth knowing if you write a test like "after one Submit, Running must be 1." It might not be, instantaneously.
Tricky 7 — Tune does not preempt¶
Tune(N) changes the cap. It does not interrupt running workers. If Running > N, you have to wait for tasks to finish before the cap is effective. There is no Tune(N, killExcess=true) flag.
Tricky 8 — A nil Submit argument crashes the worker¶
pool.Submit(nil) passes nil as the task. The worker tries to call nil() and panics. The pool's recover catches it, so your program survives, but the task is silently dropped. There is no input validation in Submit. Don't pass nil.
Tricky 9 — Submission rate is bound by GOMAXPROCS, not capacity¶
Your submission loop is itself a goroutine. If GOMAXPROCS=1 and your workers are all running tight CPU-bound loops, your submitter never gets scheduled. Symptom: throughput is much lower than capacity / avg_task_duration would predict. Fix: increase GOMAXPROCS, or break long CPU tasks with runtime.Gosched.
Tricky 10 — The "magic" -1 size and MultiPool¶
NewPool(-1) exists for backwards compatibility and for code that wants an unbounded pool while still calling Submit (rather than go). It is semantically equivalent to go f(). The MultiPool (covered in senior.md) does not accept -1 — its per-shard size must be positive.
Test (extended)¶
Q16¶
What does defer pool.Release() do if Release has already been called once?
A. In v2 it is a no-op — Release is idempotent. The second call returns nil and does nothing. In v1 the behaviour was less defensive.
Q17¶
What is the difference between Submit and Invoke?
A. Submit(func()) is on Pool. Invoke(arg interface{}) is on PoolWithFunc — the function is set at pool creation, only the argument varies. Invoke avoids the per-call closure allocation that Submit requires.
Q18¶
You have a pool of cap 100 and you call Tune(10). There are 80 running workers. What does Cap return immediately after, and what does Running return?
A. Cap returns 10 (the new cap). Running returns 80 (the actual count). Over time, as the 80 tasks finish, Running will drop to 10 (or wherever new submits land).
Q19¶
What happens if you Submit to a closed pool?
A. Submit returns ants.ErrPoolClosed. The task is not enqueued.
Q20¶
Does ants use any global state?
A. Yes — ants.Submit (note: package-level function) and ants.Default() use a default global pool. For most apps you should create your own *Pool rather than using the global one. The default pool has math.MaxInt32 capacity, which is effectively unlimited.
Q21¶
How many goroutines does an unused fresh pool consume?
A. One — the janitor (also called the "purge" goroutine). Workers are not pre-allocated.
Q22¶
If I want my submitter to never block, what should I do?
A. ants.NewPool(N, ants.WithNonblocking(true)). Then Submit either succeeds immediately or returns ErrPoolOverload. Always check the error.
Q23¶
What is WithMaxBlockingTasks for?
A. In default (blocking) mode, it caps how many goroutines can be simultaneously blocked inside Submit. Caller N+1 gets ErrPoolOverload. Default is 0 (unlimited blocked callers).
Q24¶
Why might Free() return 0 even though I just released a worker?
A. Because between "task finished" and "worker returned to idle stack" there is a microscopic window. Or because another Submit already grabbed it. Free is best-effort, not transactionally precise.
Q25¶
Does the pool serialise tasks if I set cap=1?
A. Yes. With cap=1 there is exactly one worker, so tasks run strictly one at a time, in submission order (within one submitter). It is a glorified mutex.
More Code Examples¶
Example 8 — Latency profile of Submit¶
How long does Submit take? Tiny program to find out:
package main
import (
"fmt"
"sort"
"time"
"github.com/panjf2000/ants/v2"
)
func main() {
pool, _ := ants.NewPool(100)
defer pool.Release()
const N = 10000
durs := make([]time.Duration, N)
for i := 0; i < N; i++ {
start := time.Now()
_ = pool.Submit(func() {})
durs[i] = time.Since(start)
}
sort.Slice(durs, func(i, j int) bool { return durs[i] < durs[j] })
fmt.Printf("median: %v p99: %v max: %v\n",
durs[N/2], durs[N*99/100], durs[N-1])
}
On a modern laptop you'll see medians under 1 µs with occasional p99 spikes when the worker stack is empty and a new goroutine has to be spawned. That overhead is what you're paying. For non-trivial tasks (anything taking > 100 µs), it's negligible.
Example 9 — Hammered submit¶
What happens when 1000 producers hammer one pool with capacity 100?
package main
import (
"fmt"
"sync"
"sync/atomic"
"time"
"github.com/panjf2000/ants/v2"
)
func main() {
pool, _ := ants.NewPool(100)
defer pool.Release()
var done int64
var producers sync.WaitGroup
start := time.Now()
for p := 0; p < 1000; p++ {
producers.Add(1)
go func() {
defer producers.Done()
for i := 0; i < 100; i++ {
_ = pool.Submit(func() {
time.Sleep(1 * time.Millisecond)
atomic.AddInt64(&done, 1)
})
}
}()
}
producers.Wait()
for atomic.LoadInt64(&done) < 100000 {
time.Sleep(10 * time.Millisecond)
}
fmt.Println("elapsed:", time.Since(start))
}
1000 producers × 100 submits = 100k tasks, each 1 ms, on a pool of 100. Lower bound is 100000/100 * 1ms = 1s. Real time will be ~1.1 s plus overhead. Notice that you have 1000 blocked submitters at any moment, which is 1000 goroutines on top of your 100 workers. That's fine — they're cheap — but if you started 10 million producers you'd want a queue instead.
Example 10 — Mixed workloads in one program¶
package main
import (
"fmt"
"sync"
"github.com/panjf2000/ants/v2"
)
func main() {
cpuPool, _ := ants.NewPool(8) // = GOMAXPROCS
ioPool, _ := ants.NewPool(200)
defer cpuPool.Release()
defer ioPool.Release()
var wg sync.WaitGroup
for i := 0; i < 1000; i++ {
wg.Add(2)
_ = cpuPool.Submit(func() { defer wg.Done(); cpuWork() })
_ = ioPool.Submit(func() { defer wg.Done(); ioWork() })
}
wg.Wait()
fmt.Println("done")
}
func cpuWork() { _ = 1 << 20 }
func ioWork() { _ = 1 << 20 }
Two pools sized for their work. Same WaitGroup. Same goroutine-safe pattern. This is the most realistic shape of a real service.
Example 11 — Submit with retry¶
If you use non-blocking mode and get overloaded, retry:
package main
import (
"time"
"github.com/panjf2000/ants/v2"
)
func submitOrRetry(p *ants.Pool, task func()) error {
for {
err := p.Submit(task)
if err == nil {
return nil
}
if err == ants.ErrPoolClosed {
return err
}
// err == ErrPoolOverload
time.Sleep(time.Millisecond)
}
}
In real code use exponential backoff, a max retry count, and a context for cancellation. The sketch shows the idea.
Example 12 — Submit + select on context¶
A common need: "submit this task, but if my context is cancelled before it runs, give up."
package main
import (
"context"
"errors"
"github.com/panjf2000/ants/v2"
)
func submitCtx(ctx context.Context, p *ants.Pool, task func()) error {
done := make(chan error, 1)
go func() {
done <- p.Submit(func() {
select {
case <-ctx.Done():
return
default:
}
task()
})
}()
select {
case err := <-done:
return err
case <-ctx.Done():
return errors.New("submit cancelled")
}
}
Two layers of select-on-context: one outside Submit so a long-block on Submit is cancellable, one inside the task so a task that hasn't started yet can short-circuit. Real code would push the spawn-a-goroutine-per-submit cost down, but the pattern is clear.
Example 13 — Counting in-flight tasks externally¶
Sometimes you want to know how many submitted-but-not-finished tasks there are (which is not Running — that's the worker count, not the task count).
package main
import (
"fmt"
"sync/atomic"
"time"
"github.com/panjf2000/ants/v2"
)
func main() {
pool, _ := ants.NewPool(50)
defer pool.Release()
var inflight int64
for i := 0; i < 1000; i++ {
atomic.AddInt64(&inflight, 1)
_ = pool.Submit(func() {
defer atomic.AddInt64(&inflight, -1)
time.Sleep(time.Millisecond)
})
}
for atomic.LoadInt64(&inflight) > 0 {
fmt.Println("inflight:", atomic.LoadInt64(&inflight))
time.Sleep(50 * time.Millisecond)
}
}
This counter is yours, not the pool's. It includes tasks that are blocked in Submit waiting for a worker. Useful for "drain me before shutdown" logic.
Example 14 — Submitting a method value¶
A common stumble: trying to submit a method directly.
package main
import (
"fmt"
"github.com/panjf2000/ants/v2"
)
type Worker struct{ id int }
func (w *Worker) Process() { fmt.Println(w.id) }
func main() {
pool, _ := ants.NewPool(4)
defer pool.Release()
w := &Worker{id: 1}
_ = pool.Submit(w.Process) // legal: method value matches func()
}
w.Process is a method value with signature func(), so Submit accepts it. If your method takes arguments, you need a closure.
Example 15 — Submitting many similar tasks via a slice¶
A frequent pattern. We've seen pieces of this, but here it is end-to-end.
type batch struct {
rows []row
err error
}
func processBatches(pool *ants.Pool, batches []*batch) {
var wg sync.WaitGroup
for _, b := range batches {
b := b
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
b.err = processBatch(b.rows)
})
}
wg.Wait()
for i, b := range batches {
if b.err != nil {
log.Printf("batch %d failed: %v", i, b.err)
}
}
}
func processBatch(rows []row) error { _ = rows; return nil }
type row struct{}
Each batch is its own struct; closures capture pointers to it. After wg.Wait() you have per-batch errors. This is one of the most production-ready patterns for "process N independent units."
Performance Tips (extended)¶
Tip 6 — Use PoolWithFunc for hot loops¶
If you're submitting the same function with different arguments at high rate, PoolWithFunc (next file) skips the closure allocation. For low-throughput code (a few thousand submits per second), Pool is simpler and fine.
Tip 7 — Watch GC under sustained submit¶
Each Submit of a closure allocates the closure on the heap. Sustained millions per second can move the GC pressure noticeably. If you see GC overhead in pprof, switch to PoolWithFunc or reuse closures via sync.Pool.
Tip 8 — GOMAXPROCS matters for CPU-bound pools¶
If your pool serves CPU-bound tasks, sizing it past GOMAXPROCS is wasted. The extra workers will just contend for the same set of OS threads. For CPU work, runtime.GOMAXPROCS(0) is the right starting point.
Tip 9 — One pool is rarely a bottleneck for the pool itself¶
Most of the time, the pool's own throughput is not your limit — your tasks are. If pprof shows ants.Submit as a hotspot under your workload, you have a lot of submission going on, and you should consider MultiPool or PoolWithFunc.
Tip 10 — Don't mix metrics across pools¶
If you call Running() on two pools and add the numbers, the sum tells you little — the pools may be sized differently. The right thing is to report each pool's Running/Cap ratio and treat them as separate signals.
Edge Cases (extended)¶
Edge 11 — Tune(0) is invalid¶
You cannot tune to 0. In v2, Tune has no return value — an invalid size is simply ignored. To "stop accepting work" you Release the pool, not Tune(0).
Edge 12 — Tune to a size equal to current cap is a no-op¶
Trivial, but worth knowing: Tune is cheap and safe to call even if it does nothing.
Edge 13 — Janitor wake-up frequency = ExpiryDuration / 10 in some versions¶
By default the janitor wakes every ~100 ms (ExpiryDuration / 10, default 1 second). If your tasks are sub-microsecond and the pool is constantly busy, the janitor doesn't help much — workers are always reused. If your pool has long idle stretches, the janitor frees memory.
Edge 14 — nil PanicHandler resets to default¶
If you call WithPanicHandler(nil), you fall back to the default log.Printf behaviour. Some teams use this as a feature flag — "in tests, set panic handler to t.Fatalf; in prod, leave default."
Edge 15 — Submitting func() with named arguments via reflection¶
Pointless trivia: you cannot submit a reflect.Value directly. Wrap it in a closure that calls value.Call(args). The pool sees only the closure.
Common Mistakes (extended)¶
Mistake 11 — Submitting tasks that block on an unrelated lock¶
If a task waits on a lock held by another task in the same pool, and the pool is full, you can deadlock. Example: pool cap=1, task A holds lock L and submits task B which needs L. A is blocked waiting for B, B is blocked waiting for A's worker slot. Deadlock.
Rule of thumb: tasks should not hold locks across Submit boundaries.
Mistake 12 — Treating pool.Running() as task count¶
Running is workers, not tasks. A submitted but not-yet-running task does not show up in Running. A submitted task being received by a worker also doesn't show until it starts. Use your own counter for "in-flight tasks" if you need that precision.
Mistake 13 — Comparing Submit to go f() performance for one-off calls¶
For one task, Submit is slower than go. The pool's advantage shows up with many tasks where worker reuse amortises the setup. Benchmark with realistic batch sizes.
Mistake 14 — Forgetting that Release does not wait¶
pool, _ := ants.NewPool(10)
for ... { pool.Submit(...) }
pool.Release()
// Surprise: some tasks may still be running.
Release is synchronous on cleanup of idle workers, asynchronous w.r.t. in-flight tasks. To wait for tasks: use a WaitGroup, or call ReleaseTimeout(d) (v2.7+).
Mistake 15 — Calling Tune from inside a task¶
Tune is goroutine-safe, so this won't crash. But it's almost always a code smell — you're letting your tasks change their own concurrency limit. Tune from a control plane (a goroutine watching metrics), not from work goroutines.
Common Misconceptions (extended)¶
Misconception 9 — "Pools make programs faster."¶
Pools make programs more predictable and sometimes more efficient. They rarely make programs faster in raw wall-clock terms — go f() is usually as fast or faster for everything except sustained high-rate workloads. The win is in resource control, not speed.
Misconception 10 — "Cap should equal GOMAXPROCS."¶
Only for CPU-bound work. For I/O-bound work, cap is set by your downstream tolerance (max connections, rate limits, etc.), often much larger than GOMAXPROCS.
Misconception 11 — "Submit is a queue operation."¶
It is not. There is no queue. Submit either (a) immediately assigns a worker, (b) immediately rejects, or (c) blocks the caller. The caller is the queue.
Misconception 12 — "Pools provide fairness."¶
They don't. There is no priority, no fairness guarantee, and certainly no first-come-first-served between independent submitter goroutines. Submitters compete via the Go scheduler.
Misconception 13 — "If I Release, the pool is safe to GC immediately."¶
Mostly true, with a caveat. Release stops the janitor and tears down the idle workers, but a worker that is mid-task finishes its current task before exiting. The pool struct itself becomes GC-eligible when the last reference is dropped — which, for a worker still running a long task, can be after Release returns.
Best Practices (extended)¶
BP 11 — Use the pool that fits the workload¶
CPU work: cap = GOMAXPROCS. I/O work: cap = upstream concurrency budget. Mixed work: split into two pools.
BP 12 — Treat the pool as a dependency¶
If your Service struct uses a pool, accept it as a constructor argument, not a package-level global. This makes tests trivial and lets you swap implementations.
BP 13 — Always include pool.IsClosed() checks in long-lived loops¶
If your producer runs forever, periodically check pool.IsClosed() (v2 helper) to break out gracefully during shutdown.
BP 14 — Log when the pool reaches saturation¶
Once Running == Cap for too long, you know your pool is your bottleneck. Emit a metric: if pool.Free() == 0 { metrics.PoolSaturated.Inc() }. This is your earliest warning.
BP 15 — Don't size pools by guess¶
Measure. Either record Free() over time and plot it (it should hover above zero), or load-test with a known input rate and find the cap that gives 95th percentile latency under your SLO.
What You Can Build (extended)¶
After understanding everything above, you can build:
- Bounded HTTP client. A wrapper that holds a pool of N workers, each making outbound requests. Acts as a hard concurrency cap.
- Parallel CSV processor. Read a 10 GB CSV row by row, dispatch to workers, write results back to another CSV.
- Image thumbnailer service. REST endpoint takes a directory path, returns thumbnails. Pool sized by CPU count.
- Bulk email sender. Read mailing list, dispatch to a pool, respect provider's concurrent-send limit.
- Web scraper. Crawl with bounded concurrency, exact behaviour of the URL prefetcher in the use cases above.
- Background indexer. Subscribe to a stream of events, submit each to a worker that updates a search index.
- Periodic poller. A ticker submits a check to the pool every N seconds; the pool ensures no more than M checks run concurrently.
Each is ~50–200 lines once you wire up the pool, the WaitGroup, and the actual work.
More Cheat Sheet¶
Quick reference of common code shapes¶
// Fan-out with results
out := make(chan R, N)
var wg sync.WaitGroup
for _, in := range inputs {
in := in
wg.Add(1)
_ = pool.Submit(func() { defer wg.Done(); out <- process(in) })
}
go func() { wg.Wait(); close(out) }()
for r := range out { use(r) }
// Per-task error bag
errs := make([]error, len(inputs))
for i, in := range inputs {
i, in := i, in
wg.Add(1)
_ = pool.Submit(func() { defer wg.Done(); errs[i] = process(in) })
}
wg.Wait()
// First-error capture (manual; does not cancel the rest — use errgroup in prod for true short-circuit)
var once sync.Once
var firstErr error
for _, in := range inputs {
in := in
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
if err := process(in); err != nil {
once.Do(func() { firstErr = err })
}
})
}
wg.Wait()
Quick error-handling reference¶
switch err := pool.Submit(task); {
case err == nil:
// accepted
case errors.Is(err, ants.ErrPoolClosed):
// shutting down, propagate
case errors.Is(err, ants.ErrPoolOverload):
// drop, retry, or fall back to inline
default:
// unexpected; log
}
Self-Assessment Checklist (extended)¶
You should also be able to:
- Explain the difference between Submit and Invoke.
- Know when to choose Pool vs PoolWithFunc (middle.md).
- Draw the worker LIFO stack on paper and explain why it is LIFO.
- List the four sentinel errors of ants and what triggers each.
- Convert a naive go f() loop to a pooled version, including correct closure capture.
- Diagnose a hang where a task is blocked on a lock held by another task in the same pool.
- Estimate, given a task duration and a pool cap, the steady-state throughput.
Summary (extended)¶
You learned:
- What ants is — a goroutine pool library, the most popular one in Go.
- Why you'd use it — to bound concurrency or to recycle workers under high task rate.
- The smallest workflow — NewPool, Submit, Release, with a WaitGroup to coordinate.
- The lifecycle — created, active, closed; Release is asynchronous w.r.t. in-flight tasks.
- Common patterns — fan-out with results, per-task errors, short-circuit, mixed pools.
- Common mistakes — skipping Release, captured loop variables, ignoring Submit's error.
- Common pitfalls — Submit can block forever, panic does not propagate, nil tasks are silently dropped.
- The mental model — pool = bounded executor + LIFO worker stack + callers-as-queue.
In middle.md you'll meet PoolWithFunc, the functional-options API (WithExpiryDuration, WithPanicHandler, WithNonblocking, WithMaxBlockingTasks), and the operational tools (Tune, ReleaseTimeout). In senior.md you'll meet the internals — worker stack, lock-free fast path, locker fallback, MultiPool, MGRR strategies. In professional.md you'll meet production patterns: observability, multi-tenant pools, graceful shutdown, errgroup integration.
A Walk-through Project — Bounded Web Crawler¶
To consolidate everything, let's build a small but realistic project: a bounded web crawler.
Requirements¶
- Read a list of URLs from a text file.
- Fetch each URL, capping concurrency at 50.
- Save the body of each successful response to out/<sanitised-url>.
- Print a summary at the end: total, success, failure.
Step 1 — Skeleton¶
package main
import (
"flag"
"log"
"github.com/panjf2000/ants/v2"
)
func main() {
in := flag.String("in", "urls.txt", "input file")
out := flag.String("out", "out", "output directory")
concurrency := flag.Int("c", 50, "concurrency")
flag.Parse()
pool, err := ants.NewPool(*concurrency)
if err != nil {
log.Fatal(err)
}
defer pool.Release()
_ = in
_ = out
}
The pool is created, defer-released, and parameterised. Nothing exciting yet.
Step 2 — Read URLs¶
func readURLs(path string) ([]string, error) {
f, err := os.Open(path)
if err != nil {
return nil, err
}
defer f.Close()
var urls []string
s := bufio.NewScanner(f)
for s.Scan() {
line := strings.TrimSpace(s.Text())
if line != "" {
urls = append(urls, line)
}
}
return urls, s.Err()
}
Standard. Errors are propagated to main.
Step 3 — Fetch one URL¶
func fetch(client *http.Client, url, dst string) error {
resp, err := client.Get(url)
if err != nil {
return err
}
defer resp.Body.Close()
if resp.StatusCode >= 400 {
return fmt.Errorf("%s: %s", url, resp.Status)
}
f, err := os.Create(dst)
if err != nil {
return err
}
defer f.Close()
_, err = io.Copy(f, resp.Body)
return err
}
Simple HTTP-GET-and-save. Real production would handle redirects, retries, timeouts more carefully.
Step 4 — Wire the pool¶
func crawl(pool *ants.Pool, urls []string, outDir string) (success, fail int) {
client := &http.Client{Timeout: 30 * time.Second}
var wg sync.WaitGroup
var s, f int64
for _, u := range urls {
u := u
dst := filepath.Join(outDir, sanitise(u))
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
if err := fetch(client, u, dst); err != nil {
atomic.AddInt64(&f, 1)
return
}
atomic.AddInt64(&s, 1)
})
}
wg.Wait()
return int(s), int(f)
}
func sanitise(u string) string {
return strings.NewReplacer("/", "_", ":", "_", "?", "_", "&", "_").Replace(u)
}
The pool caps concurrency. WaitGroup waits for all tasks. Atomic counters tally success/failure.
Step 5 — Put it all together¶
package main
import (
"bufio"
"flag"
"fmt"
"io"
"log"
"net/http"
"os"
"path/filepath"
"strings"
"sync"
"sync/atomic"
"time"
"github.com/panjf2000/ants/v2"
)
func main() {
in := flag.String("in", "urls.txt", "input file")
out := flag.String("out", "out", "output directory")
concurrency := flag.Int("c", 50, "concurrency")
flag.Parse()
urls, err := readURLs(*in)
if err != nil {
log.Fatal(err)
}
if err := os.MkdirAll(*out, 0o755); err != nil {
log.Fatal(err)
}
pool, err := ants.NewPool(*concurrency)
if err != nil {
log.Fatal(err)
}
defer pool.Release()
start := time.Now()
s, f := crawl(pool, urls, *out)
fmt.Printf("total=%d ok=%d fail=%d elapsed=%v\n",
len(urls), s, f, time.Since(start))
}
// (readURLs, fetch, crawl, sanitise as above)
This entire program is ~80 lines and demonstrates every junior concept: pool creation, defer release, fan-out, captured loop variable (via u := u), WaitGroup, error counting, parameterised capacity. You can extend it (retries, robots.txt, rate limiting) without restructuring.
Step 6 — Try misuse¶
Now try removing defer pool.Release(). Run the program, add time.Sleep(10 * time.Second) at the end, and print runtime.NumGoroutine() just before the sleep. You will see the worker goroutines persist until the janitor cleans them up — visible evidence of why Release matters.
Try removing the u := u. With Go 1.21 or earlier, you'll get a thousand fetches of the last URL only. With Go 1.22+ (and a go directive of 1.22 or later in go.mod), loop variables are per-iteration, so you won't see the bug at all. Either way, learn to spot the missing shadow.
Try setting concurrency to 1. You will run sequentially, ~50× slower for a list of 50 URLs. This demonstrates why the cap matters.
Try setting concurrency to 10000. The program runs fast initially, then your machine hits its file-descriptor limit (ulimit -n) and connections start failing. The pool is doing its job — it would have prevented this if you had picked a sane size.
Step 7 — Add Observability¶
Add a goroutine that prints pool stats every second:
go func() {
t := time.NewTicker(time.Second)
defer t.Stop()
for range t.C {
log.Printf("pool running=%d free=%d cap=%d", pool.Running(), pool.Free(), pool.Cap())
}
}()
You will see running=50 free=0 for most of the run, then running=0 free=50 near the end as the queue drains. That's healthy saturation. If you see running stuck well below cap, your bottleneck is upstream (input file too small? slow disk?) not the pool. If you see free=0 constantly, you might want to bump the cap (assuming downstream allows).
Step 8 — Add Graceful Shutdown¶
sigs := make(chan os.Signal, 1)
signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
go func() {
<-sigs
log.Println("shutdown received, releasing pool")
pool.Release()
}()
When you hit Ctrl-C, the pool releases. In-flight fetches finish; queued submitters get ErrPoolClosed. Your tally at the end won't include the cancelled URLs. For a stronger guarantee, plumb a context.Context into each fetch and cancel it on signal.
This walkthrough is everything a junior should aim for. The next file teaches you how to make the same crawler more robust — non-blocking submit, panic handler, dynamic tuning — by introducing the functional options API.
Closing Notes¶
You have made it through 3000+ lines of "junior" content. That seems like a lot for an apparently simple library, but the depth is intentional: most beginners think they understand ants after reading the README, and then they make four of the ten mistakes listed above in their first production pool. The goal of this file is for you not to be one of them.
A few habits to internalise:
- defer pool.Release() is the first line after NewPool. Always.
- Closures capture by reference. Shadow your loop variable. Always.
- Submit returns error. Handle it. Always.
- A panic in a task does not crash your program. Install a panic handler, especially in production.
- The pool is not a queue. Callers are. If you want a real queue, build one in front.
- One pool per logical workload. Don't share between CPU and I/O.
- Size by measurement, not by guess. Add a metric, run a load test.
- The pool does not interrupt running tasks. Plumb a context if you need cancellation.
The next file (middle.md) is where the API becomes powerful and you graduate from "use the pool" to "configure the pool correctly for production."
Appendix A — Comparing Five Concurrency Idioms¶
To put ants in context, here are the five most common ways to do bounded concurrency in Go. Reading this side by side helps you decide when ants is the right tool.
Idiom 1 — Plain go f()¶
Pros: simplest, zero dependencies. Cons: unbounded, no built-in way to wait for completion, no error propagation. Suitable when the number of items is small and bounded.
Idiom 2 — Worker channel¶
jobs := make(chan T)
var wg sync.WaitGroup
for i := 0; i < N; i++ {
wg.Add(1)
go func() {
defer wg.Done()
for j := range jobs {
work(j)
}
}()
}
for _, x := range items {
jobs <- x
}
close(jobs)
wg.Wait()
Pros: no dependencies, explicit, very flexible. Cons: more boilerplate, you handle errors and tuning manually. Suitable for medium-complexity in-process workers.
Idiom 3 — golang.org/x/sync/semaphore¶
sem := semaphore.NewWeighted(int64(N))
var wg sync.WaitGroup
for _, x := range items {
if err := sem.Acquire(ctx, 1); err != nil { break }
x := x
wg.Add(1)
go func() {
defer sem.Release(1)
defer wg.Done()
work(x)
}()
}
wg.Wait()
Pros: works with context, lightweight. Cons: still spawns goroutines per item — the semaphore caps concurrency but doesn't reuse goroutines. Suitable when you need context-aware acquisition.
Idiom 4 — errgroup with limit¶
g, ctx := errgroup.WithContext(ctx)
g.SetLimit(N)
for _, x := range items {
x := x
g.Go(func() error { return work(ctx, x) })
}
return g.Wait()
Pros: first-error short-circuit, context propagation, idiomatic. Cons: still spawns goroutines per task. Suitable for one-off batch operations.
Idiom 5 — ants pool¶
pool, _ := ants.NewPool(N)
defer pool.Release()
var wg sync.WaitGroup
for _, x := range items {
x := x
wg.Add(1)
_ = pool.Submit(func() { defer wg.Done(); work(x) })
}
wg.Wait()
Pros: worker reuse, low per-task overhead, observability. Cons: separate dependency, no built-in error/context support. Suitable for sustained high-throughput workloads.
When to pick each¶
| If you have... | Use |
|---|---|
| A handful of tasks (<100 items) | Plain go f() |
| Need first-error short-circuit | errgroup |
| Need context-aware acquisition | semaphore |
| Need both context and a real pool | errgroup over ants (see professional.md) |
| Need a long-lived worker farm, sustained high rate | ants |
| Need millions of tasks/sec, allocations matter | ants.PoolWithFunc |
| Need multi-tenant isolation | ants.MultiPool |
Most real services end up combining: errgroup for the request-level fan-out, ants underneath for the actual worker recycling.
Appendix B — A short script to demonstrate the cost of unbounded goroutines¶
Run this. Notice peak memory.
package main
import (
"fmt"
"runtime"
"sync"
"time"
)
func main() {
var wg sync.WaitGroup
const N = 500_000
for i := 0; i < N; i++ {
wg.Add(1)
go func() {
defer wg.Done()
time.Sleep(2 * time.Second)
}()
}
time.Sleep(500 * time.Millisecond)
var m runtime.MemStats
runtime.ReadMemStats(&m)
// Note: HeapAlloc excludes goroutine stacks; StackSys is where the cost shows.
fmt.Printf("heap=%d MB stacks=%d MB goroutines=%d\n",
m.HeapAlloc/1024/1024, m.StackSys/1024/1024, runtime.NumGoroutine())
wg.Wait()
}
On a typical laptop you'll see close to 500,000 live goroutines, with stack memory on the order of a gigabyte — each sleeping goroutine holds at least ~2 KB of stack, plus runtime bookkeeping.
Now do the same with ants:
pool, _ := ants.NewPool(1000)
defer pool.Release()
for i := 0; i < N; i++ {
wg.Add(1)
_ = pool.Submit(func() {
defer wg.Done()
time.Sleep(2 * time.Second)
})
}
Peak memory drops to roughly 20 MB. The goroutine count stays around the 1000 workers plus the single submitting goroutine, which simply blocks in Submit whenever all workers are busy. (One caveat: at 1000 workers, 500,000 two-second tasks now take ~1000 s to drain — shrink N or the sleep for a quick comparison.) That's the whole pitch in one experiment.
Appendix C — Common reading errors of the source¶
When you do read the source in senior.md, beware:
- goWorker.task is a chan func() with capacity 1 — a handoff slot, not a queue.
- The "idle stack" is actually a slice with pop/push semantics from the end (LIFO).
- purgeStaleWorkers is the janitor — not "purge all workers."
- Pool.lock is a sync.Locker, but in some build tags it is internal.Spinlock.
Even the names are sometimes misleading. The diagrams and walkthroughs in senior.md will help.