Worker Pools in Go with a Supervisor Pattern

One of Go's strengths is making concurrent work feel natural. A pattern that shows up in many production systems is the worker pool with a supervisor: a coordinator that fans out work to a pool of goroutines, waits for them to finish, and aggregates the results. Here's a walkthrough of the pattern, structured the way you'd actually use it in production.

The Problem

You have a stream of work items that can be processed independently. You want to:

  1. Distribute the work across multiple goroutines for parallelism
  2. Collect results as workers finish them
  3. Know when all work is complete
  4. Compute a final aggregate result from individual outputs

Common examples: processing news articles in a financial data pipeline, running parallel HTTP requests, transforming records in a batch job, computing checksums across many files.

The Building Blocks

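Go gives us three primitives that compose cleanly for this: goroutines to do the work, channels to carry the data between them, and sync.WaitGroup to signal when everything has finished.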

The pattern uses these together rather than choosing between them.
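As a minimal sketch of how goroutines, channels, and a sync.WaitGroup fit together before the full worker pool is introduced, the snippet below starts one goroutine per input, has each send a value on a channel, and uses a WaitGroup to decide when the channel can be closed. The doubleAll helper and the doubling itself are hypothetical and chosen only for illustration.

```go
package main

import (
    "fmt"
    "sync"
)

// doubleAll is a hypothetical helper: it doubles each input in its own
// goroutine and returns the sum of the doubled values.
func doubleAll(inputs []int) int {
    out := make(chan int) // channel: carries values between goroutines
    var wg sync.WaitGroup // WaitGroup: tracks goroutines still running

    for _, n := range inputs {
        wg.Add(1)
        go func(n int) { // goroutine: one concurrent unit of work
            defer wg.Done()
            out <- n * 2
        }(n)
    }

    // Close the channel once every goroutine has sent its value,
    // which ends the range loop below.
    go func() {
        wg.Wait()
        close(out)
    }()

    sum := 0
    for v := range out {
        sum += v
    }
    return sum
}

func main() {
    fmt.Println(doubleAll([]int{1, 2, 3})) // prints 12
}
```

The worker pool in the next section has the same shape, except that a fixed number of goroutines share one jobs channel instead of one goroutine being started per item.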

A Concrete Example

Let's compute something simple — sum of squares across a slice of numbers — to keep the focus on the concurrency pattern rather than the work itself.

package main

import (
    "fmt"
    "sync"
)

// WorkItem represents a unit of work to be processed
type WorkItem struct {
    ID    int
    Value int
}

// Result represents the output of processing one work item
type Result struct {
    WorkID int
    Output int
}

// worker processes items from the jobs channel and sends results to results channel
func worker(id int, jobs <-chan WorkItem, results chan<- Result, wg *sync.WaitGroup) {
    defer wg.Done()
    for job := range jobs {
        // Simulate the actual work — compute square of the value
        output := job.Value * job.Value
        results <- Result{
            WorkID: job.ID,
            Output: output,
        }
        fmt.Printf("worker %d processed job %d\n", id, job.ID)
    }
}

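The worker function is small. It reads from the jobs channel until that channel is closed, processes each item, and writes each result to the results channel. Because wg.Done() is deferred, the WaitGroup is decremented on every exit path, including a panic. Keep in mind that an unrecovered panic in any goroutine still crashes the whole program, so that guarantee only matters in practice when the panic is recovered.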

The Supervisor

The supervisor coordinates everything — it launches workers, distributes work, waits for completion, and aggregates results.

func supervisor(items []WorkItem, numWorkers int) int {
    jobs := make(chan WorkItem, len(items))
    results := make(chan Result, len(items))
    var wg sync.WaitGroup

    // Launch the worker pool
    for w := 1; w <= numWorkers; w++ {
        wg.Add(1)
        go worker(w, jobs, results, &wg)
    }

    // Distribute work items to the jobs channel
    for _, item := range items {
        jobs <- item
    }
    close(jobs) // Signal workers that no more jobs are coming

    // Wait for all workers to finish, then close the results channel
    go func() {
        wg.Wait()
        close(results)
    }()

    // Collect and aggregate results
    total := 0
    for result := range results {
        total += result.Output
    }

    return total
}

A few things worth noting in this code:

The jobs channel is buffered to the number of work items, so the supervisor never blocks while sending work. The results channel is buffered the same way, so workers never block while reporting. In production, you might use smaller buffers to apply backpressure if workers fall behind.
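One possible way to apply that backpressure is sketched below. It assumes a hypothetical supervisorBounded variant with an extra bufSize parameter, neither of which appears in the original code. Because both channels are now small, the jobs are fed from a separate goroutine so that the supervisor can keep draining results while sends block.

```go
package main

import (
    "fmt"
    "sync"
)

type WorkItem struct {
    ID    int
    Value int
}

type Result struct {
    WorkID int
    Output int
}

// supervisorBounded is a hypothetical variant of supervisor in which both
// channels have a small fixed capacity (bufSize is an assumed parameter).
func supervisorBounded(items []WorkItem, numWorkers, bufSize int) int {
    jobs := make(chan WorkItem, bufSize)
    results := make(chan Result, bufSize)
    var wg sync.WaitGroup

    for w := 0; w < numWorkers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for job := range jobs {
                results <- Result{WorkID: job.ID, Output: job.Value * job.Value}
            }
        }()
    }

    // Feed jobs from a separate goroutine. The send blocks whenever the
    // buffer is full (that is the backpressure), so it must not run on
    // the goroutine that drains results.
    go func() {
        for _, item := range items {
            jobs <- item
        }
        close(jobs)
    }()

    // Close results once all workers have exited.
    go func() {
        wg.Wait()
        close(results)
    }()

    total := 0
    for r := range results {
        total += r.Output
    }
    return total
}

func main() {
    items := make([]WorkItem, 10)
    for i := range items {
        items[i] = WorkItem{ID: i, Value: i + 1}
    }
    fmt.Println(supervisorBounded(items, 3, 2)) // prints 385
}
```

The trade-off is memory versus throughput: a small buffer caps how much pending work is held at once, at the cost of the producer occasionally waiting for a free slot.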

The close(jobs) is essential. Without it, workers would block forever waiting for more work. Closing the channel signals "no more items coming," which causes the range loop in workers to exit.

The goroutine that calls wg.Wait() and then close(results) is the bridge between the workers finishing and the supervisor knowing it can stop reading results. Without this, the supervisor's range results loop would block forever even after all workers finish.

Putting It Together

func main() {
    // Generate some work
    items := make([]WorkItem, 100)
    for i := 0; i < 100; i++ {
        items[i] = WorkItem{ID: i, Value: i + 1}
    }

    // Process with 5 workers
    total := supervisor(items, 5)
    fmt.Printf("Sum of squares: %d\n", total)
}

Running this distributes the 100 work items across 5 worker goroutines. Each worker pulls items off the jobs channel, computes the square, and pushes the result. The supervisor sums all results and returns the total.

For numbers 1 through 100, the result is 338,350.
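That figure can be checked independently of the worker pool with the closed form for a sum of squares, n(n+1)(2n+1)/6. With n = 100 this gives 100 × 101 × 201 / 6 = 338,350. The sumOfSquares helper below is a hypothetical name used only for this check.

```go
package main

import "fmt"

// sumOfSquares returns 1^2 + 2^2 + ... + n^2 using the closed form
// n(n+1)(2n+1)/6.
func sumOfSquares(n int) int {
    return n * (n + 1) * (2*n + 1) / 6
}

func main() {
    fmt.Println(sumOfSquares(100)) // prints 338350
}
```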

Production Considerations

The basic pattern above works, but real systems need more:

Error handling: Workers should communicate errors back to the supervisor. Add an errors channel or include an Error field in the Result struct.
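A minimal sketch of the Error-field approach follows. It assumes a hypothetical process function that rejects negative values, purely so there is a failure case to show, and it uses errors.Join, which requires Go 1.20 or later. The names supervisorWithErrors and Err are illustrative, not part of the original code.

```go
package main

import (
    "errors"
    "fmt"
    "sync"
)

type WorkItem struct {
    ID    int
    Value int
}

// Result carries either an output or the error that prevented one.
type Result struct {
    WorkID int
    Output int
    Err    error
}

// process is a stand-in for real work; rejecting negative values is an
// assumption made only so the example has a failure case.
func process(item WorkItem) (int, error) {
    if item.Value < 0 {
        return 0, fmt.Errorf("item %d: negative value %d", item.ID, item.Value)
    }
    return item.Value * item.Value, nil
}

// supervisorWithErrors sums the successful outputs and joins all failures
// into a single error value.
func supervisorWithErrors(items []WorkItem, numWorkers int) (int, error) {
    jobs := make(chan WorkItem, len(items))
    results := make(chan Result, len(items))
    var wg sync.WaitGroup

    for w := 0; w < numWorkers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for job := range jobs {
                out, err := process(job)
                results <- Result{WorkID: job.ID, Output: out, Err: err}
            }
        }()
    }
    for _, item := range items {
        jobs <- item
    }
    close(jobs)

    go func() {
        wg.Wait()
        close(results)
    }()

    total := 0
    var errs []error
    for r := range results {
        if r.Err != nil {
            errs = append(errs, r.Err)
            continue
        }
        total += r.Output
    }
    return total, errors.Join(errs...) // nil when no item failed
}

func main() {
    items := []WorkItem{{ID: 0, Value: 2}, {ID: 1, Value: -1}, {ID: 2, Value: 3}}
    total, err := supervisorWithErrors(items, 2)
    fmt.Println(total, err) // prints: 13 item 1: negative value -1
}
```

Keeping the error inside Result means a single channel still carries everything, so the close-after-Wait logic does not change. A separate errors channel is the alternative when failures need to be handled on a different path from results.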

Context cancellation: Use context.Context to allow the supervisor to cancel work if conditions change (timeout, parent operation cancelled, etc.).

Rate limiting: For work that hits external services, add a rate limiter, typically a token bucket such as the one implemented by golang.org/x/time/rate.
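To keep the example free of external dependencies, the sketch below swaps in a plain time.Ticker from the standard library as a much simpler stand-in for a real token bucket. All workers share one ticker and wait for a tick before handling each job. The processRateLimited helper and its interval parameter are hypothetical.

```go
package main

import (
    "fmt"
    "sync"
    "time"
)

// processRateLimited is a hypothetical helper: workers share one ticker and
// each waits for a tick before handling a job, so the pool as a whole
// starts, on average, at most one job per interval.
func processRateLimited(values []int, numWorkers int, interval time.Duration) int {
    jobs := make(chan int, len(values))
    results := make(chan int, len(values))
    ticker := time.NewTicker(interval)
    defer ticker.Stop()

    var wg sync.WaitGroup
    for w := 0; w < numWorkers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for v := range jobs {
                <-ticker.C // wait for permission to start this job
                results <- v * v
            }
        }()
    }
    for _, v := range values {
        jobs <- v
    }
    close(jobs)

    // results can hold every output, so waiting here cannot deadlock.
    wg.Wait()
    close(results)

    total := 0
    for r := range results {
        total += r
    }
    return total
}

func main() {
    total := processRateLimited([]int{1, 2, 3, 4, 5}, 3, 20*time.Millisecond)
    fmt.Println(total) // prints 55
}
```

Unlike a token bucket, a ticker does not allow controlled bursts, which is one reason a dedicated limiter is usually the better choice once the requirements grow.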

Metrics: Production code wants observability — how many items processed, error rates, latency distribution. Add hooks for instrumentation.
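As one possible shape for such hooks, the sketch below counts processed and failed items with atomic counters from sync/atomic (the atomic.Int64 type requires Go 1.19 or later). The Metrics struct, the squareAll helper, and the rule that negative inputs count as failures are all assumptions made for illustration. A real system would more likely export these through whatever metrics library it already uses.

```go
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

// Metrics is a hypothetical set of counters that workers update concurrently.
type Metrics struct {
    Processed atomic.Int64
    Failed    atomic.Int64
}

// squareAll squares each value using numWorkers goroutines and records
// per-item outcomes in m. Treating negative inputs as failures is an
// assumption made only to exercise the Failed counter.
func squareAll(values []int, numWorkers int, m *Metrics) int {
    jobs := make(chan int, len(values))
    results := make(chan int, len(values))
    var wg sync.WaitGroup

    for w := 0; w < numWorkers; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for v := range jobs {
                if v < 0 {
                    m.Failed.Add(1)
                    continue
                }
                results <- v * v
                m.Processed.Add(1)
            }
        }()
    }
    for _, v := range values {
        jobs <- v
    }
    close(jobs)

    // results can hold every output, so waiting here cannot deadlock.
    wg.Wait()
    close(results)

    total := 0
    for r := range results {
        total += r
    }
    return total
}

func main() {
    var m Metrics
    total := squareAll([]int{1, 2, -3, 4}, 2, &m)
    fmt.Println(total, m.Processed.Load(), m.Failed.Load()) // prints: 21 3 1
}
```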

Graceful shutdown: Handle signals (SIGTERM, SIGINT) to drain in-flight work before exiting.

Here's a sketch of how context cancellation slots in:

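// workerWithContext behaves like worker but also stops when ctx is cancelled.
// It requires "context" in the import block.
func workerWithContext(ctx context.Context, id int, jobs <-chan WorkItem, results chan<- Result, wg *sync.WaitGroup) {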
    defer wg.Done()
    for {
        select {
        case job, ok := <-jobs:
            if !ok {
                return // jobs channel closed, work done
            }
            output := job.Value * job.Value
            select {
            case results <- Result{WorkID: job.ID, Output: output}:
                // result sent successfully
            case <-ctx.Done():
                return // cancelled while trying to send
            }
        case <-ctx.Done():
            return // cancelled while waiting for work
        }
    }
}

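The outer select lets the worker respond to whichever happens first: incoming work or a cancellation signal. The inner select matters just as much, because without it a worker could block forever trying to send a result that a cancelled supervisor will never read. This is a standard Go pattern for making concurrent code cancellable.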
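On the supervisor side, a sketch under stated assumptions might look like the following. The supervisorWithTimeout function and its timeout parameter are hypothetical; the worker is repeated from the sketch above so that the example compiles on its own. Channels are unbuffered here so that cancellation actually interrupts pending sends.

```go
package main

import (
    "context"
    "fmt"
    "sync"
    "time"
)

type WorkItem struct {
    ID    int
    Value int
}

type Result struct {
    WorkID int
    Output int
}

// workerWithContext is the cancellable worker from the sketch above.
func workerWithContext(ctx context.Context, id int, jobs <-chan WorkItem, results chan<- Result, wg *sync.WaitGroup) {
    defer wg.Done()
    for {
        select {
        case job, ok := <-jobs:
            if !ok {
                return
            }
            select {
            case results <- Result{WorkID: job.ID, Output: job.Value * job.Value}:
            case <-ctx.Done():
                return
            }
        case <-ctx.Done():
            return
        }
    }
}

// supervisorWithTimeout is a hypothetical supervisor that gives the whole
// batch a deadline. It returns the total of the results it received and
// ctx.Err(), which is non-nil if the deadline passed before it returned.
func supervisorWithTimeout(items []WorkItem, numWorkers int, timeout time.Duration) (int, error) {
    ctx, cancel := context.WithTimeout(context.Background(), timeout)
    defer cancel()

    jobs := make(chan WorkItem)
    results := make(chan Result)
    var wg sync.WaitGroup

    for w := 1; w <= numWorkers; w++ {
        wg.Add(1)
        go workerWithContext(ctx, w, jobs, results, &wg)
    }

    // Feed jobs, giving up as soon as the context is cancelled.
    go func() {
        defer close(jobs)
        for _, item := range items {
            select {
            case jobs <- item:
            case <-ctx.Done():
                return
            }
        }
    }()

    go func() {
        wg.Wait()
        close(results)
    }()

    total := 0
    for r := range results {
        total += r.Output
    }
    return total, ctx.Err()
}

func main() {
    items := make([]WorkItem, 10)
    for i := range items {
        items[i] = WorkItem{ID: i, Value: i + 1}
    }
    total, err := supervisorWithTimeout(items, 3, 2*time.Second)
    fmt.Println(total, err) // prints: 385 <nil>
}
```

When the deadline is hit, the returned total is only partial, so callers should treat it as incomplete whenever the error is non-nil.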

Why This Pattern Works

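The supervisor-with-worker-pool pattern composes Go's primitives in a way that handles the common requirements of concurrent work: goroutines provide the parallelism, channels distribute jobs and collect results, and the WaitGroup signals when all work is complete.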

Each primitive does one thing well, and they compose into a system that handles the complexity of concurrent execution without requiring explicit thread management or locks for the main data flow.

This is one of the patterns that makes Go genuinely good for systems that need to do parallel work efficiently — financial data ingestion pipelines, parallel API processing, batch job systems, anything where you have many independent units of work to process and want to use available cores effectively.

Where to Go From Here

The pattern extends naturally to more complex scenarios, and the fundamental structure stays the same. Goroutines do the work, channels carry the data, WaitGroups synchronize completion, and a supervisor coordinates the whole thing. Master the basic pattern and the variations follow naturally.