What I learned building Golang microservices that handled thousands of requests per second — service boundaries, graceful shutdown, and the patterns that survived production.
At InvoZone, I was the new hire on a team building Go microservices for a client's backend system. The system handled real-time data ingestion from IoT devices — thousands of events per second, each needing to be validated, transformed, and routed to the right downstream consumer.
I came from a Node.js background. Go was different in ways that took me months to fully appreciate. These are the lessons that stuck.
Why Go
The team had evaluated three options: keep scaling the existing Node.js monolith, rewrite in Java, or rewrite in Go.
Node.js hit its ceiling when the event loop became the bottleneck. The ingestion pipeline was CPU-bound — validation, JSON parsing, transformation logic — and Node's single-threaded model meant we were running 16 instances per box to use all cores, each with its own memory overhead and no shared state.
Java was the safe enterprise choice, but the team was small and the deployment target was Kubernetes. Go's static binaries, fast startup times, and small container images made it operationally simpler. A Go container image was 15MB. The equivalent Java image with a JVM was 300MB+.
Go won on operational simplicity, not language features.
Service Boundaries
We split the monolith into four services:
- Ingestor — accepts raw events over HTTP and gRPC, validates schema, writes to a message queue
- Transformer — consumes from the queue, applies business rules, normalises data formats
- Router — reads transformed events and fans them out to topic-specific consumers
- API — serves processed data to the client's dashboard via REST
The boundaries weren't arbitrary. Each service scaled independently based on its bottleneck. The Ingestor was network-bound (lots of connections, small payloads). The Transformer was CPU-bound (complex validation logic). The Router was I/O-bound (writing to multiple downstream topics). The API was memory-bound (caching recent data for dashboard queries).
Concurrency Patterns
Go's goroutines and channels were the biggest adjustment from Node.js. In Node, concurrency is implicit — everything is async, and you manage it with Promises and callbacks. In Go, concurrency is explicit — you choose when to spawn goroutines and how they communicate.
The pattern we used most was worker pools. The Ingestor accepted connections on the main goroutine and dispatched work to a pool of N workers via a buffered channel:
    // Worker pulls events from a shared jobs channel and pushes results out.
    type Worker struct {
        jobs    <-chan Event
        results chan<- Result
    }

    // Start processes events until the jobs channel is closed.
    func (w *Worker) Start() {
        for event := range w.jobs {
            result := process(event)
            w.results <- result
        }
    }

    // StartPool launches size workers that all read from the same jobs channel.
    func StartPool(size int, jobs <-chan Event, results chan<- Result) {
        for i := 0; i < size; i++ {
            w := &Worker{jobs: jobs, results: results}
            go w.Start()
        }
    }

The pool size was tuned to match available CPU cores. Too few workers and we'd underutilise the box. Too many and context-switching overhead ate into throughput. We settled on runtime.NumCPU() * 2 after benchmarking; the extra factor of 2 accounted for I/O wait in the processing pipeline.
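To make the lifecycle of such a pool concrete, here is a minimal, self-contained sketch. It is not the production code: Event, Result and process are hypothetical stand-ins, and runPool is a name invented for this example. It shows one way to start the workers, signal "no more work" by closing the jobs channel, and close the results channel only after every worker has returned.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Event and Result are hypothetical stand-ins for the real pipeline types.
type Event struct{ ID int }
type Result struct{ EventID int }

// process is a placeholder for validation and transformation.
func process(e Event) Result { return Result{EventID: e.ID} }

// runPool starts size workers, feeds them every event, and returns the
// number of results collected once all workers have exited.
func runPool(size int, events []Event) int {
	jobs := make(chan Event, size) // buffered so the producer rarely blocks
	results := make(chan Result, size)

	var wg sync.WaitGroup
	for i := 0; i < size; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for e := range jobs { // loop ends once jobs is closed and empty
				results <- process(e)
			}
		}()
	}

	// Producer: send every event, then close jobs to signal "no more work".
	go func() {
		for _, e := range events {
			jobs <- e
		}
		close(jobs)
	}()

	// Close results only after every worker has returned.
	go func() {
		wg.Wait()
		close(results)
	}()

	count := 0
	for range results {
		count++
	}
	return count
}

func main() {
	size := runtime.NumCPU() * 2 // the sizing described in the text
	events := make([]Event, 1000)
	for i := range events {
		events[i] = Event{ID: i}
	}
	fmt.Println("processed:", runPool(size, events))
}
```

Closing the jobs channel is what lets each `for ... range` loop terminate, and the WaitGroup ensures nobody writes to results after it has been closed.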
Graceful Shutdown
This is the thing every Go tutorial mentions but nobody implements properly the first time. When Kubernetes sends a SIGTERM, you need to:
- Stop accepting new connections
- Finish processing in-flight requests
- Flush any buffered data to the message queue
- Close database connections and downstream clients
- Exit
The naive approach — os.Exit(0) in a signal handler — drops in-flight requests and corrupts buffered data. We used context.WithCancel threaded through the entire call chain:
    ctx, cancel := context.WithCancel(context.Background())

    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, syscall.SIGTERM, syscall.SIGINT)

    go func() {
        <-sigCh  // block until SIGTERM or SIGINT arrives
        cancel() // propagate cancellation down the call chain
    }()

    server.Serve(ctx)

Every function in the processing chain took a context.Context and checked for cancellation. When the context was cancelled, workers finished their current item and drained. The main goroutine waited for all workers to exit before shutting down.
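What "finish the current item and drain" can look like inside a worker is sketched below. This is one interpretation, not the original implementation: the worker only notices cancellation between items, and once cancelled it handles whatever is already buffered in the channel before returning. Event, worker and handledAfterCancel are hypothetical names for this example.

```go
package main

import (
	"context"
	"fmt"
)

// Event is a hypothetical stand-in for the real event type.
type Event struct{ ID int }

// worker handles events until jobs is closed or ctx is cancelled.
// Cancellation is only observed between items, so an event is never
// abandoned halfway. After cancellation it drains whatever is already
// buffered, without blocking, and then returns.
func worker(ctx context.Context, jobs <-chan Event, handle func(Event)) {
	for {
		select {
		case <-ctx.Done():
			for {
				select {
				case e, ok := <-jobs:
					if !ok {
						return
					}
					handle(e)
				default:
					return // buffer is empty: nothing left to drain
				}
			}
		case e, ok := <-jobs:
			if !ok {
				return
			}
			handle(e)
		}
	}
}

// handledAfterCancel buffers n events, cancels the context before the
// worker starts, and reports how many events the worker still handled.
func handledAfterCancel(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	jobs := make(chan Event, n)
	for i := 0; i < n; i++ {
		jobs <- Event{ID: i}
	}
	cancel() // simulate SIGTERM arriving with work still queued

	handled := 0
	worker(ctx, jobs, func(Event) { handled++ })
	return handled
}

func main() {
	fmt.Println("handled after cancel:", handledAfterCancel(5))
}
```

Whether to drain the buffer or drop it on cancellation is a policy decision; draining trades a slightly longer shutdown for fewer lost events.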
The timeout was critical — we gave workers 25 seconds to drain (Kubernetes default grace period is 30 seconds). If a worker was still running at 25 seconds, we logged a warning and force-exited. Losing one event was better than Kubernetes killing the pod with a SIGKILL and losing all buffered state.
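The drain deadline can be sketched with a WaitGroup and a timer. This is an illustrative version under stated assumptions rather than the code we shipped: waitTimeout and finishesWithin are hypothetical helpers, and the demo uses millisecond durations so it runs quickly.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// drainTimeout mirrors the 25-second budget from the text, leaving 5 seconds
// of headroom inside the default 30-second Kubernetes grace period.
const drainTimeout = 25 * time.Second

// waitTimeout waits for wg and reports whether all workers finished
// before the timeout elapsed.
func waitTimeout(wg *sync.WaitGroup, timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

// finishesWithin is a demonstration helper: one fake worker runs for
// workMs milliseconds and we wait at most timeoutMs milliseconds for it.
func finishesWithin(workMs, timeoutMs int) bool {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		time.Sleep(time.Duration(workMs) * time.Millisecond)
	}()
	return waitTimeout(&wg, time.Duration(timeoutMs)*time.Millisecond)
}

func main() {
	fmt.Println("fast worker drained:", finishesWithin(10, 500))
	fmt.Println("slow worker drained:", finishesWithin(500, 20))
	// In a real service the call would be waitTimeout(&wg, drainTimeout),
	// followed by a warning log and a forced exit when it returns false.
}
```

The first call should report true and the second false. In the false case the helper goroutine keeps waiting in the background, which is acceptable here because the process is about to exit anyway.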
Error Handling
Go's explicit error handling felt tedious coming from Node's try-catch, but it prevented an entire class of bugs. In Node, an unhandled Promise rejection could crash the process or silently swallow an error. In Go, the error is an ordinary return value sitting right in front of you. The compiler alone will still let you ignore it, but a linter such as errcheck flags every call whose error goes unchecked.
We adopted a rule: every error is either handled or propagated with context. No bare return err. Every error return added context about what operation failed:
    if err != nil {
        return fmt.Errorf("transform event %s: %w", event.ID, err)
    }

By the time an error reached the top-level handler, it read like a stack trace: "route event abc123: transform event abc123: parse timestamp: invalid format". No need to grep logs for the source.
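The chain in that message can be reproduced with a small sketch. The function names and the ErrInvalidFormat sentinel are hypothetical, chosen only to match the example message. Because each layer wraps with %w, the original cause stays reachable through errors.Is even after three layers of context.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrInvalidFormat is a hypothetical sentinel error for this example.
var ErrInvalidFormat = errors.New("invalid format")

// parseTimestamp fails on an empty string; real parsing is omitted.
func parseTimestamp(s string) error {
	if s == "" {
		return fmt.Errorf("parse timestamp: %w", ErrInvalidFormat)
	}
	return nil
}

// transform adds the event ID and its own operation name to the error.
func transform(id, ts string) error {
	if err := parseTimestamp(ts); err != nil {
		return fmt.Errorf("transform event %s: %w", id, err)
	}
	return nil
}

// route wraps once more, producing the full chain seen at the top level.
func route(id, ts string) error {
	if err := transform(id, ts); err != nil {
		return fmt.Errorf("route event %s: %w", id, err)
	}
	return nil
}

// isInvalidFormat reports whether the root cause is ErrInvalidFormat.
func isInvalidFormat(err error) bool {
	return errors.Is(err, ErrInvalidFormat)
}

func main() {
	err := route("abc123", "")
	fmt.Println(err)
	// route event abc123: transform event abc123: parse timestamp: invalid format
	fmt.Println(isInvalidFormat(err)) // true
}
```

Using %w rather than %v is what keeps the cause inspectable: callers can branch on the sentinel without parsing the message string.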
What I'd Do Differently
Start with structured logging from day one. We used log.Printf for the first two months and then had to retrofit zerolog across all four services. Every format string had to be rewritten. Structured logging should be in the project template, not an afterthought.
Use gRPC between services earlier. We started with HTTP/JSON between services for simplicity. It worked, but the serialisation overhead was measurable at high throughput. When we switched the Ingestor-to-Transformer link to gRPC with Protobuf, throughput on that path increased by ~35% with no other changes.
Write integration tests against real infrastructure. Our unit tests mocked Redis and the message queue. They passed. Production broke. The mock didn't replicate Redis cluster behaviour under connection storms. We added integration tests with Testcontainers and caught three more bugs in the first week.
Go taught me that boring is good. The language doesn't let you be clever — no generics (at the time), no magic, no metaprogramming. You write obvious code that does obvious things, and it runs fast and doesn't break at 3am. Coming from the Node.js ecosystem where every package has an opinion and a DSL, that simplicity was a relief.