Architecture

Design Principles

ColdBrew follows the 12-factor app methodology and is designed to run on Kubernetes from day one:

12-Factor Principle	How ColdBrew Implements It
Config	All configuration via environment variables — works with envconfig or any `env:`-tag loader like caarlos0/env. No config files, no YAML. See Configuration Reference
Port binding	Self-contained HTTP (`:9091`) and gRPC (`:9090`) servers, optional dedicated admin port (`ADMIN_PORT`) for endpoint isolation
Logs	Structured JSON to stdout by default — ready for any log aggregator (Fluentd, Loki, CloudWatch)
Disposability	Graceful SIGTERM handling with configurable drain periods. See Signals
Dev/prod parity	Same binary, same config mechanism, same observability in every environment
Concurrency	Stateless processes — scale horizontally by adding replicas
Backing services	External dependencies (databases, caches, queues) attached via environment variables

ColdBrew is Kubernetes-native: health/ready probe endpoints, Prometheus metrics scraping, graceful pod termination, and structured logging work without any additional setup. See the Production Deployment guide for K8s manifests and configuration.
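To make the Config row in the table above concrete, here is a minimal sketch of environment-only configuration using just the standard library. It is not ColdBrew's loader (the table names envconfig and caarlos0/env for that). The Config struct, the loadConfig helper, and the GRPC_PORT and HTTP_PORT variable names are hypothetical; the defaults mirror the ports listed in the table.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config holds the settings this sketch cares about. The field and
// variable names are hypothetical, not ColdBrew's real config struct.
type Config struct {
	GRPCPort int
	HTTPPort int
}

// lookupFunc matches the signature of os.LookupEnv so callers can swap it.
type lookupFunc func(key string) (string, bool)

// intFromEnv returns the integer value of key, or def when key is unset.
func intFromEnv(lookup lookupFunc, key string, def int) (int, error) {
	raw, ok := lookup(key)
	if !ok {
		return def, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", key, err)
	}
	return n, nil
}

// loadConfig builds a Config from the environment, falling back to the
// default ports mentioned in the table above.
func loadConfig(lookup lookupFunc) (Config, error) {
	grpcPort, err := intFromEnv(lookup, "GRPC_PORT", 9090)
	if err != nil {
		return Config{}, err
	}
	httpPort, err := intFromEnv(lookup, "HTTP_PORT", 9091)
	if err != nil {
		return Config{}, err
	}
	return Config{GRPCPort: grpcPort, HTTPPort: httpPort}, nil
}

func main() {
	cfg, err := loadConfig(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid configuration:", err)
		os.Exit(1)
	}
	fmt.Printf("gRPC on :%d, HTTP on :%d\n", cfg.GRPCPort, cfg.HTTPPort)
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the loader easy to exercise without touching the real process environment.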

Self-Documenting APIs

ColdBrew follows a define once, get everything approach. Your .proto file is the single source of truth — one buf generate produces everything your service needs:

                          ┌─── Go protobuf types         (*.pb.go)
                          ├─── gRPC service stubs         (*_grpc.pb.go)
  myservice.proto ──buf──►├─── HTTP/REST gateway handlers (*.gw.go)
                          ├─── OpenAPI/Swagger spec       (*.swagger.json)
                          └─── vtprotobuf fast codec      (*_vtproto.pb.go)

Each output maps to a self-documenting endpoint:

Output	Serves	How Clients Discover It
gRPC stubs	`:9090`	gRPC reflection — `grpcurl -plaintext localhost:9090 list`
HTTP gateway	`:9091/api/...`	Swagger UI at `/swagger/`
OpenAPI spec	`:9091/swagger/*.swagger.json`	Import into Postman, code generators, or API gateways
Health/version	`:9091/healthcheck`	Returns git commit, version, build date, Go version as JSON
Metrics	`:9091/metrics`	Prometheus self-describing exposition format with HELP lines
Profiling	`:9091/debug/pprof/`	Standard Go pprof index page

Tip: Set ADMIN_PORT to serve metrics, profiling, and swagger on a dedicated port. Health and readiness endpoints remain on :9091. See Security hardening.

Every client gets documentation for free:

gRPC clients use server reflection to discover services and methods without proto files
REST clients use the interactive Swagger UI or import the OpenAPI spec
Operations use health checks (build metadata), Prometheus metrics, and pprof

The HTTP annotations in your proto file define both the REST routes and their Swagger documentation simultaneously:

rpc Echo(EchoRequest) returns (EchoResponse) {
    option (google.api.http) = {
        post: "/api/v1/example/echo"
        body: "*"
    };
    option (grpc.gateway.protoc_gen_openapiv2.options.openapiv2_operation) = {
        summary: "Echo endpoint"
        description: "Returns the input message unchanged."
        tags: "example"
    };
}

This creates: a gRPC method, a POST /api/v1/example/echo REST endpoint, and a documented Swagger UI entry — all from one definition.

Type-Safe by Design

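buf generate produces a typed Go server interface for each proto service definition. When you add a new RPC method to your .proto file and regenerate, the Go compiler refuses to build until you implement it, provided your service type does not embed the generated Unimplemented*Server stub (which supplies default methods that return codes.Unimplemented at runtime). Under that condition you cannot forget an endpoint or deploy a half-implemented API.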

myservice.proto          buf generate                   Go compiler
─────────────── ───────────────────────────────► ─────────────────
rpc Echo(...)           Echo method on server interface    ✓ Implemented
rpc Greet(...)          Greet method on server interface   ✗ Build error until implemented

This means your proto file is the contract — the compiler enforces it, grpc-gateway serves it as REST, and the OpenAPI spec documents it. They can never drift from each other.
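The compiler check described above can be illustrated without any generated code. In the sketch below, ExampleServer is a hand-written stand-in for a generated server interface, and EchoRequest, EchoResponse, and svc are hypothetical types. The blank-identifier assignment is a common Go idiom for asserting at compile time that a type satisfies an interface.

```go
package main

import (
	"context"
	"fmt"
)

// EchoRequest and EchoResponse stand in for generated protobuf types.
type EchoRequest struct{ Msg string }
type EchoResponse struct{ Msg string }

// ExampleServer stands in for a generated server interface. Adding a Greet
// method here models adding `rpc Greet` to the proto file and regenerating.
type ExampleServer interface {
	Echo(ctx context.Context, req *EchoRequest) (*EchoResponse, error)
}

// svc is the hand-written implementation.
type svc struct{}

// Echo returns the input message unchanged.
func (svc) Echo(_ context.Context, req *EchoRequest) (*EchoResponse, error) {
	return &EchoResponse{Msg: req.Msg}, nil
}

// Compile-time assertion: the build fails on this line if svc stops
// satisfying ExampleServer, for example after a method is added to it.
var _ ExampleServer = svc{}

// echoOnce is a small helper so the example has observable output.
func echoOnce(s ExampleServer, msg string) string {
	resp, err := s.Echo(context.Background(), &EchoRequest{Msg: msg})
	if err != nil {
		return ""
	}
	return resp.Msg
}

func main() {
	fmt.Println(echoOnce(svc{}, "hello"))
}
```

The assertion only catches a missing method when the type does not embed a stub that already provides every method of the interface.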

Overview

ColdBrew is a layered framework where each layer is an independent Go module. The core package orchestrates everything, but you can use any package standalone.

Package Dependency Graph

                    ┌──────────────────┐
                    │       core       │
                    └────────┬─────────┘
                             │
                    ┌────────┴─────────┐
               ┌────┤   interceptors   ├────┐
               │    └────────┬─────────┘    │
               │             │              │
        ┌──────┴──────┐  ┌──┴───┐  ┌───────┴───────┐
        │ data-builder │  │ grpc │  │    tracing     │
        └──────┬──────┘  │ pool │  └───────┬───────┘
               │         └──┬───┘          │
               │            │              │
               │         ┌──┴───┐          │
               │         │  log ├──────────┘
               │         └──┬───┘
               │            │
               │      ┌─────┴──────┐
               │      │   errors   │
               │      └─────┬──────┘
               │            │
               └────────┬───┘
                   ┌────┴─────┐
                   │  options  │
                   └──────────┘

Dependency order (bottom to top):

options → errors → log → tracing → grpcpool → interceptors → data-builder → core

Each package only depends on packages below it. This means you can use errors without pulling in tracing, or use log without needing interceptors.

Request Lifecycle

When a request arrives at a ColdBrew service, it flows through several layers:

  Client Request
       │
       ▼
  ┌─────────────────────────────────────────────────┐
  │                   ColdBrew Core                  │
  │                                                  │
  │  ┌──────────────┐       ┌──────────────┐        │
  │  │  HTTP Gateway │       │  gRPC Server │        │
  │  │  (grpc-gw)   │──────►│  (port 9090) │        │
  │  │  (port 9091) │       └──────┬───────┘        │
  │  └──────────────┘              │                 │
  │                                ▼                 │
  │  ┌──────────────────────────────────────────┐   │
  │  │          Server Interceptor Chain         │   │
  │  │                                           │   │
  │  │   1. Default Timeout (60s deadline)       │   │
  │  │   2. Rate Limiting (token bucket)         │   │
  │  │   3. Response Time Logging                │   │
  │  │   4. Trace ID Injection                   │   │
  │  │   5. Debug Log (per-request level)        │   │
  │  │   6. Proto Validate                       │   │
  │  │   7. Prometheus Metrics                   │   │
  │  │   8. Error Notification (Sentry/Rollbar)  │   │
  │  │   9. New Relic Transaction                │   │
  │  │  10. Panic Recovery                       │   │
  │  │  (OTEL tracing via gRPC stats handler)    │   │
  │  │                                           │   │
  │  └────────────────────┬─────────────────────┘   │
  │                       │                          │
  │                       ▼                          │
  │              ┌─────────────────┐                 │
  │              │  Your Handler   │                 │
  │              │  (service.go)   │                 │
  │              └─────────────────┘                 │
  │                                                  │
  │  Built-in Endpoints:                             │
  │    /healthcheck    - Liveness probe              │
  │    /readycheck     - Readiness probe             │
  │                                                  │
  │  Admin Endpoints (movable to ADMIN_PORT):        │
  │    /metrics        - Prometheus                  │
  │    /debug/pprof/   - Go profiling                │
  │    /swagger/       - OpenAPI docs                │
  └─────────────────────────────────────────────────┘

HTTP → gRPC Translation

HTTP requests arriving at port 9091 are automatically translated to gRPC calls by grpc-gateway. The translation rules are defined in your .proto file via google.api.http annotations:

rpc Echo(EchoRequest) returns (EchoResponse) {
    option (google.api.http) = {
        post: "/api/v1/example/echo"
        body: "*"
    };
}

This means POST /api/v1/example/echo on port 9091 is translated to a gRPC call to Echo() on port 9090. The response is converted back to JSON automatically.
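Conceptually, the translation amounts to decoding the JSON body into the request message, calling the handler, and encoding the response. The sketch below hand-writes that flow with the standard library only. It is not the code grpc-gateway generates, it calls the handler in-process rather than over a gRPC connection, and the msg field and helper names are hypothetical.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// EchoRequest and EchoResponse stand in for generated protobuf types.
type EchoRequest struct {
	Msg string `json:"msg"`
}
type EchoResponse struct {
	Msg string `json:"msg"`
}

// echo plays the role of the gRPC handler.
func echo(_ context.Context, req *EchoRequest) (*EchoResponse, error) {
	return &EchoResponse{Msg: req.Msg}, nil
}

// newGateway returns a mux that maps POST /api/v1/example/echo onto echo.
func newGateway() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/v1/example/echo", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		// body: "*" means the whole JSON body maps onto the request message.
		var req EchoRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "invalid JSON body", http.StatusBadRequest)
			return
		}
		resp, err := echo(r.Context(), &req)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(resp)
	})
	return mux
}

// postEcho sends one request through the gateway and returns status and body.
func postEcho(body string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/api/v1/example/echo", strings.NewReader(body))
	rec := httptest.NewRecorder()
	newGateway().ServeHTTP(rec, req)
	out, _ := io.ReadAll(rec.Body)
	return rec.Code, strings.TrimSpace(string(out))
}

func main() {
	code, body := postEcho(`{"msg":"hi"}`)
	fmt.Println(code, body)
}
```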

Server Interceptor Chain

Interceptors are gRPC middleware that run on every request. ColdBrew chains them in this order:

Order	Interceptor	Package	What It Does
1	Default Timeout	`interceptors`	Applies a 60s deadline to unary RPCs without one. Prevents resource exhaustion from clients that don’t set deadlines. Config: `GRPC_SERVER_DEFAULT_TIMEOUT_IN_SECONDS`
2	Rate Limiting	`interceptors`	Per-pod token bucket rate limiter. Returns `ResourceExhausted` when exceeded. Disabled by default. Config: `RATE_LIMIT_PER_SECOND`, `RATE_LIMIT_BURST`
3	Response Time Logging	`interceptors`	Logs method name, duration, and status code
4	Trace ID	`interceptors`	Generates a trace ID (or reads it from the `x-trace-id` HTTP header or a `trace_id` proto field) and propagates it to structured logs, Sentry/Rollbar error reports, and OpenTelemetry spans (as the `coldbrew.trace_id` attribute)
5	Debug Log	`interceptors`	Enables per-request log level override via `bool debug` or `bool enable_debug` proto field, or `x-debug-log-level` metadata header. Config: `DISABLE_DEBUG_LOG_INTERCEPTOR`, `DEBUG_LOG_HEADER_NAME`
6	Proto Validate	`interceptors`	Validates incoming messages using protovalidate annotations. Returns `InvalidArgument` on failure. Config: `DISABLE_PROTO_VALIDATE`
7	Prometheus	`interceptors`	Records request count, latency histogram, and status codes
8	Error Notification	`interceptors`	Sends errors to Sentry/Rollbar/Airbrake asynchronously
9	New Relic	`interceptors`	Creates a New Relic transaction for APM
10	Panic Recovery	`interceptors`	Catches panics and converts them to gRPC errors

OpenTelemetry tracing spans are created by the otelgrpc stats handler configured at the gRPC server/client level, not as an interceptor in the chain.

Health checks, ready checks, and gRPC reflection are excluded by default via FilterMethods. This prevents observability noise from Kubernetes probes. See the FAQ for how to customize this.
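The chaining itself can be sketched in a few lines. The example below uses local handler and interceptor types that mirror the shape of gRPC's unary handler and unary server interceptor without importing grpc, and it assumes the first listed interceptor runs outermost (the convention used by grpc-go's ChainUnaryInterceptor). All names are hypothetical, and the stages are placeholders rather than ColdBrew's implementations.

```go
package main

import (
	"context"
	"fmt"
)

// handler and interceptor mirror the shape of gRPC's unary handler and
// unary server interceptor, without importing grpc.
type handler func(ctx context.Context, req any) (any, error)
type interceptor func(ctx context.Context, req any, next handler) (any, error)

// chain wraps h so that the first interceptor in the list runs outermost.
func chain(h handler, list ...interceptor) handler {
	for i := len(list) - 1; i >= 0; i-- {
		ic, next := list[i], h
		h = func(ctx context.Context, req any) (any, error) {
			return ic(ctx, req, next)
		}
	}
	return h
}

// named returns an interceptor that records when it is entered.
func named(name string, trace *[]string) interceptor {
	return func(ctx context.Context, req any, next handler) (any, error) {
		*trace = append(*trace, name)
		return next(ctx, req)
	}
}

// recoverPanic converts a panic in later stages into an error.
func recoverPanic(ctx context.Context, req any, next handler) (resp any, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return next(ctx, req)
}

// runOrder sends one request through the chain and returns the order in
// which the recording stages and the handler ran.
func runOrder() []string {
	var trace []string
	h := chain(
		func(ctx context.Context, req any) (any, error) {
			trace = append(trace, "handler")
			return req, nil
		},
		named("timeout", &trace),
		named("rate-limit", &trace),
		recoverPanic,
	)
	h(context.Background(), "ping")
	return trace
}

// runPanic shows that a panicking handler surfaces as an error.
func runPanic() error {
	h := chain(func(context.Context, any) (any, error) { panic("boom") }, recoverPanic)
	_, err := h(context.Background(), nil)
	return err
}

func main() {
	fmt.Println(runOrder())
	fmt.Println(runPanic())
}
```

Because each stage decides whether to call next, an early stage such as a rate limiter can reject a request before any later stage or the handler runs.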

Interceptor Chain Overhead

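The full interceptor chain costs roughly 10–12% of throughput under concurrent load (c=50 and c=200) compared to bare gRPC with no interceptors; at c=1 the gap is about 21%. Most of that overhead comes from per-request log writes (I/O), not the interceptor framework itself. Setting RESPONSE_TIME_LOG_ERROR_ONLY=true, combined with disabling histograms as in the Tuned row below, recovers roughly a third to half of that gap.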

End-to-end throughput measured on Apple M1 Pro (loopback, ghz load test, simple Echo handler):

Configuration	RPS @ c=1	RPS @ c=50	RPS @ c=200	Avg @ c=1	P99 @ c=200
Default (all interceptors)	5,500	40,900	50,000	0.12ms	7.9ms
Tuned (error-only logging, no histograms)	6,300	42,700	53,200	0.10ms	7.3ms
No interceptors (bare gRPC)	7,000	46,600	55,800	0.09ms	7.2ms

Micro-benchmark of the default interceptor chain: ~4.1µs, ~1.5KB, ~37 allocs per unary request. Reproduce with:

go test -run='^$' -bench=BenchmarkDefaultInterceptors -benchmem ./...

The tuned configuration uses RESPONSE_TIME_LOG_ERROR_ONLY=true and ENABLE_PROMETHEUS_GRPC_HISTOGRAM=false. See the Configuration Reference for the full set of tuning knobs.
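The overhead percentages can be recomputed from the RPS columns of the table above. The short program below does that arithmetic; overheadPct is a hypothetical helper name, and the inputs are the Default and bare-gRPC figures copied from the table.

```go
package main

import "fmt"

// overheadPct returns how much lower rps is than baseline, in percent.
func overheadPct(rps, baseline float64) float64 {
	return (baseline - rps) / baseline * 100
}

func main() {
	// Default (all interceptors) versus bare gRPC, from the table above.
	fmt.Printf("c=1:   %.1f%%\n", overheadPct(5500, 7000))
	fmt.Printf("c=50:  %.1f%%\n", overheadPct(40900, 46600))
	fmt.Printf("c=200: %.1f%%\n", overheadPct(50000, 55800))
}
```

The same helper applied to the Tuned row shows how much of the gap the two tuning settings recover at each concurrency level.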

Adding Custom Interceptors

You can prepend your own interceptors to the chain:

func init() {
    interceptors.AddUnaryServerInterceptor(context.Background(), myCustomInterceptor)
}

Interceptor configuration must happen during init(). These functions are not safe for concurrent use.

Client Interceptor Chain

When your service calls other gRPC services, ColdBrew applies client-side interceptors:

Interceptor	What It Does
Hystrix	Circuit breaking (deprecated — consider failsafe-go)
Retry	Automatic retries with backoff

Trace context propagation to downstream services is handled by the otelgrpc client stats handler, not a chain interceptor.
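For readers unfamiliar with the Retry row above, the following is a minimal sketch of retrying with exponential backoff using only the standard library. It is not ColdBrew's retry interceptor; the retry and callsUntilSuccess helpers are hypothetical, and a production client would normally restrict retries to idempotent calls and retryable status codes.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retry calls fn up to attempts times, doubling the wait after each failure.
// It stops early when ctx is cancelled.
func retry(ctx context.Context, attempts int, base time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	wait := base
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-time.After(wait):
			wait *= 2
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return fmt.Errorf("after %d attempts: %w", attempts, err)
}

// callsUntilSuccess reports how many calls were made to a function that
// fails the given number of times before succeeding.
func callsUntilSuccess(failures, attempts int) (int, error) {
	calls := 0
	err := retry(context.Background(), attempts, time.Millisecond, func() error {
		calls++
		if calls <= failures {
			return errors.New("temporarily unavailable")
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := callsUntilSuccess(2, 5)
	fmt.Println(calls, err)
}
```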

Context Propagation

ColdBrew uses context.Context to propagate metadata through every layer:

  context.Context
       │
       ├── options (key-value metadata)
       │     Set: options.AddToOptions(ctx, key, value)
       │     Get: options.FromContext(ctx).Get(key)
       │
       ├── log fields (per-request structured logging)
       │     Add: log.AddToContext(ctx, key, value)
       │     Used by: interceptors, your handlers
       │
       ├── trace span (distributed tracing)
       │     Create: tracing.NewInternalSpan(ctx, "operation")
       │     Propagated by: OTEL gRPC stats handler
       │
       └── trace ID (request correlation)
             Injected by: Trace ID interceptor
             Available in: log output, error reports, OpenTelemetry spans (`coldbrew.trace_id` attribute)

Every interceptor reads from and writes to the context. By the time the request reaches your handler, the context carries:

A unique trace ID for log correlation
An active tracing span for distributed tracing
Options set by upstream services
Log fields added by interceptors

Deployment Topology

A typical ColdBrew service exposes two ports (optionally three with ADMIN_PORT):

  ┌─────────────────────────────────────────┐
  │           ColdBrew Service              │
  │                                         │
  │   Port 9090 (gRPC)                      │
  │   ├── Your gRPC service                 │
  │   ├── Health.Check (grpc health v1)     │
  │   └── ServerReflection                  │
  │                                         │
  │   Port 9091 (HTTP)                      │
  │   ├── /api/...     (REST gateway)       │
  │   ├── /healthcheck (liveness probe)     │
  │   ├── /readycheck  (readiness probe)    │
  │   ├── /metrics     (Prometheus)*        │
  │   ├── /swagger/    (OpenAPI UI)*        │
  │   └── /debug/pprof/ (profiling)*        │
  │                                         │
  │   ADMIN_PORT (optional, e.g. 9092)      │
  │   ├── /metrics     (Prometheus)         │
  │   ├── /swagger/    (OpenAPI UI)         │
  │   └── /debug/pprof/ (profiling)         │
  └─────────────────────────────────────────┘

  * When ADMIN_PORT is set, these endpoints move
    to the admin port and are removed from 9091.

Kubernetes Integration

ColdBrew is designed for Kubernetes deployments:

Liveness probe: GET /healthcheck — returns build/version info as JSON (git commit, version, build date, Go version, OS/arch)
Readiness probe: GET /readycheck — returns the same version JSON when ready for traffic, or an error if the service hasn’t called SetReady() yet
gRPC health protocol: Implements grpc.health.v1.Health (standard gRPC health checking) on the gRPC port — used by gRPC load balancers, Envoy, Istio, and other service meshes for native health checking
Graceful shutdown: On SIGTERM, the service marks itself as not ready, drains in-flight requests, then exits cleanly
Metrics scraping: Prometheus scrapes /metrics on the HTTP port (or ADMIN_PORT when configured)
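The readiness behavior described above can be sketched with an atomic flag and a plain HTTP handler. This is an illustration only: the readiness type and its methods are hypothetical, the 503 status for the not-ready case is an assumption of the sketch, and the real endpoint returns version JSON rather than "ok".

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// readiness tracks whether the service should receive traffic.
type readiness struct{ ready atomic.Bool }

func (r *readiness) SetReady()    { r.ready.Store(true) }
func (r *readiness) SetNotReady() { r.ready.Store(false) }

// ServeHTTP answers 200 when ready and 503 otherwise. The 503 status is an
// assumption of this sketch.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	if !r.ready.Load() {
		http.Error(w, "not ready", http.StatusServiceUnavailable)
		return
	}
	fmt.Fprintln(w, "ok")
}

// probe issues one GET /readycheck against r and returns the status code.
func probe(r *readiness) int {
	rec := httptest.NewRecorder()
	r.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/readycheck", nil))
	return rec.Code
}

func main() {
	r := &readiness{}
	fmt.Println(probe(r)) // before startup completes
	r.SetReady()
	fmt.Println(probe(r)) // serving traffic
	r.SetNotReady() // what a SIGTERM handler would do first
	fmt.Println(probe(r))
}
```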

Gateway Performance Options

By default, the HTTP gateway connects to the gRPC server via TCP loopback (localhost:9090). Two options are available for lower latency:

Option 1: Unix domain socket (opt-in, zero code changes)

Set DISABLE_UNIX_GATEWAY=false to route the gateway’s internal connection through a Unix socket. This reduces gateway-to-gRPC latency from ~67µs to ~36µs (1.9x improvement) by bypassing TCP overhead. The TCP gRPC port remains available for external clients. If socket creation fails, the gateway silently falls back to TCP.

When gRPC TLS is configured (GRPC_TLS_CERT_FILE + GRPC_TLS_KEY_FILE), the unix socket is automatically skipped — grpc.Server applies TLS to all listeners, and the gateway falls back to TCP with proper TLS credentials.
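The fallback described for Option 1 follows a common pattern: try the Unix socket, and use TCP if that fails. The sketch below shows the pattern with the standard library's net.Listen. It is not ColdBrew's implementation, it assumes a platform that supports Unix domain sockets, and listenPreferUnix, networkOf, and demo are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// listenPreferUnix tries a Unix domain socket first and falls back to TCP.
func listenPreferUnix(socketPath, tcpAddr string) (net.Listener, error) {
	if socketPath != "" {
		if ln, err := net.Listen("unix", socketPath); err == nil {
			return ln, nil
		}
		// Socket creation failed: fall through and use TCP instead.
	}
	return net.Listen("tcp", tcpAddr)
}

// networkOf reports which network the chosen listener ended up on.
func networkOf(socketPath, tcpAddr string) string {
	ln, err := listenPreferUnix(socketPath, tcpAddr)
	if err != nil {
		return "error: " + err.Error()
	}
	defer ln.Close()
	return ln.Addr().Network()
}

// demo tries one usable and one unusable socket path.
func demo() (string, string) {
	dir, err := os.MkdirTemp("", "gw")
	if err != nil {
		return "error: " + err.Error(), ""
	}
	defer os.RemoveAll(dir)
	usable := networkOf(filepath.Join(dir, "gw.sock"), "127.0.0.1:0")
	unusable := networkOf(filepath.Join(dir, "missing", "gw.sock"), "127.0.0.1:0")
	return usable, unusable
}

func main() {
	fmt.Println(demo())
}
```

The second path points into a directory that does not exist, so the Unix listener cannot be created and the TCP listener is used instead.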

Option 2: In-process gateway via DoHTTPtoGRPC (maximum performance)

For zero network hop, use RegisterHandlerServer instead of RegisterHandlerFromEndpoint in your InitHTTP, and wrap each gRPC method with interceptors.DoHTTPtoGRPC(). This calls the gRPC handler in-process while preserving the full interceptor chain (logging, tracing, metrics, panic recovery). Requires a per-method wrapper but eliminates all network overhead.

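// Echo wraps the real implementation (s.echo) so that in-process gateway
// calls still pass through the interceptor chain.
func (s *svc) Echo(ctx context.Context, req *proto.EchoRequest) (*proto.EchoResponse, error) {
    // Adapt the typed implementation to the untyped handler signature.
    handler := func(ctx context.Context, in interface{}) (interface{}, error) {
        return s.echo(ctx, in.(*proto.EchoRequest))
    }
    // Run the handler in-process through the interceptors.
    r, err := interceptors.DoHTTPtoGRPC(ctx, s, handler, req)
    if err != nil {
        return nil, err
    }
    return r.(*proto.EchoResponse), nil
}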

Approach	Latency	Code changes	Trade-offs
TCP loopback (default)	~67µs	None	Simplest, most compatible
Unix socket	~36µs	None (config only)	1.9x faster, opt-in
`DoHTTPtoGRPC`	~19µs	Per-method wrapper	Fastest, requires code changes

Startup Sequence

1. Configuration loaded from environment variables (core.New(cfg))
2. Interceptor chain assembled during init() (not thread-safe)
3. PreStart(ctx) on services implementing CBPreStarter: DB connections, schema migrations, programmatic interceptor configuration. Returning an error aborts startup.
4. gRPC server created, service registers handlers (InitGRPC)
5. Unix socket created for gateway (if DISABLE_UNIX_GATEWAY=false)
6. HTTP server created, service registers handlers (InitHTTP)
7. gRPC and HTTP servers start listening concurrently
8. Admin server starts (if ADMIN_PORT is set)
9. PostStart(ctx) on services implementing CBPostStarter: service-discovery registration, post-startup metrics emission
10. Service marks itself as ready (SetReady()); /readycheck starts succeeding
11. Server blocks until shutdown signal
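The optional lifecycle hooks in this sequence follow a familiar Go pattern: the framework checks at runtime whether the service implements a small interface and calls the hook only if it does. The sketch below illustrates that pattern. The preStarter and postStarter interfaces are hypothetical stand-ins for CBPreStarter and CBPostStarter (the real signatures may differ), and the step names are labels for illustration only.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// preStarter and postStarter are hypothetical stand-ins for the
// CBPreStarter and CBPostStarter interfaces; real signatures may differ.
type preStarter interface{ PreStart(ctx context.Context) error }
type postStarter interface{ PostStart(ctx context.Context) }

// start runs the optional hooks around serve and returns the steps taken.
func start(ctx context.Context, svc any, serve func()) ([]string, error) {
	var steps []string
	if p, ok := svc.(preStarter); ok {
		steps = append(steps, "PreStart")
		if err := p.PreStart(ctx); err != nil {
			return steps, fmt.Errorf("startup aborted: %w", err)
		}
	}
	steps = append(steps, "serve")
	serve()
	if p, ok := svc.(postStarter); ok {
		steps = append(steps, "PostStart")
		p.PostStart(ctx)
	}
	steps = append(steps, "SetReady")
	return steps, nil
}

// dbService implements only the PreStart hook.
type dbService struct{ failConnect bool }

func (s dbService) PreStart(context.Context) error {
	if s.failConnect {
		return errors.New("database unreachable")
	}
	return nil
}

// run starts a dbService with a no-op serve function.
func run(failConnect bool) ([]string, error) {
	return start(context.Background(), dbService{failConnect: failConnect}, func() {})
}

func main() {
	fmt.Println(run(false))
	fmt.Println(run(true))
}
```

A service that implements none of the hooks still starts normally, because each type assertion simply reports false and the hook is skipped.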

Shutdown Sequence

1. SIGTERM/SIGINT received
2. PreStop(ctx) on services implementing CBPreStopper: deregister from service discovery, flush buffers
3. FailCheck(true) on CBGracefulStopper services; /readycheck starts failing
4. Wait GRPC_GRACEFUL_DURATION_IN_SECONDS (default 7s) for load balancer to drain
5. Cancel worker context, wait for workers to exit
6. Shutdown admin server (if configured)
7. Shutdown HTTP server (stop accepting new requests)
8. GracefulStop() gRPC server (finish in-flight RPCs, reject new ones)
9. Force-stop gRPC server if graceful shutdown didn’t complete in time
10. Stop() called on CBStopper services (your cleanup logic)
11. PostStop(ctx) on services implementing CBPostStopper: final cleanup, audit log close
12. Exit
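The graceful-then-forced stop of the gRPC server in this sequence can be sketched as a select between completion and a timer. The example below uses a fake server so it runs on the standard library alone. fakeServer, stopWithTimeout, and shutdownMs are hypothetical names, and the sketch assumes, as grpc-go documents, that a forced Stop unblocks a pending GracefulStop.

```go
package main

import (
	"fmt"
	"time"
)

// fakeServer models a gRPC server with one in-flight RPC. It is a stand-in
// for illustration only.
type fakeServer struct {
	inflight time.Duration // time the in-flight RPC still needs
	stopped  chan struct{} // closed by Stop
}

// GracefulStop blocks until the in-flight RPC finishes or Stop is called.
func (s *fakeServer) GracefulStop() {
	select {
	case <-time.After(s.inflight):
	case <-s.stopped:
	}
}

// Stop aborts in-flight work immediately.
func (s *fakeServer) Stop() { close(s.stopped) }

// stopWithTimeout tries a graceful stop and forces it after timeout.
// It reports which path was taken.
func stopWithTimeout(s *fakeServer, timeout time.Duration) string {
	done := make(chan struct{})
	go func() {
		s.GracefulStop()
		close(done)
	}()
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case <-done:
		return "graceful"
	case <-timer.C:
		s.Stop() // unblocks GracefulStop
		<-done
		return "forced"
	}
}

// shutdownMs is a convenience wrapper that takes milliseconds.
func shutdownMs(inflightMs, timeoutMs int) string {
	s := &fakeServer{
		inflight: time.Duration(inflightMs) * time.Millisecond,
		stopped:  make(chan struct{}),
	}
	return stopWithTimeout(s, time.Duration(timeoutMs)*time.Millisecond)
}

func main() {
	fmt.Println(shutdownMs(1, 500))  // RPC finishes well within the timeout
	fmt.Println(shutdownMs(500, 10)) // RPC outlives the timeout
}
```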

See Signal Handling and Graceful Shutdown for the full lifecycle interface table, configuration, and tuning details.