Configuration Reference

Table of contents

  1. Server
  2. Logging
  3. gRPC Server
  4. gRPC TLS
  5. gRPC Keepalive
  6. HTTP Gateway
  7. Prometheus Metrics
  8. New Relic
  9. OpenTelemetry (OTLP)
  10. Error Tracking
  11. Graceful Shutdown
  12. Response Time Logging
  13. Runtime
  14. Deprecated
  15. Example: Minimal Production Configuration
  16. Example: Local Development with Jaeger (via OTLP)
  17. Example: High-Throughput Production
    1. Measured tuning impact

ColdBrew is configured entirely through environment variables using envconfig. All fields have sensible defaults — you can run a service with zero configuration.

Access the config in code via:

cfg := config.GetColdBrewConfig()
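
To illustrate the "environment variable with a default" model described above, here is a minimal standard-library sketch. It is not ColdBrew's loader and does not use envconfig; `serverSettings`, `lookupFunc`, and `loadServerSettings` are hypothetical names, and only four variables from this reference are covered.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// serverSettings is a hypothetical subset of the Server and Logging
// variables; it is not ColdBrew's actual config struct.
type serverSettings struct {
	ListenHost string
	GRPCPort   int
	HTTPPort   int
	JSONLogs   bool
}

// lookupFunc matches the signature of os.LookupEnv so values can be injected.
type lookupFunc func(key string) (string, bool)

// loadServerSettings starts from the documented defaults and overrides a
// field only when its variable is set.
func loadServerSettings(lookup lookupFunc) (serverSettings, error) {
	s := serverSettings{ListenHost: "0.0.0.0", GRPCPort: 9090, HTTPPort: 9091, JSONLogs: true}
	var err error
	if v, ok := lookup("LISTEN_HOST"); ok {
		s.ListenHost = v
	}
	if v, ok := lookup("GRPC_PORT"); ok {
		if s.GRPCPort, err = strconv.Atoi(v); err != nil {
			return s, fmt.Errorf("GRPC_PORT: %w", err)
		}
	}
	if v, ok := lookup("HTTP_PORT"); ok {
		if s.HTTPPort, err = strconv.Atoi(v); err != nil {
			return s, fmt.Errorf("HTTP_PORT: %w", err)
		}
	}
	if v, ok := lookup("JSON_LOGS"); ok {
		if s.JSONLogs, err = strconv.ParseBool(v); err != nil {
			return s, fmt.Errorf("JSON_LOGS: %w", err)
		}
	}
	return s, nil
}

func main() {
	s, err := loadServerSettings(os.LookupEnv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", s)
}
```

Running this with no variables set prints the defaults from the tables below; exporting `HTTP_PORT=8080` changes only that field.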

Server

Variable Type Default Description
LISTEN_HOST string 0.0.0.0 Host address to listen on
GRPC_PORT int 9090 gRPC server port
HTTP_PORT int 9091 HTTP gateway port
ADMIN_PORT int 0 (disabled) Dedicated port for admin endpoints (pprof, metrics, swagger). When set to a non-zero value, these endpoints are served on this port instead of HTTP_PORT, enabling network-level isolation via Kubernetes NetworkPolicy. See Security hardening
APP_NAME string "" Application name (used in logs, metrics, New Relic)
ENVIRONMENT string "" Environment name (e.g., production, staging, development)
RELEASE_NAME string "" Release/version name
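
The effect of ADMIN_PORT can be pictured as moving the admin routes from the main HTTP mux to a second one. The sketch below uses only net/http and placeholder handlers; `buildMuxes`, `placeholder`, and `status` are hypothetical helpers, and the `/api/` route stands in for the gateway. It is an illustration of the routing idea, not ColdBrew's implementation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// placeholder returns a handler that just writes its name.
func placeholder(name string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, name)
	}
}

// buildMuxes returns the application mux and, when adminPort is non-zero,
// a separate admin mux that owns the metrics, pprof, and swagger routes.
// With adminPort == 0 the admin routes stay on the application mux and the
// returned admin mux is nil.
func buildMuxes(adminPort int) (app, admin *http.ServeMux) {
	app = http.NewServeMux()
	app.Handle("/api/", placeholder("gateway"))

	target := app
	if adminPort != 0 {
		admin = http.NewServeMux()
		target = admin
	}
	target.Handle("/metrics", placeholder("metrics"))
	target.Handle("/debug/", placeholder("pprof"))
	target.Handle("/swagger/", placeholder("swagger"))
	return app, admin
}

// status issues an in-memory GET against h and returns the status code.
func status(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	app, admin := buildMuxes(9092) // 9092 is an arbitrary example port
	fmt.Println("app   /metrics:", status(app, "/metrics"))
	fmt.Println("admin /metrics:", status(admin, "/metrics"))
}
```

Because the admin routes live on their own listener in this arrangement, a NetworkPolicy can allow scrapers and operators to reach that port while keeping it closed to general traffic.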

Logging

Variable Type Default Description
LOG_LEVEL string info Log level: debug, info, warn, error. For per-request debugging, use log.OverrideLogLevel(ctx, loggers.DebugLevel) — combined with the trace ID, this lets you enable debug logging for a single request and follow it across services. See Log How-To
JSON_LOGS bool true Emit logs in JSON format

gRPC Server

Variable Type Default Description
DISABLE_GRPC_REFLECTION bool false Disable gRPC server reflection (used by tools like grpcurl)
DO_NOT_LOG_GRPC_REFLECTION bool true Suppress logging of gRPC reflection API calls
GRPC_MAX_SEND_MSG_SIZE int 2147483647 Maximum response size in bytes — limits how large a response your service can send back to clients (default: ~2GB). Consider reducing for public-facing services; use streaming RPCs for large payloads
GRPC_MAX_RECV_MSG_SIZE int 4194304 Maximum request size in bytes — limits how large a request your service accepts from clients (default: 4MB)
DISABLE_VT_PROTOBUF bool false Disable vtprotobuf marshaller for gRPC. See vtprotobuf guide
DISABLE_PROTO_VALIDATE bool false Disable protovalidate interceptor. When disabled, proto validation annotations are ignored
DISABLE_DEBUG_LOG_INTERCEPTOR bool false Disable the DebugLogInterceptor. When disabled, proto debug/enable_debug fields and x-debug-log-level headers will not trigger per-request debug logging
DEBUG_LOG_HEADER_NAME string x-debug-log-level gRPC metadata / HTTP header name for per-request debug logging. The header value should be a valid log level (debug, info, warn, error). See Log How-To
GRPC_SERVER_DEFAULT_TIMEOUT_IN_SECONDS int 60 Default timeout for incoming unary gRPC requests without a deadline. Set to 0 to disable. Does not apply to stream RPCs
RATE_LIMIT_PER_SECOND float64 0 Maximum incoming requests per second for this pod (per-pod in-memory token bucket). Set to 0 to disable (default). With N pods, effective cluster-wide limit is N × this value. For distributed rate limiting, use interceptors.SetRateLimiter() with a custom implementation
RATE_LIMIT_BURST int 1 Maximum burst size for the token bucket rate limiter. Only takes effect when RATE_LIMIT_PER_SECOND > 0
DISABLE_RATE_LIMIT bool false Disable the rate limiting interceptor entirely
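
The interaction between RATE_LIMIT_PER_SECOND and RATE_LIMIT_BURST is easier to see in a toy token bucket. The following is a simplified, single-goroutine sketch with time passed in as seconds; `tokenBucket`, `newTokenBucket`, and `clusterLimit` are hypothetical names, and ColdBrew's own limiter may differ in detail.

```go
package main

import "fmt"

// tokenBucket is a minimal in-memory token bucket. It is not safe for
// concurrent use; a real limiter would need a mutex or atomics.
type tokenBucket struct {
	rate   float64 // tokens added per second (RATE_LIMIT_PER_SECOND)
	burst  float64 // bucket capacity (RATE_LIMIT_BURST)
	tokens float64 // tokens currently available
	last   float64 // time of the previous call, in seconds
}

// newTokenBucket starts with a full bucket at time zero.
func newTokenBucket(perSecond float64, burst int) *tokenBucket {
	return &tokenBucket{rate: perSecond, burst: float64(burst), tokens: float64(burst)}
}

// allow reports whether a request arriving at time now (seconds) may proceed.
// A rate of 0 means the limiter is disabled, matching the documented default.
func (b *tokenBucket) allow(now float64) bool {
	if b.rate <= 0 {
		return true
	}
	b.tokens += (now - b.last) * b.rate
	b.last = now
	if b.tokens > b.burst {
		b.tokens = b.burst // never accumulate more than the burst size
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// clusterLimit is the effective cluster-wide limit for per-pod limiters.
func clusterLimit(pods int, perPod float64) float64 {
	return float64(pods) * perPod
}

func main() {
	b := newTokenBucket(10, 2)
	for _, t := range []float64{0, 0, 0, 0.5} {
		fmt.Printf("t=%.1fs allowed=%v\n", t, b.allow(t))
	}
	fmt.Println("cluster-wide limit with 3 pods:", clusterLimit(3, 10))
}
```

With a burst of 2, the third request in the same instant is rejected, and capacity returns as time passes. Since each pod keeps its own bucket, scaling the deployment also scales the total admitted rate.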

gRPC TLS

Variable Type Default Description
GRPC_TLS_KEY_FILE string "" Path to TLS private key file. Both key and cert must be set to enable TLS
GRPC_TLS_CERT_FILE string "" Path to TLS certificate file. Both key and cert must be set to enable TLS
GRPC_TLS_INSECURE_SKIP_VERIFY bool false Skip TLS certificate verification (development only)

gRPC Keepalive

Variable Type Default Description
GRPC_SERVER_MAX_CONNECTION_IDLE_IN_SECONDS int 300 Close idle connections after this duration. Set to -1 to disable this limit
GRPC_SERVER_MAX_CONNECTION_AGE_IN_SECONDS int 1800 Maximum connection lifetime with ±10% jitter. Set to -1 to disable this limit
GRPC_SERVER_MAX_CONNECTION_AGE_GRACE_IN_SECONDS int 30 Grace period after max connection age before force-closing. Set to -1 to disable this limit

To allow connections to remain open indefinitely, set both GRPC_SERVER_MAX_CONNECTION_IDLE_IN_SECONDS and GRPC_SERVER_MAX_CONNECTION_AGE_IN_SECONDS to -1.
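
As a rough illustration of how these second-based settings translate into durations, the sketch below maps -1 to an effectively infinite duration and computes the window implied by ±10% jitter. `limitFromSeconds` and `jitterWindow` are hypothetical helpers written for this page; they show the arithmetic only and do not reproduce ColdBrew's or gRPC's code.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// noLimit is the largest representable duration, used here to mean
// "never close the connection for this reason".
const noLimit = time.Duration(math.MaxInt64)

// limitFromSeconds converts a keepalive setting to a duration.
// The documented sentinel -1 disables the limit.
func limitFromSeconds(seconds int) time.Duration {
	if seconds == -1 {
		return noLimit
	}
	return time.Duration(seconds) * time.Second
}

// jitterWindow returns the earliest and latest point at which a connection
// could be closed when ±10% jitter is applied to the maximum age.
func jitterWindow(age time.Duration) (earliest, latest time.Duration) {
	delta := age / 10
	return age - delta, age + delta
}

func main() {
	age := limitFromSeconds(1800) // default GRPC_SERVER_MAX_CONNECTION_AGE_IN_SECONDS
	lo, hi := jitterWindow(age)
	fmt.Printf("max age %v, closes between %v and %v\n", age, lo, hi)
	fmt.Println("idle limit disabled:", limitFromSeconds(-1) == noLimit)
}
```

With the default of 1800 seconds, a connection is therefore recycled somewhere between 27 and 33 minutes after it was opened, which spreads reconnects out instead of having every client reconnect at once.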

HTTP Gateway

Variable Type Default Description
DISABLE_SWAGGER bool false Disable Swagger UI at the swagger URL
SWAGGER_URL string /swagger/ URL path for Swagger UI
DISABLE_DEBUG bool false Disable pprof debug endpoints at /debug/
USE_JSON_BUILTIN_MARSHALLER bool false Use encoding/json instead of the default protojson marshaller for application/json
JSON_BUILTIN_MARSHALLER_MIME string application/json Content-Type for the JSON builtin marshaller
HTTP_HEADER_PREFIXES []string "" HTTP header prefixes to forward as gRPC metadata (comma-separated)
TRACE_HEADER_NAME string x-trace-id HTTP header name for trace ID propagation to log/trace contexts
DISABLE_HTTP_COMPRESSION bool false Disable gzip/zstd compression for HTTP gateway responses
HTTP_COMPRESSION_MIN_SIZE int 256 Minimum response body size (bytes) before compression is applied. Responses smaller than this are sent uncompressed
DISABLE_UNIX_GATEWAY bool true Disable Unix domain socket for HTTP gateway’s internal gRPC connection. Set to false to enable (~1.9x faster than TCP loopback). Ignored when gRPC TLS is configured. See Gateway Performance Options
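
HTTP_HEADER_PREFIXES takes a comma-separated list, and a header is forwarded when its name starts with one of the prefixes. The sketch below shows one way such matching could work. `parsePrefixes` and `shouldForward` are hypothetical helpers; trimming whitespace and matching case-insensitively are assumptions made here (HTTP header names are case-insensitive), not confirmed ColdBrew behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePrefixes splits a comma-separated value into lowercase prefixes,
// dropping empty entries.
func parsePrefixes(raw string) []string {
	var prefixes []string
	for _, p := range strings.Split(raw, ",") {
		if p = strings.TrimSpace(p); p != "" {
			prefixes = append(prefixes, strings.ToLower(p))
		}
	}
	return prefixes
}

// shouldForward reports whether the header name starts with any prefix.
func shouldForward(header string, prefixes []string) bool {
	name := strings.ToLower(header)
	for _, p := range prefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}

func main() {
	// Example value for HTTP_HEADER_PREFIXES; the prefixes are made up.
	prefixes := parsePrefixes("X-Tenant-, x-request-")
	for _, h := range []string{"X-Tenant-Id", "x-request-source", "Authorization"} {
		fmt.Printf("%-18s forward=%v\n", h, shouldForward(h, prefixes))
	}
}
```

An empty value yields no prefixes, so nothing is forwarded by this rule, which is consistent with the empty default in the table.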

Prometheus Metrics

Variable Type Default Description
DISABLE_PROMETHEUS bool false Disable Prometheus metrics endpoint at /metrics
ENABLE_PROMETHEUS_GRPC_HISTOGRAM bool true Enable gRPC request latency histograms (grpc_server_handling_seconds). When false, latency percentile queries and alerts stop working — only counters remain
PROMETHEUS_GRPC_HISTOGRAM_BUCKETS []float64 "" Custom histogram buckets (comma-separated seconds, e.g., 0.005,0.01,0.025,0.05,0.1,0.25,0.5,1,2.5,5,10)
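
A value for PROMETHEUS_GRPC_HISTOGRAM_BUCKETS is worth validating before deployment, since Prometheus histograms expect bucket boundaries in increasing order. The following sketch parses and checks such a value; `parseBuckets` is a hypothetical helper, and rejecting out-of-order input (rather than sorting it) is a choice made for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBuckets converts a comma-separated list of seconds into bucket
// boundaries. An empty value returns nil, meaning "use the defaults".
func parseBuckets(raw string) ([]float64, error) {
	if strings.TrimSpace(raw) == "" {
		return nil, nil
	}
	parts := strings.Split(raw, ",")
	buckets := make([]float64, 0, len(parts))
	for _, part := range parts {
		v, err := strconv.ParseFloat(strings.TrimSpace(part), 64)
		if err != nil {
			return nil, fmt.Errorf("invalid bucket %q: %w", part, err)
		}
		if n := len(buckets); n > 0 && v <= buckets[n-1] {
			return nil, fmt.Errorf("bucket %v must be greater than %v", v, buckets[n-1])
		}
		buckets = append(buckets, v)
	}
	return buckets, nil
}

func main() {
	buckets, err := parseBuckets("0.005,0.01,0.025,0.05,0.1,0.25,0.5,1,2.5,5,10")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(buckets), "buckets:", buckets)
}
```

When choosing boundaries, it helps to place several of them around your latency objective, because percentile estimates are only as precise as the bucket that contains them.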

New Relic

Variable Type Default Description
NEW_RELIC_LICENSE_KEY string "" New Relic license key (required to enable New Relic)
NEW_RELIC_APPNAME string "" Application name in New Relic
DISABLE_NEW_RELIC bool false Disable all New Relic reporting. Note: automatically set to true at startup when NEW_RELIC_LICENSE_KEY is empty, so the effective default for services without a license key is true
NEW_RELIC_DISTRIBUTED_TRACING bool true Enable New Relic distributed tracing
NEW_RELIC_OPENTELEMETRY bool true Enable New Relic via OpenTelemetry
NEW_RELIC_OPENTELEMETRY_SAMPLE float64 0.1 Trace sampling ratio for New Relic OpenTelemetry (0.0–1.0)

OpenTelemetry (OTLP)

When OTLP_ENDPOINT is set, it takes precedence over New Relic OpenTelemetry configuration.

Variable Type Default Description
OTLP_ENDPOINT string "" OTLP gRPC endpoint (e.g., localhost:4317, api.honeycomb.io:443)
OTLP_HEADERS string "" Custom headers as key=value pairs (comma-separated, e.g., x-honeycomb-team=your-key)
OTLP_COMPRESSION string gzip Compression type: gzip or none
OTLP_INSECURE bool false Disable TLS for OTLP connection (development only)
OTLP_SAMPLING_RATIO float64 0.1 Trace sampling ratio (0.0–1.0, where 1.0 = sample all)
OTLP_USE_OPENTRACING_BRIDGE bool false Deprecated. Ignored — OpenTracing bridge has been removed. If set to true, a warning is logged at startup
OTEL_USE_LEGACY_INSTRUMENTATION bool false Revert to legacy otelgrpc-based gRPC OpenTelemetry instrumentation. Set to true only for rollback
ENABLE_OTEL_METRICS bool false Enable OpenTelemetry metrics export via OTLP alongside Prometheus. Does not replace Prometheus
OTEL_METRICS_INTERVAL int 60 Export interval in seconds for OTEL metrics (only applies when ENABLE_OTEL_METRICS=true)
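
OTLP_HEADERS packs several headers into one string, so a quick way to check a value is to parse it the way the format implies. This is a minimal sketch under the stated format (comma-separated key=value pairs); `parseHeaders` is a hypothetical helper, and details such as whitespace trimming and how malformed pairs are treated are assumptions, not ColdBrew's documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeaders splits "k1=v1,k2=v2" into a map. Only the first "=" in each
// pair separates key from value, so values may themselves contain "=".
func parseHeaders(raw string) (map[string]string, error) {
	headers := map[string]string{}
	for _, pair := range strings.Split(raw, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue // tolerate empty input and trailing commas
		}
		key, value, found := strings.Cut(pair, "=")
		key = strings.TrimSpace(key)
		if !found || key == "" {
			return nil, fmt.Errorf("malformed header %q, want key=value", pair)
		}
		headers[key] = strings.TrimSpace(value)
	}
	return headers, nil
}

func main() {
	// The second header name is made up for illustration.
	headers, err := parseHeaders("x-honeycomb-team=your-key, x-example-dataset=myservice")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for k, v := range headers {
		fmt.Printf("%s: %s\n", k, v)
	}
}
```

Note that a format like this cannot carry a header value containing a comma, which is worth keeping in mind when a vendor issues keys with unusual characters.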

Error Tracking

Variable Type Default Description
SENTRY_DSN string "" Sentry DSN for error notification

Graceful Shutdown

Variable Type Default Description
DISABLE_SIGNAL_HANDLER bool false Disable ColdBrew’s SIGINT/SIGTERM handler
SHUTDOWN_DURATION_IN_SECONDS int 15 Time to wait for in-flight requests to complete before forced shutdown
GRPC_GRACEFUL_DURATION_IN_SECONDS int 7 Time to wait for healthcheck failure to propagate before initiating shutdown. Should be less than SHUTDOWN_DURATION_IN_SECONDS
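
The relationship between the two durations can be checked with a little arithmetic. The sketch below assumes, without confirmation from this page, that the healthcheck propagation wait is counted inside the overall shutdown window, so the time left for in-flight requests is the difference between the two. `shutdownPlan` and `planShutdown` are hypothetical names, and treating a violated "should be less than" as an error is a choice made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// shutdownPlan summarizes the timings derived from the two settings.
type shutdownPlan struct {
	propagationWait time.Duration // GRPC_GRACEFUL_DURATION_IN_SECONDS
	shutdownLimit   time.Duration // SHUTDOWN_DURATION_IN_SECONDS
	drainBudget     time.Duration // assumed: shutdownLimit - propagationWait
}

// planShutdown validates the settings and computes the drain budget.
func planShutdown(gracefulSeconds, shutdownSeconds int) (shutdownPlan, error) {
	if gracefulSeconds < 0 || shutdownSeconds < 0 {
		return shutdownPlan{}, errors.New("durations must not be negative")
	}
	if gracefulSeconds >= shutdownSeconds {
		return shutdownPlan{}, fmt.Errorf(
			"GRPC_GRACEFUL_DURATION_IN_SECONDS (%d) should be less than SHUTDOWN_DURATION_IN_SECONDS (%d)",
			gracefulSeconds, shutdownSeconds)
	}
	wait := time.Duration(gracefulSeconds) * time.Second
	limit := time.Duration(shutdownSeconds) * time.Second
	return shutdownPlan{propagationWait: wait, shutdownLimit: limit, drainBudget: limit - wait}, nil
}

func main() {
	plan, err := planShutdown(7, 15) // the documented defaults
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("wait %v for healthcheck failure to propagate, then up to %v to drain\n",
		plan.propagationWait, plan.drainBudget)
}
```

On Kubernetes, it is also worth comparing SHUTDOWN_DURATION_IN_SECONDS with the pod's termination grace period, since the kubelet kills the container when that period expires regardless of these settings.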

Response Time Logging

Variable Type Default Description
RESPONSE_TIME_LOG_LEVEL string info Log level for per-request response time logging. Valid: debug, info, warn, error. Invalid values fall back to info. Must be >= LOG_LEVEL to take effect
RESPONSE_TIME_LOG_ERROR_ONLY bool false When true, only log response time for requests that return an error. Successful requests are not logged. Note: if LOG_LEVEL is set higher than RESPONSE_TIME_LOG_LEVEL, response time logs are already suppressed
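
The two settings above combine with LOG_LEVEL into a single yes/no decision per request. The following sketch encodes the rules as stated in the table; `rank` and `logsResponseTime` are hypothetical helpers. Falling back to info for an invalid LOG_LEVEL is an assumption (the table only documents that fallback for RESPONSE_TIME_LOG_LEVEL).

```go
package main

import (
	"fmt"
	"strings"
)

// severity orders the documented levels from least to most severe.
var severity = map[string]int{"debug": 0, "info": 1, "warn": 2, "error": 3}

// rank returns the numeric severity of a level name, treating unknown
// values as info.
func rank(level string) int {
	if r, ok := severity[strings.ToLower(strings.TrimSpace(level))]; ok {
		return r
	}
	return severity["info"]
}

// logsResponseTime reports whether a response time line would be emitted
// for one request.
func logsResponseTime(logLevel, responseTimeLevel string, errorOnly, failed bool) bool {
	if errorOnly && !failed {
		return false // RESPONSE_TIME_LOG_ERROR_ONLY skips successful requests
	}
	// The line is written at responseTimeLevel and survives only if that
	// level is at least as severe as LOG_LEVEL.
	return rank(responseTimeLevel) >= rank(logLevel)
}

func main() {
	fmt.Println(logsResponseTime("warn", "info", false, false)) // suppressed by LOG_LEVEL
	fmt.Println(logsResponseTime("warn", "warn", true, true))   // failed request is logged
	fmt.Println(logsResponseTime("warn", "warn", true, false))  // success skipped
}
```

This is why the high-throughput example later on this page raises RESPONSE_TIME_LOG_LEVEL together with LOG_LEVEL: raising LOG_LEVEL alone would silence response time lines for failed requests as well.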

Runtime

Variable Type Default Description
DISABLE_AUTO_MAX_PROCS bool false Disable automatic GOMAXPROCS tuning (useful if your container runtime already sets it)

Deprecated

Variable | Replacement | Notes
HTTP_HEADER_PREFIX | HTTP_HEADER_PREFIXES | Single prefix replaced by comma-separated list
DISABLE_PORMETHEUS | DISABLE_PROMETHEUS | Typo variant; both work, use the correct spelling
OTLP_USE_OPENTRACING_BRIDGE | Remove | OpenTracing bridge has been removed; this field is now ignored (logs a warning if set to true)

Example: Minimal Production Configuration

export APP_NAME=myservice
export ENVIRONMENT=production
export LOG_LEVEL=info
export NEW_RELIC_LICENSE_KEY=your-key
export NEW_RELIC_APPNAME=myservice
export SENTRY_DSN=https://your-dsn@sentry.io/123

Example: Local Development with Jaeger (via OTLP)

export APP_NAME=myservice
export ENVIRONMENT=development
export LOG_LEVEL=debug
export OTLP_ENDPOINT=localhost:4317
export OTLP_INSECURE=true
export OTLP_SAMPLING_RATIO=1.0
export DISABLE_NEW_RELIC=true

Example: High-Throughput Production

For services at 70k+ QPS where observability overhead matters:

export APP_NAME=myservice
export ENVIRONMENT=production
export LOG_LEVEL=warn
export RESPONSE_TIME_LOG_LEVEL=warn            # must be >= LOG_LEVEL to take effect
export RESPONSE_TIME_LOG_ERROR_ONLY=true       # skip per-request logging for successful RPCs
# export OTLP_ENDPOINT=your-collector:4317     # uncomment if using OTLP tracing
export OTLP_SAMPLING_RATIO=0.05                # only applies when OTLP_ENDPOINT is set
export ENABLE_PROMETHEUS_GRPC_HISTOGRAM=false   # see warning below
export DISABLE_NEW_RELIC=true
export DISABLE_UNIX_GATEWAY=false              # Unix socket for HTTP gateway (~1.9x faster than TCP loopback)
export HTTP_COMPRESSION_MIN_SIZE=512

Measured tuning impact

End-to-end throughput on Apple M1 Pro (loopback, ghz load test, simple Echo handler):

Configuration | RPS @ c=200 | P99 @ c=200 | Change
Default (all interceptors, info logging) | 50,000 | 7.9ms | baseline
Tuned (above config) | 53,200 | 7.3ms | +6% RPS
No interceptors (bare gRPC) | 55,800 | 7.2ms | +12% RPS

Most of the interceptor chain overhead comes from per-request log writes. Setting RESPONSE_TIME_LOG_ERROR_ONLY=true closes most of the gap. See the Architecture page for the full breakdown.

Warning: Setting ENABLE_PROMETHEUS_GRPC_HISTOGRAM=false removes the grpc_server_handling_seconds metric entirely. Latency percentile queries and alerts (e.g., histogram_quantile(0.99, ...)) will stop working. You will still have grpc_server_handled_total (request count by status code) and grpc_server_started_total (count of requests started). Only disable histograms if you have an alternative latency signal (e.g., distributed tracing percentiles, or an external load balancer metric).


Source: core/config/config.go