Production Checklist¶
Use this checklist before going live. Items are grouped by priority.
Required¶
- [ ] External PostgreSQL provisioned and accessible from the cluster
- [ ] Database migrations applied (enabled by default via init container)
- [ ] Database credentials stored in a Kubernetes secret (
volundr-db) - [ ] Anthropic API key stored in a Kubernetes secret (
volundr-anthropic-api) - [ ] Ingress configured with TLS termination
- [ ] Identity adapter set to
EnvoyHeaderIdentityAdapter(notAllowAll) - [ ] Authorization adapter set to
SimpleRoleAuthorizationAdapterorCerbosAuthorizationAdapter - [ ] Storage class supports ReadWriteMany for sessions and home PVCs
Security¶
- [ ] OIDC/OAuth2 configured via Envoy sidecar with JWT validation
- [ ] Gateway API security policies enforce JWT validation on session routes
- [ ] Credential store uses Vault or Infisical (not in-memory)
- [ ] Secret injection uses CSI driver (not in-memory)
- [ ] Git tokens stored in Kubernetes secrets, referenced by
existingSecret - [ ] Pod security context enforces
runAsNonRoot, drops all capabilities - [ ] Container runs with read-only root filesystem
- [ ] CORS origins restricted to your domain (not
*)
Recommended¶
- [ ] HPA enabled with appropriate min/max replicas
- [ ] PDB configured to maintain availability during rollouts
- [ ] NetworkPolicy restricting pod-to-pod traffic
- [ ] Resource requests and limits set for API and session pods
- [ ] Kyverno PVC isolation policy applied (prevents cross-session access)
- [ ] Log format set to
jsonfor structured logging - [ ] Database pool size tuned for replica count (
maxPoolSize * replicaCount <= max_connections)
Monitoring¶
Volundr exposes the following endpoints:
| Endpoint | Purpose |
|---|---|
/health |
Liveness and readiness probes |
/api/v1/volundr/sessions/stream |
SSE stream for real-time session state |
/api/v1/volundr/events/health |
Event pipeline and sink status |
OpenTelemetry traces and metrics are available when an OTel collector is configured.
Scaling¶
- API is stateless. Scale horizontally via HPA.
- SSE connections are per-instance (in-memory broadcaster). Use sticky sessions or a single replica for SSE. For multi-replica SSE, add an external pub/sub layer.
- Skuld brokers run one-per-session. They scale with session count.
- Database pool size should account for all replicas:
maxPoolSize * replicaCountmust not exceed PostgreSQLmax_connections.