03 — Production Infrastructure and DB Stack Model
Context
This document records the production infrastructure target that is now implemented by the current setup runbooks. The execution source is no longer the old base-plus-prod overlay model.
Current references:
- Setup source: ../../setup/08-prod-db-cluster-setup.md and ../../setup/09-prod-runner-ha-and-swarm.md
- Main infra and DB stack: root docker-stack-infra_db-prod.yml
- Vault stack: root docker-stack-vault.yml
- Vault bootstrap: root init/vault/vault-bootstrap.sh, called through init-infra-prod.sh
Current Stack Strategy
Production uses a split stack model:
- docker-stack-infra_db-prod.yml: APISIX, APISIX Dashboard, SWAG, cert services, Redis/Sentinel, RabbitMQ, Prometheus, Grafana, Patroni/PostgreSQL, MongoDB, and etcd.
- docker-stack-vault.yml: Vault Raft cluster only.
The previous docker-stack-infra.yml + docker-stack-infra.prod.yml overlay strategy is superseded for production. Do not create or deploy docker-stack-infra.prod.yml for the current prod environment.
Placement Boundary
docker-stack-infra_db-prod.yml is intentionally a mixed stack. The placement model is the important boundary:
- DB/cluster services run on iklim-db-*: Patroni/PostgreSQL, MongoDB, and etcd.
- App/service-node infrastructure runs on iklim-app-* with node.labels.type == service: Redis, Redis Sentinel, RabbitMQ, APISIX, APISIX Dashboard, SWAG, cert-reloader/cert-distributor, Prometheus, and Grafana.
- Redis and RabbitMQ are not DB-node host-mode services. They stay on the overlay network unless explicitly exposed by the stack or SWAG/APISIX.
DB services that require direct cluster traffic publish host-mode ports where the current stack defines them. Redis and RabbitMQ must not be changed to host-mode just because they live in the same stack file.
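The placement boundary above could look roughly like the following stack fragment. This is a minimal sketch, not an excerpt from docker-stack-infra_db-prod.yml: the service names, images, the infra network name, the iklim-db-01 hostname, and the PostgreSQL port are all assumptions chosen for illustration. Only the node.labels.type == service constraint and the iklim-db-* naming pattern come from this document.

```yaml
# Illustrative fragment only; names, images, and ports are assumptions.
services:
  patroni-01:                        # hypothetical DB-node service
    image: example/patroni:latest    # placeholder image
    deploy:
      placement:
        constraints:
          - node.hostname == iklim-db-01   # assumed hostname matching iklim-db-*
    ports:
      - target: 5432
        published: 5432
        protocol: tcp
        mode: host                   # host-mode publishing for direct cluster traffic
    networks:
      - infra

  redis:                             # hypothetical service-node service
    image: redis:7                   # placeholder tag
    deploy:
      placement:
        constraints:
          - node.labels.type == service
    # No ports section: the service is reachable only on the overlay network.
    networks:
      - infra

networks:
  infra:                             # hypothetical overlay network name
    driver: overlay
```

The point of the sketch is that the two services share one stack file but differ in both placement constraint and publishing mode, which is why moving Redis or RabbitMQ to host-mode would cross the boundary described here.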
Current Production Services
| Area | Current model |
|---|---|
| APISIX | 3 replicas on service nodes; config stored in etcd with /apisix prefix |
| Redis | Sentinel model on service nodes; overlay-only |
| RabbitMQ | 3-node service-node cluster; management exposed through SWAG, restricted by IP |
| Vault | Separate 3-node Raft stack via docker-stack-vault.yml |
| PostgreSQL | 3-node Patroni cluster on DB nodes |
| MongoDB | 3-node replica set on DB nodes |
| etcd | 3-node cluster on DB nodes, shared by Patroni and APISIX |
| Prometheus | Single instance; local Docker volume |
| Grafana | Single instance; StorageBox-backed data path |
Monitoring Persistence
Prometheus TSDB remains on a local Docker volume because StorageBox/DAVFS is not suitable for Prometheus WAL and compaction I/O.
Grafana uses /mnt/storagebox/grafana/data through GRAFANA_DATA_DIR so dashboards, plugins, and the SQLite database survive manual service movement between service nodes.
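A minimal sketch of how the two persistence choices might be expressed in a stack file follows. The service names, images, and the prometheus-data volume name are assumptions. The container paths /prometheus and /var/lib/grafana are the default data directories of the upstream images and may differ if the real stack overrides them. The way GRAFANA_DATA_DIR is interpolated here is also assumed rather than taken from the actual stack.

```yaml
# Illustrative fragment only; not copied from the production stack.
services:
  prometheus:
    image: prom/prometheus           # placeholder image reference
    volumes:
      # Named local volume: TSDB, WAL, and compaction stay on local disk.
      - prometheus-data:/prometheus

  grafana:
    image: grafana/grafana           # placeholder image reference
    volumes:
      # Bind mount on the StorageBox path, so data follows the service
      # when it is moved to another service node.
      - ${GRAFANA_DATA_DIR:-/mnt/storagebox/grafana/data}:/var/lib/grafana

volumes:
  prometheus-data: {}                # hypothetical volume name, default local driver
```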
APISIX and etcd
APISIX uses the DB-node etcd cluster through overlay DNS aliases such as etcd-01, etcd-02, and etcd-03. Patroni and APISIX use different etcd prefixes, so their data does not collide.
The app subnet to DB subnet firewall rule for etcd client traffic is part of the current production firewall model. See ../../setup/06-prod-terraform-iac.md.
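The prefix separation can be illustrated with two configuration fragments, shown here as separate YAML documents. This is a sketch under assumptions: it uses the APISIX 3.x config.yaml layout and the Patroni etcd3 DCS section, plain HTTP on the default etcd client port 2379, and a hypothetical Patroni scope name. Only the etcd-01, etcd-02, and etcd-03 aliases and the /apisix prefix come from this document.

```yaml
# APISIX config.yaml (assumed 3.x layout)
deployment:
  role: traditional
  role_traditional:
    config_provider: etcd
  etcd:
    host:
      - "http://etcd-01:2379"        # scheme and port are assumptions
      - "http://etcd-02:2379"
      - "http://etcd-03:2379"
    prefix: "/apisix"                # APISIX keys live under this prefix
---
# Patroni configuration (assumed etcd3 DCS)
scope: example-pg-cluster            # hypothetical cluster name
namespace: /service/                 # Patroni's default namespace, assumed unchanged
etcd3:
  hosts:
    - etcd-01:2379
    - etcd-02:2379
    - etcd-03:2379
```

With values like these, APISIX writes under /apisix while Patroni writes under /service/example-pg-cluster, so the two key spaces do not overlap even though they share one etcd cluster.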
Redis and RabbitMQ
Redis/Sentinel and RabbitMQ are service-node infrastructure. Their placement follows node.labels.type == service.
RabbitMQ-related private firewall rules belong to the app/service-node firewall model. Redis and Sentinel do not publish host-mode ports in the current prod stack and do not require Hetzner firewall openings.
Historical / Superseded by Setup
The following earlier roadmap ideas are retained only as historical context:
- Creating docker-stack-infra.prod.yml as a prod overlay.
- Deploying prod with docker stack deploy -c docker-stack-infra.yml -c docker-stack-infra.prod.yml iklimco.
- Keeping Vault inside the prod infra overlay with /opt/iklimco/vault/data host-path storage.
- Treating PostgreSQL/MongoDB as separate DB stacks such as docker-stack-db.prod.yml.
- Validating a prod merge with docker stack config -c docker-stack-infra.yml -c docker-stack-infra.prod.yml.
For current execution, use the setup runbooks and root stack files listed in the Context section.