# 03 — docker-stack-infra.yml Changes (Prod)

## Context

- **File:** `docker-stack-infra.yml` (repo root; shared between test and prod)
- All changes from `test-env-setup/03-infra-stack-changes.md` apply here identically.
- **Additional prod-specific changes:**
  - PostgreSQL and MongoDB placement constraints point to `type=db` nodes.
  - Microservices have no constraint (distributed across service nodes by Swarm).
  - Replica counts for stateless services are increased.

## Step 1 — Apply all test-env changes first

Follow every step in `test-env-setup/03-infra-stack-changes.md`:

- Add `swag` service
- Add `cert-reloader` service
- Remove published ports for vault, apisix, rabbitmq, prometheus, grafana, apisix-dashboard
- Add `swag-vl` volume

## Step 2 — Update PostgreSQL placement constraint

Change the `postgres` service placement to use the `type=db` node label:

```yaml
# CHANGE in postgres service (deploy block):
placement:
  constraints:
    - node.labels.type == db
```

## Step 3 — Update MongoDB placement constraint

Change the `mongo` service placement in the same way:

```yaml
# CHANGE in mongo service (deploy block):
placement:
  constraints:
    - node.labels.type == db
```

## Step 4 — Pin Vault to manager node (initial prod — single instance)

Vault starts as a single instance pinned to the manager node. Raft cluster migration is handled separately in `07-vault-raft-plan.md`.

```yaml
# Vault placement stays as:
placement:
  constraints:
    - node.role == manager
```

## Step 5 — Increase APISIX replicas for prod

```yaml
# CHANGE in apisix service deploy block:
mode: replicated
replicas: 2   # was 1
```

APISIX is stateless (its config lives in etcd), so running multiple replicas is safe. Swarm load-balances requests from SWAG across the APISIX replicas through the service's virtual IP (VIP).

## Step 6 — etcd: 3-node cluster for prod

For prod, etcd should run as a 3-node cluster: three members is the smallest cluster size that keeps Raft quorum when one member fails. The current single-instance etcd definition needs to be replaced with a 3-node StatefulSet-style setup, using either separate service definitions or a dedicated `docker-stack-etcd.yml`.

> **Scope note:** etcd clustering for prod is complex and out of scope for initial launch.
> Deploy with single etcd for initial prod launch. Add etcd clustering as a follow-up task.
> Track in: `Technical Debt/TODO.md`

## Step 7 — Verify the complete file

After all edits, validate the YAML:

```bash
docker stack config -c docker-stack-infra.yml > /dev/null && echo "✅ YAML valid"
```

If the command prints no errors and echoes the success message, the file is valid.

## Placement summary for prod

| Service | Placement |
|---------|-----------|
| swag | `node.role == manager` |
| cert-reloader | `node.role == manager` |
| vault | `node.role == manager` |
| apisix (2 replicas) | no constraint (any node) |
| apisix-dashboard | no constraint |
| postgres | `node.labels.type == db` |
| mongo | `node.labels.type == db` |
| redis | `node.role == manager` |
| rabbitmq | `node.role == manager` |
| etcd | `node.role == manager` |
| prometheus | `node.role == manager` |
| grafana | `node.role == manager` |
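
The `node.labels.type == db` constraints only schedule if at least one Swarm node actually carries the `type=db` label; otherwise the `postgres` and `mongo` tasks stay pending. The following is a minimal pre-deploy check, offered as a sketch rather than part of the existing tooling. The function names `count_labeled_nodes` and `require_db_nodes` are hypothetical, and it assumes the commands run on a manager node.

```bash
#!/usr/bin/env bash
# Sketch of a pre-deploy label check (helper names are illustrative only).
# A missing label can be added on a manager with:
#   docker node update --label-add type=db <node-name>

# Print how many nodes carry the given node label (key=value).
# Counts non-empty hostname lines; also prints 0 if the docker call fails.
count_labeled_nodes() {
  local label="$1"
  docker node ls --filter "node.label=${label}" --format '{{.Hostname}}' \
    | awk 'NF { n++ } END { print n + 0 }'
}

# Succeed only if at least one node is labeled type=db.
require_db_nodes() {
  local n
  n="$(count_labeled_nodes "type=db")"
  if [ "$n" -lt 1 ]; then
    echo "no node labeled type=db: postgres and mongo tasks would stay pending" >&2
    return 1
  fi
  echo "found ${n} node(s) labeled type=db"
}
```

One way to use it is to source the file and run `require_db_nodes` before `docker stack deploy`, so a missing label is caught before the stack is rolled out.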