# 03 — docker-stack-infra.yml Changes (Prod)

## Context

- **File:** `docker-stack-infra.yml` (repo root; shared between test and prod)
- All changes from `test-env-setup/03-infra-stack-changes.md` apply here identically.
- **Additional prod-specific changes:**
  - PostgreSQL and MongoDB placement constraints point to `type=db` nodes.
  - Microservices have no constraint (distributed across service nodes by Swarm).
  - Replica counts for stateless services are increased.

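The `type=db` constraints only match nodes that already carry that label. A minimal sketch of how the label could be applied and checked from a manager node follows; the helper names and the node names `prod-db-1` / `prod-db-2` are placeholders, not values taken from this setup.

```bash
# Hypothetical helper: label each given node so that a
# `node.labels.type == db` constraint can match it.
label_db_nodes() {
  for node in "$@"; do
    docker node update --label-add type=db "$node"
  done
}

# Hypothetical helper: list the hostnames of nodes that carry the label.
list_db_nodes() {
  docker node ls --filter node.label=type=db --format '{{.Hostname}}'
}

# Example (node names are placeholders):
#   label_db_nodes prod-db-1 prod-db-2
#   list_db_nodes
```

If `list_db_nodes` prints nothing, the `postgres` and `mongo` tasks would have no eligible node and would likely stay pending after deploy.
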
## Step 1 — Apply all test-env changes first

Follow every step in `test-env-setup/03-infra-stack-changes.md`:

- Add `swag` service
- Add `cert-reloader` service
- Remove published ports for vault, apisix, rabbitmq, prometheus, grafana, apisix-dashboard
- Add `swag-vl` volume

## Step 2 — Update PostgreSQL placement constraint

Change `postgres` service placement to use the `type=db` label:

```yaml
# CHANGE in postgres service:
placement:
  constraints:
    - node.labels.type == db
```

## Step 3 — Update MongoDB placement constraint

Change `mongo` service placement to use the same `type=db` label:

```yaml
# CHANGE in mongo service:
placement:
  constraints:
    - node.labels.type == db
```

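After the stack is redeployed, it can help to confirm that the database tasks actually landed on the labeled nodes. The sketch below assumes the stack is deployed under the name `infra` (an assumption; substitute the real stack name), and the helper name is illustrative.

```bash
# Hypothetical helper: print "<task name> <node> <current state>"
# for every task of the given Swarm service.
show_task_nodes() {
  docker service ps --format '{{.Name}} {{.Node}} {{.CurrentState}}' "$1"
}

# Example, assuming the stack name is "infra":
#   show_task_nodes infra_postgres
#   show_task_nodes infra_mongo
```

A task that stays in a pending state with no node assigned usually means that no node satisfies the placement constraint.
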
## Step 4 — Pin Vault to manager node (initial prod — single instance)

Vault starts as a single instance pinned to the manager node.
Raft cluster migration is handled separately in `07-vault-raft-plan.md`.

```yaml
# Vault placement stays as:
placement:
  constraints:
    - node.role == manager
```

## Step 5 — Increase APISIX replicas for prod

```yaml
# CHANGE in apisix service deploy block:
mode: replicated
replicas: 2  # was 1
```

APISIX is stateless (its configuration lives in etcd), so running multiple replicas is safe.
Swarm load-balances SWAG's requests across the APISIX replicas through the service's virtual IP (VIP).

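To check that both replicas are actually running after the deploy, the running/desired count reported by `docker service ls` can be compared. This is a sketch only: the helper name is illustrative and the service name `infra_apisix` assumes a stack called `infra`. The name is matched exactly so that `apisix-dashboard` is not picked up by mistake.

```bash
# Hypothetical helper: succeeds only when the service exists and its
# running replica count equals the desired count (e.g. "2/2").
replicas_ready() {
  service="$1"
  replicas="$(docker service ls --format '{{.Name}} {{.Replicas}}' \
    | awk -v svc="$service" '$1 == svc { print $2 }')"
  # ${replicas%%/*} is the running count, ${replicas##*/} the desired count.
  [ -n "$replicas" ] && [ "${replicas%%/*}" = "${replicas##*/}" ]
}

# Example, assuming the stack name is "infra":
#   replicas_ready infra_apisix && echo "apisix fully scheduled"
```
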
## Step 6 — etcd: 3-node cluster for prod

For prod, etcd should run as a 3-node cluster (the smallest cluster that keeps Raft quorum after losing one node).
The current single-instance etcd definition needs to be replaced with a 3-node
StatefulSet-style setup using separate service definitions or a dedicated
`docker-stack-etcd.yml`.

> **Scope note:** etcd clustering for prod is complex and out of scope for the initial launch.
> Deploy with a single etcd instance for the initial prod launch and add etcd clustering as a follow-up task.
> Track in: `Technical Debt/TODO.md`

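The preference for three members follows from the Raft majority rule: a cluster of `n` members needs `floor(n/2) + 1` of them to agree, so it tolerates `n - quorum` failures. The worked arithmetic below is a small sketch; the function names are illustrative and not part of the stack.

```bash
# Majority needed for a Raft cluster of N members.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority remains.
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 5; do
  echo "members=$n quorum=$(quorum "$n") tolerates=$(fault_tolerance "$n")"
done
```

One and two members both tolerate zero failures, which is why a single etcd instance is no worse than a pair for the initial launch, and why three is the first size that survives a node loss.
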
## Step 7 — Verify the complete file

After all edits, validate the YAML:

```bash
docker stack config -c docker-stack-infra.yml > /dev/null && echo "✅ YAML valid"
```

If the command prints no errors and the success message appears, the file is valid.

## Placement summary for prod

| Service | Placement |
|---------|-----------|
| swag | `node.role == manager` |
| cert-reloader | `node.role == manager` |
| vault | `node.role == manager` |
| apisix (2 replicas) | no constraint (any node) |
| apisix-dashboard | no constraint |
| postgres | `node.labels.type == db` |
| mongo | `node.labels.type == db` |
| redis | `node.role == manager` |
| rabbitmq | `node.role == manager` |
| etcd | `node.role == manager` |
| prometheus | `node.role == manager` |
| grafana | `node.role == manager` |