# 01 — Docker Swarm Init (Prod — Multi-Node)

## Context

- **Repo:** `iklim.co` root
- **Environment:** prod
- **Topology:**
  - 3 × app nodes (`iklim-app-01/02/03`) — all act as **Swarm managers AND app workers** (Raft quorum: 1 can fail)
  - 3 × DB nodes (`iklim-db-01/02/03`) — join Swarm as **workers** with `role=db` label; DB services are placed exclusively on them
- **Sizing:** app nodes are `cpx42`, DB nodes are `cpx32`; see `../../hetzner-sizing-report.md`
- All 6 nodes are in the same private network.
- Pipeline trigger: push to `prod-env` branch → Gitea runner on `prod-runner` (first app node).
- App Swarm managers: all 3 nodes are manager-eligible and carry app workloads (no dedicated worker-only app nodes).

## Node labeling plan

| Node | Role | Swarm role | Labels |
|------|------|------------|--------|
| `iklim-app-01` | API services, SWAG, Vault | Manager + Worker | `type=service` |
| `iklim-app-02` | API service replicas | Manager + Worker | `type=service` |
| `iklim-app-03` | API service replicas | Manager + Worker | `type=service` |
| `iklim-db-01` | PostgreSQL (Patroni), etcd | Worker | `role=db` |
| `iklim-db-02` | PostgreSQL (Patroni), etcd | Worker | `role=db` |
| `iklim-db-03` | MongoDB replica + PostgreSQL (Patroni), etcd | Worker | `role=db` |

## Step 1 — Init Swarm on iklim-app-01 (the prod-runner node)

```bash
MANAGER_IP=$(hostname -I | awk '{print $1}')
if ! docker info --format '{{.Swarm.LocalNodeState}}' | grep -q "active"; then
  docker swarm init --advertise-addr "$MANAGER_IP"
  echo "Swarm initialized on $MANAGER_IP"
else
  echo "Swarm already active"
fi
```

## Step 2 — Get manager join token

```bash
docker swarm join-token manager   # for iklim-app-02, iklim-app-03
```

Save this token — it is needed on iklim-app-02 and iklim-app-03.
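For automation, the token can be captured non-interactively with the `-q` flag of `docker swarm join-token` and sanity-checked before it is distributed — Swarm join tokens always begin with `SWMTKN-1-`. A minimal sketch (the token value below is a placeholder for illustration; on iklim-app-01 it would come from the commented-out `docker` call):

```bash
# Sketch: fetch just the token string (-q suppresses the full join command)
# and verify it has the expected SWMTKN-1- prefix before shipping it to
# the other nodes.
# MANAGER_TOKEN=$(docker swarm join-token -q manager)   # run on iklim-app-01
MANAGER_TOKEN="SWMTKN-1-example-token"                  # placeholder for illustration
case "$MANAGER_TOKEN" in
  SWMTKN-1-*) echo "token looks valid" ;;
  *)          echo "unexpected token format" >&2; exit 1 ;;
esac
```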
## Step 3 — Join iklim-app-02 and iklim-app-03 as managers

SSH into iklim-app-02 and iklim-app-03 and run (substitute the manager token from Step 2 for the placeholder):

```bash
docker swarm join --token <MANAGER_TOKEN> 10.20.10.11:2377
```

## Step 4 — Label app nodes

On iklim-app-01, after iklim-app-02 and iklim-app-03 have joined:

```bash
for node in iklim-app-01 iklim-app-02 iklim-app-03; do
  docker node update --label-add type=service "$node"
done
```

## Step 5 — Join DB nodes as Swarm workers

Get the worker join token on iklim-app-01:

```bash
docker swarm join-token worker
```

SSH into each DB node and join (substitute the worker token for the placeholder):

```bash
docker swarm join --token <WORKER_TOKEN> 10.20.10.11:2377
```

Then label them on iklim-app-01:

```bash
for node in iklim-db-01 iklim-db-02 iklim-db-03; do
  docker node update --label-add role=db "$node"
done
```

> DB nodes are Swarm **workers** only — they never become managers.
> DB services are pinned to them via the `node.labels.role == db` placement constraint.
> See `08-prod-db-cluster-kurulum.md` for DB stack deployment.

## Step 6 — Verify

```bash
docker node ls
```

Expected: 6 nodes, all `Ready` — the 3 managers additionally show `MANAGER STATUS` of `Leader` or `Reachable`.

```bash
docker node inspect iklim-app-01 --format '{{.Spec.Labels}}'
docker node inspect iklim-db-01 --format '{{.Spec.Labels}}'
```

Expected: `map[type:service]` for app nodes, `map[role:db]` for DB nodes.

## Step 7 — Confirm `init/swarm-init.sh` multi-node awareness

The script is idempotent (it skips init if Swarm is already active). Verify:

```bash
grep -n "swarm init\|swarm join" init/swarm-init.sh
```

The prod pipeline runs on iklim-app-01 only. iklim-app-02/03 are joined via Ansible (`swarm` role), not via the Gitea pipeline.
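The Step 6 label checks can be turned into a small assertion rather than an eyeball comparison. A sketch, with the matching logic factored into `has_label` (a hypothetical helper, not from the repo) so it works on the `map[...]` strings that `docker node inspect --format '{{.Spec.Labels}}'` prints and can be exercised without a live daemon:

```bash
# Sketch: check that a node's inspect output contains the expected label.
# has_label <inspect-output> <key:value> returns 0 on match.
has_label() {
  case "$1" in
    *"$2"*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Example with the output shapes Step 6 expects:
has_label "map[type:service]" "type:service" && echo "app label ok"
has_label "map[role:db]" "role:db" && echo "db label ok"
```

On iklim-app-01 the first argument would instead be command substitution over the real `docker node inspect` call for each node in the labeling plan.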
## Placement constraints used in `docker-stack-infra.yml`

| Constraint | Resolves to |
|------------|-------------|
| `node.role == manager` | iklim-app-01, iklim-app-02, iklim-app-03 |
| `node.labels.type == service` | iklim-app-01, iklim-app-02, iklim-app-03 |
| `node.labels.role == db` | iklim-db-01, iklim-db-02, iklim-db-03 |

- SWAG, Vault, cert-reloader: pinned to `node.role == manager`.
- Microservices: no constraint (distributed across all app nodes by the Swarm scheduler).
- DB services (Patroni, etcd, MongoDB): pinned to `node.labels.role == db` in separate DB stacks.
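In stack-file terms, the constraints above go under each service's `deploy.placement` key. A hypothetical fragment (service names are illustrative, not taken from `docker-stack-infra.yml`):

```yaml
# Sketch of how the constraints from the table are expressed in a stack file.
services:
  swag:                                # example manager-pinned service
    deploy:
      placement:
        constraints:
          - node.role == manager
  patroni:                             # example DB service (separate DB stack)
    deploy:
      placement:
        constraints:
          - node.labels.role == db
```

A service with no `constraints` list, like the microservices, is free to be scheduled on any `Ready` node, which in this topology means any of the three app nodes once the DB services are fenced off by their own constraint.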