Murat ÖZDEMİR 76f87aa2f9 Integrate DB nodes into Swarm and refine prod service deployment
- Database nodes now join the Docker Swarm as workers with `role=db` labels, allowing Swarm to manage their dedicated services.
- The `docker-stack-infra.yml` has been updated for production to focus solely on application-level infrastructure components.
- Dedicated database services (PostgreSQL, MongoDB, Patroni-etcd) are now explicitly deployed in separate Swarm stacks on `iklim-db-XX` nodes.
- Standardizes node naming conventions (`iklim-app-XX`, `iklim-db-XX`) across the production roadmap documentation.
- Clarifies that the `etcd` service within `docker-stack-infra.yml` is exclusively for APISIX configuration, distinct from Patroni's etcd cluster.
2026-05-11 14:53:21 +03:00

07 — Vault: Initial Single Instance + Raft Cluster Migration Plan (Prod)

Context

Vault starts as a single instance on the manager node (iklim-app-01) for the initial prod launch. This matches the current docker-stack-infra.yml configuration (file storage, single replica).

A Raft HA cluster is planned for a later phase.

Phase 1 — Initial prod launch (current)

  • Replicas: 1
  • Storage: file (/vault/file) on iklim-app-01
  • Placement: node.role == manager (iklim-app-01)
  • Cert: from /opt/iklimco/ssl/ (populated by cert-reloader from SWAG volume)
  • TLS: VAULT_LOCAL_CONFIG unchanged — api_addr: https://vault.iklim.co:8200

No changes to docker-stack-infra.yml vault service for Phase 1.
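
To confirm that the running instance still matches Phase 1, the storage backend can be read from the vault status output. The following is a minimal sketch, assuming the key/value table printed by vault status contains a "Storage Type" row; the helper name storage_type is hypothetical and not part of the stack.

```shell
# storage_type: read `vault status` table output on stdin and print the
# value of the "Storage Type" row (expected to be "file" in Phase 1).
storage_type() {
  awk '$1 == "Storage" && $2 == "Type" { print $3 }'
}

# Usage (commented out; needs a running Vault container on iklim-app-01):
# docker exec "$(docker ps -q -f name=iklimco_vault)" vault status | storage_type
```

After the Phase 2 migration the same check would be expected to print raft instead of file.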

Phase 2 — Vault Raft Cluster (future)

What changes

  • Replicas: 3 (one per service node)
  • Storage: Raft integrated (replaces file storage)
  • Placement: node.labels.type == service (all 3 service nodes)
  • Cert distribution: cert-reloader SSH-copies renewed cert to iklim-app-02, iklim-app-03

Prerequisites before migration

  • All 3 service nodes are running and labeled type=service
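  • Vault data backed up from Phase 1. Phase 1 uses file storage, so vault operator raft snapshot save is not available yet; archive the /vault/file directory while Vault is stopped instead. Existing data is carried over to Raft with vault operator migrate, which also runs while Vault is stopped.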
  • SSH key created for cert-reloader to reach iklim-app-02 and iklim-app-03
  • SSH key stored as Docker secret cert_reloader_ssh_key
  • /opt/iklimco/ssl/ directory exists on iklim-app-02 and iklim-app-03
  • Vault data directory /opt/iklimco/vault/data/ exists on all 3 nodes (host path volumes)
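
The node-side prerequisites above can be scripted. The following is a rough sketch, not the project's actual tooling: it assumes the commands run on a Swarm manager with SSH access to the service nodes, and the helper names run, prepare_service_nodes and create_reloader_key as well as the local key path are hypothetical. By default it only prints the commands (dry run).

```shell
# run: print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# prepare_service_nodes: label each node type=service and create the
# host directories that the Vault service mounts.
prepare_service_nodes() {
  for node in "$@"; do
    run docker node update --label-add type=service "$node"
    run ssh "$node" mkdir -p /opt/iklimco/ssl /opt/iklimco/vault/data
  done
}

# create_reloader_key: generate an SSH key pair without a passphrase and
# store the private key as the Docker secret cert_reloader_ssh_key.
create_reloader_key() {
  keyfile=${1:-./cert_reloader_ssh_key}   # hypothetical local path
  run ssh-keygen -t ed25519 -N "" -f "$keyfile"
  run docker secret create cert_reloader_ssh_key "$keyfile"
}
```

Calling prepare_service_nodes iklim-app-01 iklim-app-02 iklim-app-03 prints the planned commands; setting DRY_RUN=0 executes them. The public half of the key still has to be added to authorized_keys on iklim-app-02 and iklim-app-03.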

Vault service update for Raft

vault:
  # ... (image, secrets, healthcheck unchanged)
  environment:
    VAULT_LOCAL_CONFIG: >-
      {"api_addr":"https://vault.iklim.co:8200",
       "cluster_addr":"https://{{ .Node.Hostname }}:8201",
       "storage":{"raft":{"path":"/vault/file","node_id":"{{ .Node.Hostname }}"}},
       "listener":[{"tcp":{"address":"0.0.0.0:8200",
         "tls_cert_file":"/vault/certs/STAR.iklim.co.full.crt",
         "tls_key_file":"/vault/certs/STAR.iklim.co_key.txt"}}],
       "default_lease_ttl":"168h","max_lease_ttl":"720h","ui":true}
  volumes:
    - /opt/iklimco/vault/data:/vault/file    # host path per node
    - /opt/iklimco/ssl:/vault/certs:ro
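  deploy:
    mode: replicated
    replicas: 3
    placement:
      # replicas: 3 alone does not guarantee one task per node; this cap does
      # (requires compose file format 3.8 or later)
      max_replicas_per_node: 1
      constraints:
        - node.labels.type == service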

{{ .Node.Hostname }} is a Docker Swarm service template placeholder. Swarm expands it to the hostname of the node that runs the task, so each Vault instance gets a unique node_id and its own cluster_addr.
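
To see what each replica would receive after the placeholder is expanded, the sketch below renders the cluster_addr and storage fragment of VAULT_LOCAL_CONFIG for a given hostname. It only mimics the substitution locally with printf; Swarm performs the real expansion when it creates the task, and the function name render_raft_fragment is hypothetical.

```shell
# render_raft_fragment: print the per-node part of VAULT_LOCAL_CONFIG as it
# would look once {{ .Node.Hostname }} is replaced by the given hostname.
render_raft_fragment() {
  host=$1
  printf '{"cluster_addr":"https://%s:8201","storage":{"raft":{"path":"/vault/file","node_id":"%s"}}}\n' "$host" "$host"
}

# Show the fragment for each of the three service nodes.
for h in iklim-app-01 iklim-app-02 iklim-app-03; do
  render_raft_fragment "$h"
done
```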

Raft join procedure (after deploying 3-replica Vault)

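Only the first node is initialized and unsealed; it becomes the initial leader. The other nodes join it via vault operator raft join and, with the default Shamir seal, must then be unsealed as well: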

# On the primary Vault (iklim-app-01 container):
VAULT_CTR=$(docker ps -q -f name=iklimco_vault)

# Unseal if needed
docker exec -it "$VAULT_CTR" vault operator unseal

# Check Raft peers
docker exec "$VAULT_CTR" vault operator raft list-peers

On iklim-app-02 and iklim-app-03 containers:

docker exec -it <vault-on-iklim-app-02> vault operator raft join \
  https://vault.iklim.co:8200
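
Once both followers have joined, the peer list on the leader should show three servers. A minimal sketch for checking this from a script, assuming vault operator raft list-peers prints a table with two header lines followed by one row per peer; the helper name count_raft_peers is hypothetical.

```shell
# count_raft_peers: read `vault operator raft list-peers` table output on
# stdin and print the number of peer rows after the two header lines.
count_raft_peers() {
  awk 'NR > 2 && NF > 0 { n++ } END { print n + 0 }'
}

# Usage (commented out; needs an unsealed Vault and a valid token):
# docker exec "$VAULT_CTR" vault operator raft list-peers | count_raft_peers
```

A result other than 3 suggests that a join or unseal step is still missing on one of the nodes.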

cert-reloader update for Raft

Update the cert-reloader command in docker-stack-infra.yml to SSH-copy the cert to iklim-app-02 and iklim-app-03 after renewal:

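# After copying to local /opt/iklimco/ssl/:
# "cat >" writes the file streamed over the SSH session's stdin on the remote node
ssh -i /run/secrets/cert_reloader_ssh_key iklim-app-02 \
  "cat > /opt/iklimco/ssl/STAR.iklim.co.full.crt" < /opt/iklimco/ssl/STAR.iklim.co.full.crt
# (repeat for iklim-app-03, and for the key file STAR.iklim.co_key.txt on both nodes)
# Restart the Vault tasks so they load the renewed cert
docker service update --force iklimco_vault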

Add Docker secret to cert-reloader:

secrets:
  - cert_reloader_ssh_key
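
The "repeat for each node and file" step of the cert distribution can be expressed as a loop. This is a sketch under assumptions rather than the actual cert-reloader command: it assumes both files live in /opt/iklimco/ssl/ on every node and that the default SSH user is correct; the function name push_certs is hypothetical. By default it only prints what it would copy.

```shell
# push_certs: copy the cert and key file to each remote service node over SSH.
# With DRY_RUN=1 (the default here) it only prints the planned copies.
push_certs() {
  key=/run/secrets/cert_reloader_ssh_key
  ssl_dir=/opt/iklimco/ssl
  for host in iklim-app-02 iklim-app-03; do
    for f in STAR.iklim.co.full.crt STAR.iklim.co_key.txt; do
      if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "copy $ssl_dir/$f -> $host:$ssl_dir/$f"
      else
        # Stream the local file over stdin and write it on the remote side;
        # stop at the first failure so Vault is not restarted with a partial copy.
        ssh -i "$key" "$host" "cat > '$ssl_dir/$f'" < "$ssl_dir/$f" || return 1
      fi
    done
  done
}
```

With DRY_RUN=0, chaining it as push_certs && docker service update --force iklimco_vault restarts Vault only when every copy succeeded.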

Reference