Environment_Infrastructure/roadmap/prod-env/03-infra-stack-changes.md
Murat ÖZDEMİR 5ddba7eba4 docs: update production roadmap for HA Vault and shared storage
- Refactor production setup documentation to reflect a 3-node Vault Raft cluster starting from launch.
- Update all paths to use StorageBox mounts for shared state (SWAG config, TLS certs, Monitoring data).
- Switch Nginx configuration convention from proxy-confs to site-confs to align with SWAG's auto-include behavior.
- Standardize TLS private key extensions to .pem.
- Update node failover and recovery facts to include monitoring services.
- Align deployment pipeline instructions with the latest environment variable-driven approach.
2026-05-16 16:18:21 +03:00


03 — docker-stack-infra.yml Changes (Prod)

Context

File strategy — overlay approach

Prod-specific service changes are not written directly into docker-stack-infra.yml; they are kept in a separate overlay file:

| File | Usage |
| --- | --- |
| docker-stack-infra.yml | Base; works as-is for test |
| docker-stack-infra.prod.yml | Prod overlay; additional services and overrides |

# Test deploy:
docker stack deploy -c docker-stack-infra.yml iklimco

# Prod deploy (Swarm merges both files):
docker stack deploy -c docker-stack-infra.yml -c docker-stack-infra.prod.yml iklimco

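Docker Swarm merge rule: when the same service name appears in both files, the two definitions are merged field by field. Values from the later file (the overlay) win for single-value options such as image or deploy.replicas, and mappings such as environment are merged key by key. Services that exist only in the overlay are added.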
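
The file selection for the two deploy commands above can be wrapped in a small helper. This is a minimal sketch: the function name `compose_args` and its `test`/`prod` argument are assumptions made for illustration and are not part of the existing deploy scripts.

```shell
# Hypothetical helper: print the -c arguments for a given profile.
# The base file always comes first so that the overlay is merged on top of it.
compose_args() {
  case "$1" in
    prod) printf '%s\n' "-c docker-stack-infra.yml -c docker-stack-infra.prod.yml" ;;
    test) printf '%s\n' "-c docker-stack-infra.yml" ;;
    *)    echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Print the command instead of running it, so the sketch is safe to try anywhere.
echo "docker stack deploy $(compose_args prod) iklimco"
```

Keeping the order fixed in one place avoids accidentally passing the overlay before the base file, which would reverse which values win.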

Prod-specific changes summary

  • APISIX: 1 → 3 replicas (overlay override)
  • Redis: single-instance → Sentinel cluster — 1 master + 2 replicas + 3 sentinels (overlay adds new services)
  • RabbitMQ: 1 → 3-node Erlang cluster (overlay override + env)
  • Vault: 1 → 3-node Raft cluster (overlay override) — see 07-vault-raft-plan.md
  • No separate APISIX etcd: Patroni etcd is shared (/apisix prefix)
  • init/apisix-core/init.sh: when PROFILE=prod, rate limit policy:local → policy:redis

swag-vl volume — not used in prod, not defined in overlay

Test-env Step 9 adds the swag-vl named volume to the base file. In prod, SWAG mounts its config from the StorageBox via the ${SWAG_CONFIG_DIR} env var, so no service uses this volume. It does not need to be removed in the overlay: Swarm does not create volumes that no service references, so the definition is harmless.

No swag-vl definition is made in docker-stack-infra.prod.yml.

Monitoring Persistence (StorageBox)

Prometheus and Grafana run as single instances. To ensure monitoring data and dashboards survive a node failover (moving from iklim-app-01 to another node), their data is stored on the shared StorageBox:

  • Prometheus: /mnt/storagebox/prometheus/data
  • Grafana: /mnt/storagebox/grafana/data

These paths are mounted via env vars (PROMETHEUS_DATA_DIR, GRAFANA_DATA_DIR) with named-volume fallbacks for test. See Step 8 for implementation details.

Note: PostgreSQL and MongoDB are not in docker-stack-infra.yml. They run in separate stacks on DB nodes (iklim-db and iklim-patroni). See 08-prod-db-cluster-kurulum.md.

Step 1 — Apply all test-env changes first

Follow every step in test-env/03-infra-stack-changes.md:

  • Add swag service
  • Add cert-reloader service
  • Remove published ports for vault, apisix, rabbitmq, prometheus, grafana, apisix-dashboard
  • Add swag-vl volume

Step 2 — Vault: 3-node Raft cluster (prod)

Vault starts directly with 3 replicas; the Phase 1 single-instance stage is skipped in prod. See 07-vault-raft-plan.md Phase 2 for detailed setup steps.

vault:
  deploy:
    mode: replicated
    replicas: 3
    placement:
      constraints:
        - node.labels.type == service
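
The choice of three replicas follows from how Raft works: the cluster stays available only while a majority of its voters is reachable. A small arithmetic sketch of that rule (the function names are illustrative only):

```shell
# Majority (quorum) size for a Raft cluster of N voters: floor(N/2) + 1.
raft_quorum() { echo $(( $1 / 2 + 1 )); }

# Number of voters that can fail while a majority remains.
raft_fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 5; do
  echo "nodes=$n quorum=$(raft_quorum "$n") tolerated_failures=$(raft_fault_tolerance "$n")"
done
```

With 3 replicas the cluster survives the loss of one node; a single instance tolerates none.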

Step 3 — APISIX: 3 replicas + init.sh rate limit update (prod overlay)

Add to docker-stack-infra.prod.yml:

# docker-stack-infra.prod.yml
services:
  apisix:
    deploy:
      mode: replicated
      replicas: 3
      placement:
        constraints:
          - node.labels.type == service

  apisix-dashboard:
    deploy:
      mode: replicated
      replicas: 3
      placement:
        constraints:
          - node.labels.type == service

APISIX and apisix-dashboard are stateless (config lives in Patroni etcd) — 3 replicas is safe. Swarm distributes SWAG requests to APISIX replicas via VIP (IPVS round-robin).

init.sh — rate limit policy:redis (prod)

With policy:local, each APISIX instance counts independently → the global limit effectively becomes 3× with 3 replicas. Switch to policy:redis for PROFILE=prod.
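
The effect of the counter scope can be shown with simple arithmetic. The sketch below assumes traffic is spread so that every replica can reach its own limit; `effective_limit` is an illustrative name, and 300 is the `count` used in the global rule in this step.

```shell
# Worst-case number of requests one client IP can make per time window.
#   $1 = configured count, $2 = APISIX replicas, $3 = policy (local|redis)
effective_limit() {
  case "$3" in
    local) echo $(( $1 * $2 )) ;;  # each replica keeps its own counter
    redis) echo "$1" ;;            # all replicas share one counter in Redis
    *)     echo "unknown policy: $3" >&2; return 1 ;;
  esac
}

echo "local: $(effective_limit 300 3 local) requests per window"
echo "redis: $(effective_limit 300 3 redis) requests per window"
```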

Update the global rate limit block in init/apisix-core/init.sh:

if [[ "$PROFILE" != "dev" ]]; then
  if [[ "$PROFILE" == "prod" ]]; then
    # prod: all APISIX replicas share one counter in Redis
    RATE_POLICY="redis"
    # JSON fragment appended after "policy"; assumes the password contains no " or \ characters
    RATE_REDIS=',"redis_host":"redis","redis_port":6379,"redis_password":"'"$REDIS_PASSWORD"'"'
  else
    # test: single APISIX instance, an in-memory counter is enough
    RATE_POLICY="local"
    RATE_REDIS=""
  fi

  call_api "global rate limit" -X PUT "$APISIX_ADMIN_URL/global_rules/1" \
    -H "X-API-KEY: $API_KEY" -H "Content-Type: application/json" \
    -d '{"plugins":{"limit-count":{"count":300,"time_window":60,"key_type":"var","key":"remote_addr","rejected_code":429,"policy":"'"$RATE_POLICY"'"'"$RATE_REDIS"'}}}'
fi

APISIX's limit-count plugin does not natively support Redis Sentinel; policy:redis works with a single endpoint. The redis service name (the master defined in Step 5) stays constant within Swarm. During a Sentinel failover (~10-30 s) rate limiting may be temporarily inconsistent; this brief disruption is acceptable. Microservices use Spring Data Redis Sentinel natively.

Step 4 — etcd: Separate APISIX etcd removed — Patroni etcd shared

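The standalone etcd service in docker-stack-infra.yml is not used in prod. APISIX uses the 3-node Patroni etcd cluster running on DB nodes, via the /apisix prefix. The standalone service cannot be dropped by the overlay and stays in the stack idle (see the note at the end of this step).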

Why consolidated?

  • A standalone single-instance etcd was a SPOF for APISIX.
  • Patroni etcd is already 3-node HA — APISIX gets a more reliable config store.
  • etcd supports prefix-based namespacing; Patroni uses /service/, APISIX uses /apisix/ — no collision.
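
The no-collision claim holds as long as neither prefix is a prefix of the other. A minimal check of that condition, with `prefixes_collide` as a hypothetical helper name:

```shell
# Return 0 (true) if one etcd key prefix contains the other, 1 otherwise.
prefixes_collide() {
  case "$1" in "$2"*) return 0 ;; esac
  case "$2" in "$1"*) return 0 ;; esac
  return 1
}

if prefixes_collide "/service/" "/apisix/"; then
  echo "collision"
else
  echo "no collision"
fi
```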

APISIX etcd connection configuration

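Point the etcd endpoints of the APISIX service at the DB nodes. The base file must keep working as-is for test, so apply this in the prod overlay (docker-stack-infra.prod.yml) rather than in docker-stack-infra.yml: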

apisix:
  environment:
    APISIX_STAND_ALONE: "false"
  # via apisix/conf/config.yaml or environment:
  # etcd:
  #   host:
  #     - "http://iklim-db-01:2379"
  #     - "http://iklim-db-02:2379"
  #     - "http://iklim-db-03:2379"
  #   prefix: "/apisix"

The preferred method is mounting config.yaml via a Docker config or volume:

# config/apisix/config.yaml
etcd:
  host:
    - "http://iklim-db-01:2379"
    - "http://iklim-db-02:2379"
    - "http://iklim-db-03:2379"
  prefix: "/apisix"
  timeout: 30

Firewall requirement

etcd access from app nodes to DB nodes must be open:

# Each app node → each db node, port 2379
# If inside Hetzner private network it may be open by default;
# verify there are no ufw/firewalld rules blocking it:
nc -zv iklim-db-01 2379
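
The single `nc` check above covers one DB node. The sketch below repeats it for all three and is meant to be run on each app node. It assumes the host names from the config.yaml in this step and an `nc` that supports `-z` and `-w`; the function names are illustrative.

```shell
# Print the etcd hosts, one per line (names taken from config.yaml in this step).
etcd_hosts() {
  printf '%s\n' iklim-db-01 iklim-db-02 iklim-db-03
}

# Try a TCP connect to port 2379 on every DB node and report each result.
# Returns non-zero if at least one node is unreachable.
check_etcd_reachable() {
  failed=0
  for host in $(etcd_hosts); do
    if nc -z -w 3 "$host" 2379 >/dev/null 2>&1; then
      echo "OK   $host:2379"
    else
      echo "FAIL $host:2379"
      failed=1
    fi
  done
  return "$failed"
}

# Run on each app node:
#   check_etcd_reachable
```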

Note: Docker Compose overlay files can only add/override services, not remove them. The standalone etcd service remains in the base stack and runs as an idle container in prod — APISIX connects to Patroni etcd instead (via config.yaml in the prod overlay). This is harmless; etcd uses negligible resources with no active clients.

Step 5 — Redis: Sentinel cluster (prod overlay)

Redis runs as a single instance in test. In prod, Sentinel provides HA. Bitnami images are used — all configuration is done via env vars, no separate .conf file needed.

Prerequisites

# Create Docker secret for Redis password:
openssl rand -hex 32 | docker secret create redis_password -

Topology

iklim-app-01: redis (master)  (1 replica, pinned to app-01)
iklim-app-02: redis-replica   ┐ 2 replicas, spread across app nodes by preference
iklim-app-03: redis-replica   ┘ (soft preference, not a hard pin)
iklim-app-01: redis-sentinel  ┐
iklim-app-02: redis-sentinel  ├─ 3 replicas, spread across all app nodes
iklim-app-03: redis-sentinel  ┘

docker-stack-infra.prod.yml — Redis services

The existing redis service is overridden in the prod overlay as master; redis-replica and redis-sentinel are added as new services. The service name (redis) remains unchanged so the APISIX connection config does not need updating.

# docker-stack-infra.prod.yml
services:
  redis:                          # override base single-instance redis → master
    image: bitnamisecure/redis:latest
    environment:
      ALLOW_EMPTY_PASSWORD: "no"
      REDIS_PASSWORD: ${REDIS_PASSWORD}
      REDIS_REPLICATION_MODE: master
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-app-01
      restart_policy:
        condition: any
        delay: 5s
    labels:
      project: co.iklim

  redis-replica:
    image: bitnamisecure/redis:latest
    environment:
      ALLOW_EMPTY_PASSWORD: "no"
      REDIS_REPLICATION_MODE: slave
      REDIS_MASTER_HOST: redis
      REDIS_MASTER_PORT_NUMBER: "6379"
      REDIS_MASTER_PASSWORD: ${REDIS_PASSWORD}
      REDIS_PASSWORD: ${REDIS_PASSWORD}
    deploy:
      mode: replicated
      replicas: 2
      placement:
        constraints:
          - node.labels.type == service
        preferences:
          - spread: node.hostname
      restart_policy:
        condition: any
        delay: 5s
    labels:
      project: co.iklim

  redis-sentinel:
    image: bitnamisecure/redis-sentinel:latest
    environment:
      REDIS_SENTINEL_MASTER_NAME: mymaster
      REDIS_MASTER_HOST: redis
      REDIS_MASTER_PORT_NUMBER: "6379"
      REDIS_MASTER_PASSWORD: ${REDIS_PASSWORD}
      REDIS_SENTINEL_QUORUM: "2"
      REDIS_SENTINEL_DOWN_AFTER_MILLISECONDS: "5000"
      REDIS_SENTINEL_FAILOVER_TIMEOUT: "10000"
    deploy:
      mode: replicated
      replicas: 3
      placement:
        constraints:
          - node.labels.type == service
        preferences:
          - spread: node.hostname
      restart_policy:
        condition: any
        delay: 5s
    labels:
      project: co.iklim
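
REDIS_SENTINEL_QUORUM is only one of two thresholds. The quorum decides when the master is considered down, while the failover itself must be authorized by a majority of all Sentinels. The sketch below checks both conditions; it assumes every reachable Sentinel agrees that the master is down, and the function name is illustrative.

```shell
# Can a failover be started and authorized?
#   $1 = total Sentinels, $2 = Sentinels still reachable, $3 = configured quorum
# Two conditions must hold:
#   - at least `quorum` Sentinels agree that the master is down
#   - a majority of all Sentinels is reachable to authorize the failover
sentinel_can_failover() {
  majority=$(( $1 / 2 + 1 ))
  [ "$2" -ge "$3" ] && [ "$2" -ge "$majority" ]
}

for alive in 3 2 1; do
  if sentinel_can_failover 3 "$alive" 2; then
    echo "sentinels alive=$alive: failover possible"
  else
    echo "sentinels alive=$alive: failover blocked"
  fi
done
```

With 3 Sentinels and quorum 2, losing one app node still allows a failover; losing two does not.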

Microservice connection (Spring Data Redis)

Microservices must use a Sentinel-aware connection:

# application-prod.yml
spring:
  data:
    redis:
      sentinel:
        master: mymaster
        nodes:
          - redis-sentinel:26379
      password: ${REDIS_PASSWORD}

Verification

# Query master identity:
docker exec $(docker ps -q -f name=iklimco_redis-sentinel | head -1) \
  redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster
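
The reply to the query above can be turned into a single host:port string for scripts. This sketch assumes redis-cli prints its raw format (one value per line), which is what it does when its output is not a terminal, for example under docker exec without -t. `master_addr_from_reply` is a hypothetical helper name.

```shell
# Read the two-line reply of SENTINEL get-master-addr-by-name (host, then port)
# from stdin and print it as host:port.
master_addr_from_reply() {
  read -r host || return 1
  read -r port || return 1
  [ -n "$host" ] && [ -n "$port" ] || return 1
  echo "$host:$port"
}

# Example with a canned reply (the address is a made-up sample value):
printf '10.0.1.5\n6379\n' | master_addr_from_reply
```

In practice the input would be piped from the docker exec command shown above.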

Step 6 — RabbitMQ: 3-node Erlang cluster (prod overlay)

RabbitMQ runs as a 3-node cluster with one instance per app node.

Prerequisites

# Create Docker secret for Erlang cookie (must be identical on all nodes):
openssl rand -hex 32 | docker secret create rabbitmq_erlang_cookie -

docker-stack-infra.prod.yml — RabbitMQ override

# docker-stack-infra.prod.yml (add alongside redis services)
services:
  rabbitmq:
    image: rabbitmq:3-management
    hostname: "rabbitmq-{{.Node.Hostname}}"
    environment:
      RABBITMQ_ERLANG_COOKIE_FILE: /run/secrets/rabbitmq_erlang_cookie
      RABBITMQ_USE_LONGNAME: "true"
      RABBITMQ_NODENAME: "rabbit@rabbitmq-{{.Node.Hostname}}"
    secrets:
      - rabbitmq_erlang_cookie
    deploy:
      mode: replicated
      replicas: 3
      placement:
        constraints:
          - node.labels.type == service
      update_config:
        parallelism: 1
        order: stop-first
    labels:
      project: co.iklim

secrets:
  rabbitmq_erlang_cookie:
    external: true

Cluster join procedure (first setup)

RabbitMQ nodes do not form a cluster automatically; manual join is required after first start:

# Find the RabbitMQ container on iklim-app-02:
CTR=$(docker ps -q -f name=iklimco_rabbitmq)

# Stop, join, start:
docker exec "$CTR" rabbitmqctl stop_app
docker exec "$CTR" rabbitmqctl join_cluster rabbit@rabbitmq-iklim-app-01
docker exec "$CTR" rabbitmqctl start_app

# Repeat for iklim-app-03
# Verify cluster status (from any node):
docker exec "$CTR" rabbitmqctl cluster_status
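
The join sequence is the same on every joining node; only the container ID differs. The sketch below prints the commands rather than running them, so the output can be reviewed first. The function names are illustrative, and the node name follows the RABBITMQ_NODENAME template from the overlay.

```shell
# Derive the RabbitMQ node name from a Swarm node hostname, following the
# RABBITMQ_NODENAME template used in the overlay.
rabbit_nodename() { echo "rabbit@rabbitmq-$1"; }

# Print (not execute) the join sequence for one container.
#   $1 = container ID on the joining node, $2 = hostname of the seed node
print_join_commands() {
  seed=$(rabbit_nodename "$2")
  echo "docker exec $1 rabbitmqctl stop_app"
  echo "docker exec $1 rabbitmqctl join_cluster $seed"
  echo "docker exec $1 rabbitmqctl start_app"
}

# "<container-id>" is a placeholder; substitute the ID found with docker ps.
print_join_commands "<container-id>" iklim-app-01
```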

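HA policy: After the cluster is formed, make quorum queues the default. Queue type is fixed when a queue is declared and is not a policy key, so set it as the default queue type of the virtual host instead (shown for the default vhost "/"). Existing queues keep their type; the default applies to queues declared afterwards without an explicit x-queue-type.

docker exec "$CTR" rabbitmqctl update_vhost_metadata / \
  --default-queue-type quorum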

Step 7 — Create docker-stack-infra.prod.yml

Create this file in the repo root alongside docker-stack-infra.yml. It combines all prod-specific overrides from Steps 2-6:

# docker-stack-infra.prod.yml
# Prod overlay — deploy with:
#   docker stack deploy -c docker-stack-infra.yml -c docker-stack-infra.prod.yml iklimco

services:

  vault:
    environment:
      VAULT_LOCAL_CONFIG: >-
        {"api_addr":"https://vault.iklim.co:8200",
         "cluster_addr":"https://{{ .Node.Hostname }}:8201",
         "storage":{"raft":{"path":"/vault/file","node_id":"{{ .Node.Hostname }}"}},
         "listener":[{"tcp":{"address":"0.0.0.0:8200",
           "tls_cert_file":"/vault/certs/STAR.iklim.co.full.crt",
           "tls_key_file":"/vault/certs/STAR.iklim.co_key.pem"}}],
         "default_lease_ttl":"168h","max_lease_ttl":"720h","ui":true}
    volumes:
      - /opt/iklimco/vault/data:/vault/file
      - /mnt/storagebox/ssl:/vault/certs:ro
    deploy:
      mode: replicated
      replicas: 3
      placement:
        constraints:
          - node.labels.type == service

  apisix:
    deploy:
      mode: replicated
      replicas: 3
      placement:
        constraints:
          - node.labels.type == service

  apisix-dashboard:
    deploy:
      mode: replicated
      replicas: 3
      placement:
        constraints:
          - node.labels.type == service

  redis:
    image: bitnamisecure/redis:latest
    environment:
      ALLOW_EMPTY_PASSWORD: "no"
      REDIS_PASSWORD: ${REDIS_PASSWORD}
      REDIS_REPLICATION_MODE: master
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-app-01
      restart_policy:
        condition: any
        delay: 5s
    labels:
      project: co.iklim

  redis-replica:
    image: bitnamisecure/redis:latest
    environment:
      ALLOW_EMPTY_PASSWORD: "no"
      REDIS_REPLICATION_MODE: slave
      REDIS_MASTER_HOST: redis
      REDIS_MASTER_PORT_NUMBER: "6379"
      REDIS_MASTER_PASSWORD: ${REDIS_PASSWORD}
      REDIS_PASSWORD: ${REDIS_PASSWORD}
    deploy:
      mode: replicated
      replicas: 2
      placement:
        constraints:
          - node.labels.type == service
        preferences:
          - spread: node.hostname
      restart_policy:
        condition: any
        delay: 5s
    labels:
      project: co.iklim

  redis-sentinel:
    image: bitnamisecure/redis-sentinel:latest
    environment:
      REDIS_SENTINEL_MASTER_NAME: mymaster
      REDIS_MASTER_HOST: redis
      REDIS_MASTER_PORT_NUMBER: "6379"
      REDIS_MASTER_PASSWORD: ${REDIS_PASSWORD}
      REDIS_SENTINEL_QUORUM: "2"
      REDIS_SENTINEL_DOWN_AFTER_MILLISECONDS: "5000"
      REDIS_SENTINEL_FAILOVER_TIMEOUT: "10000"
    deploy:
      mode: replicated
      replicas: 3
      placement:
        constraints:
          - node.labels.type == service
        preferences:
          - spread: node.hostname
      restart_policy:
        condition: any
        delay: 5s
    labels:
      project: co.iklim

  rabbitmq:
    image: rabbitmq:3-management
    hostname: "rabbitmq-{{.Node.Hostname}}"
    environment:
      RABBITMQ_ERLANG_COOKIE_FILE: /run/secrets/rabbitmq_erlang_cookie
      RABBITMQ_USE_LONGNAME: "true"
      RABBITMQ_NODENAME: "rabbit@rabbitmq-{{.Node.Hostname}}"
    secrets:
      - rabbitmq_erlang_cookie
    deploy:
      mode: replicated
      replicas: 3
      placement:
        constraints:
          - node.labels.type == service
      update_config:
        parallelism: 1
        order: stop-first
    labels:
      project: co.iklim

secrets:
  rabbitmq_erlang_cookie:
    external: true

Step 8 — Monitoring Data Persistence (StorageBox)

Prometheus and Grafana run as single instances. Without persistent storage, data is lost on node failover. This step mounts their data directories from the StorageBox shared filesystem.

Changes already applied to docker-stack-infra.yml:

prometheus:
  volumes:
    - ${PROMETHEUS_DATA_DIR:-prometheus-vl}:/prometheus

grafana:
  volumes:
    - ${GRAFANA_DATA_DIR:-grafana-vl}:/var/lib/grafana

Test uses the named Docker volume fallbacks (prometheus-vl, grafana-vl) — no test env change needed.
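
The `${VAR:-default}` form in the volume entries follows the same rule as shell parameter expansion: the default is used when the variable is unset or empty. A small shell demonstration of that behaviour (the function name is illustrative):

```shell
# Resolve the Prometheus volume source the same way the compose file does.
prometheus_source() {
  echo "${PROMETHEUS_DATA_DIR:-prometheus-vl}"
}

# Test: variable not set, so the named volume is used.
unset PROMETHEUS_DATA_DIR
echo "test: $(prometheus_source)"

# Prod: variable set in .env.prod, so the StorageBox path is used.
PROMETHEUS_DATA_DIR=/mnt/storagebox/prometheus/data
echo "prod: $(prometheus_source)"
```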

Add to prod/secrets/iklim.co/.env.prod on storagebox (already in env-prod/.env):

PROMETHEUS_DATA_DIR=/mnt/storagebox/prometheus/data
GRAFANA_DATA_DIR=/mnt/storagebox/grafana/data

Create directories on StorageBox before first prod deploy:

mkdir -p /mnt/storagebox/prometheus/data /mnt/storagebox/grafana/data

Grafana writes its SQLite database and dashboard JSON to /var/lib/grafana. Prometheus writes its TSDB to /prometheus. Both directories must exist before the stack starts.
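
A pre-deploy check for the directory requirement could look like the sketch below. `check_data_dirs` is a hypothetical helper; it tests writability as the current user only, so ownership may still need adjusting if the containers run as a different user.

```shell
# Verify that every given directory exists and is writable.
# Prints one line per problem and returns non-zero if any check fails.
check_data_dirs() {
  status=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "missing: $dir"
      status=1
    elif [ ! -w "$dir" ]; then
      echo "not writable: $dir"
      status=1
    fi
  done
  return "$status"
}

# Before the first prod deploy (paths from this step):
#   check_data_dirs /mnt/storagebox/prometheus/data /mnt/storagebox/grafana/data
```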

Step 9 — Verify

# Base file must be valid on its own (test deploy):
docker stack config -c docker-stack-infra.yml > /dev/null && echo "base OK"

# Prod merge must be valid:
docker stack config -c docker-stack-infra.yml -c docker-stack-infra.prod.yml > /dev/null && echo "prod merge OK"

Placement and Replica Summary — prod

| Service | File | Replicas | Placement | HA Note |
| --- | --- | --- | --- | --- |
| swag | base | 1 | node.hostname == iklim-app-01 | No clustering support; Floating IP pinned to node |
| cert-reloader | base | 1 | node.hostname == iklim-app-01 | Cron-style task; duplicate would be problematic |
| vault | prod overlay | 3 | node.labels.type == service | Raft cluster; see 07-vault-raft-plan.md |
| apisix | prod overlay | 3 | node.labels.type == service | Stateless; config in Patroni etcd; rate limit policy:redis |
| apisix-dashboard | prod overlay | 3 | node.labels.type == service | Stateless; reads from etcd |
| redis (master) | prod overlay | 1 | node.hostname == iklim-app-01 | Sentinel cluster master |
| redis-replica | prod overlay | 2 | node.labels.type == service | Sentinel replica; spread:hostname |
| redis-sentinel | prod overlay | 3 | node.labels.type == service | Quorum=2; failover automatic |
| rabbitmq | prod overlay | 3 | node.labels.type == service | Erlang cluster; quorum queues |
| etcd | base | 1 | node.labels.type == service | Idle in prod; APISIX uses Patroni etcd; standalone service remains in base stack |
| prometheus | base | 1 | node.labels.type == service | No native HA; Thanos is overkill at this scale |
| grafana | base | 1 | node.labels.type == service | Not critical |

PostgreSQL and MongoDB run in separate DB stacks on iklim-db-* nodes. See 08-prod-db-cluster-kurulum.md.

etcd: 3-node cluster on DB nodes; APISIX shares it via the /apisix prefix.