Environment_Infrastructure/roadmap/prod-env/03-infra-stack-changes.md
Murat ÖZDEMİR 76f87aa2f9 Integrate DB nodes into Swarm and refine prod service deployment
- Database nodes now join the Docker Swarm as workers with `role=db` labels, allowing Swarm to manage their dedicated services.
- The `docker-stack-infra.yml` has been updated for production to focus solely on application-level infrastructure components.
- Dedicated database services (PostgreSQL, MongoDB, Patroni-etcd) are now explicitly deployed in separate Swarm stacks on `iklim-db-XX` nodes.
- Standardizes node naming conventions (`iklim-app-XX`, `iklim-db-XX`) across the production roadmap documentation.
- Clarifies that the `etcd` service within `docker-stack-infra.yml` is exclusively for APISIX configuration, distinct from Patroni's etcd cluster.
2026-05-11 14:53:21 +03:00

03 — docker-stack-infra.yml Changes (Prod)

Context

  • File: docker-stack-infra.yml (repo root — shared between test and prod)
  • All changes from test-env/03-infra-stack-changes.md apply here identically.
  • Additional prod-specific changes:
    • Microservices have no placement constraint; Swarm distributes them across the app nodes.
    • Replica counts for stateless services are increased.
  • Note: PostgreSQL and MongoDB are not in docker-stack-infra.yml for prod. They run on dedicated DB nodes in separate stacks (iklim-db and iklim-patroni). See 08-prod-db-cluster-kurulum.md.
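
As an illustration of the "no constraint" rule, the fragment below sketches how a stateless microservice might look in a prod stack file. The service name `example-service`, its image, and the replica count are hypothetical placeholders, not values from this repo. The commented-out constraint is also an assumption: because the DB nodes are Swarm workers labelled `role=db`, an unconstrained service can in principle be scheduled on them unless something prevents it, and a label-based exclusion is one way to rule that out.

```yaml
services:
  example-service:               # hypothetical name, for illustration only
    image: registry.example.com/example-service:latest   # placeholder image
    deploy:
      mode: replicated
      replicas: 2                # assumed value; "increased" relative to test
      # No placement block: Swarm is free to choose the node.
      # Optional (assumption): keep tasks off nodes labelled role=db.
      # placement:
      #   constraints:
      #     - node.labels.role != db
```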

Step 1 — Apply all test-env changes first

Follow every step in test-env/03-infra-stack-changes.md:

  • Add swag service
  • Add cert-reloader service
  • Remove published ports for vault, apisix, rabbitmq, prometheus, grafana, apisix-dashboard
  • Add swag-vl volume
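
The resulting shape of the stack file is roughly as sketched below. The authoritative definitions are in test-env/03-infra-stack-changes.md; the published ports on swag, the `/config` mount path, and the former Vault port mapping shown in the comment are assumptions made for illustration, and cert-reloader is omitted for brevity.

```yaml
services:
  swag:
    # image, environment and networks: see the test-env document
    ports:                     # assumption: SWAG is the only service that publishes ports
      - "80:80"
      - "443:443"
    volumes:
      - swag-vl:/config        # assumed mount path

  vault:
    # (other keys unchanged)
    # ports:                   # removed: no longer published on the host
    #   - "8200:8200"          # assumed former mapping (8200 is Vault's default port)
    deploy:
      placement:
        constraints:
          - node.role == manager

volumes:
  swag-vl:
```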

Step 2 — Pin Vault to manager node (initial prod — single instance)

Vault starts as a single instance pinned to the manager node. Raft cluster migration is handled separately in 07-vault-raft-plan.md.

# Vault placement stays as:
      placement:
        constraints:
          - node.role == manager

Step 3 — Increase APISIX replicas for prod

# CHANGE in apisix service deploy block:
      mode: replicated
      replicas: 2      # was 1

APISIX is stateless (its configuration lives in etcd), so running multiple replicas is safe. Swarm load-balances SWAG's requests across the APISIX replicas through the service's virtual IP (VIP).
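
For context, a fuller version of the deploy block could look like the sketch below. Only `mode` and `replicas` come from this step. `endpoint_mode: vip` is Swarm's default and is spelled out purely for clarity, and the `update_config` values are assumptions added to show how a rolling update could keep one replica serving while the other is replaced.

```yaml
services:
  apisix:
    deploy:
      mode: replicated
      replicas: 2              # was 1
      endpoint_mode: vip       # Swarm default: one virtual IP in front of all replicas
      update_config:           # assumed values, not taken from the repo
        parallelism: 1         # update one replica at a time
        delay: 10s             # pause between replicas
        order: start-first     # start the new task before stopping the old one
```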

Step 4 — etcd: single instance in docker-stack-infra.yml (APISIX config store only)

The etcd service in docker-stack-infra.yml is used exclusively by APISIX as its configuration store. It runs as a single instance on a manager node and is separate from the etcd cluster used by Patroni for PostgreSQL HA.

# etcd placement stays as:
      placement:
        constraints:
          - node.role == manager

The 3-node etcd cluster for Patroni/PostgreSQL HA is deployed separately via 08-prod-db-cluster-kurulum.md on the dedicated DB nodes. These are two independent etcd deployments with different purposes.
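
To make the separation concrete, the two deployments might be laid out as sketched below. The first document reflects the placement shown in this step. The second is purely illustrative: the service name `patroni-etcd-01`, the per-member layout, and the hostname `iklim-db-01` (following the `iklim-db-XX` convention) are hypothetical, and the `role=db` label constraint is an assumption about how the DB stack targets its nodes. The real definition is in 08-prod-db-cluster-kurulum.md.

```yaml
# docker-stack-infra.yml: etcd for APISIX configuration only
services:
  etcd:
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager
---
# Separate DB stack (hypothetical sketch): one etcd member of the Patroni cluster
services:
  patroni-etcd-01:             # hypothetical service name
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.role == db          # assumes DB nodes carry the role=db label
          - node.hostname == iklim-db-01    # hypothetical node name
```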

Step 5 — Verify the complete file

After all edits, validate the YAML:

docker stack config -c docker-stack-infra.yml > /dev/null && echo "YAML valid"

If the command prints no errors and ends with "YAML valid", the file is valid.

Placement summary for prod (docker-stack-infra.yml only)

| Service              | Placement                                       |
|----------------------|-------------------------------------------------|
| swag                 | node.role == manager                            |
| cert-reloader        | node.role == manager                            |
| vault                | node.role == manager                            |
| apisix (2 replicas)  | no constraint (distributed across app nodes)    |
| apisix-dashboard     | no constraint                                   |
| redis                | node.role == manager                            |
| rabbitmq             | node.role == manager                            |
| etcd (APISIX store)  | node.role == manager                            |
| prometheus           | node.role == manager                            |
| grafana              | node.role == manager                            |

PostgreSQL and MongoDB are deployed in separate DB stacks on iklim-db-* nodes. See 08-prod-db-cluster-kurulum.md for those stacks.