Environment_Infrastructure/setup/08-prod-db-cluster-kurulum.md
Murat ÖZDEMİR 8780c7c05e docs(db): implement direct cluster access strategy for production
- Updated roadmap (03-infra-stack-changes.md) to deprecate database proxies in prod.
- Detailed direct subnet access via WireGuard for production developers.
- Provided multi-host connection parameters for Patroni and MongoDB Replica Sets in setup guide (08-prod-db-cluster-kurulum.md).
- Added environment comparison table to developer access guide.
2026-05-18 14:25:26 +03:00


08 - Prod DB Cluster Setup (Swarm)

The purpose of this phase is to add the three DB nodes to Docker Swarm as workers and configure the MongoDB replica set and the PostgreSQL high-availability setup managed with Patroni + etcd.

07-prod-ansible-bootstrap.md must be completed on all DB nodes.

Architecture

iklim-app-01/02/03  (Swarm managers, 10.20.10.11/12/13)
        |
        |-- iklimco-net (overlay)
        |
iklim-db-01  (Swarm worker, 10.20.20.11)
    mongodb-01   [rs0 member 0 — preferred primary]
    etcd-01      [etcd cluster member]
    patroni-01   [Patroni + PostgreSQL — first primary candidate]

iklim-db-02  (Swarm worker, 10.20.20.12)
    mongodb-02   [rs0 member 1]
    etcd-02      [etcd cluster member]
    patroni-02   [Patroni + PostgreSQL — standby]

iklim-db-03  (Swarm worker, 10.20.20.13)
    mongodb-03   [rs0 member 2]
    etcd-03      [etcd cluster member]
    patroni-03   [Patroni + PostgreSQL — standby]

DB containers discover each other through Hetzner private IPs, not overlay DNS names. Therefore, each service publishes its port in host mode; replication and etcd traffic goes directly through the private network. The Hetzner Cloud firewall and the prod db firewall already allow these ports.
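The port layout described above can be probed from any machine on the private network. The following is a minimal sketch rather than part of the original setup: `db_endpoints` and `check_endpoint` are hypothetical helper names, the port list is taken from the stack files in this guide, and the probe assumes bash (for /dev/tcp) and the `timeout` utility are available.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: enumerate and probe the host-mode ports used in this guide.

DB_NODES="10.20.20.11 10.20.20.12 10.20.20.13"
# MongoDB, PostgreSQL, Patroni REST API, etcd client, etcd peer
DB_PORTS="27017 5432 8008 2379 2380"

# Print one "ip:port" pair per line for every DB node and port.
db_endpoints() {
  local ip port
  for ip in $DB_NODES; do
    for port in $DB_PORTS; do
      echo "${ip}:${port}"
    done
  done
}

# Return 0 if a TCP connection to "ip:port" succeeds within 3 seconds.
check_endpoint() {
  local ip="${1%%:*}" port="${1##*:}"
  timeout 3 bash -c "exec 3<>/dev/tcp/${ip}/${port}" 2>/dev/null
}

# Usage (on a node inside the private network):
#   db_endpoints | while read -r ep; do
#     check_endpoint "$ep" && echo "open   $ep" || echo "closed $ep"
#   done
```

Running the same loop from a machine on the public internet should report every endpoint as closed.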

1. Firewall Update

Verify that the following rules exist in terraform/hetzner/prod/firewall.tf; if any are missing, add them and run terraform apply.

Inside hcloud_firewall.swarm, from the DB subnet to Swarm ports:

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2377"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm control plane from DB subnet"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "7946"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm node discovery (TCP) from DB subnet"
}

rule {
  direction   = "in"
  protocol    = "udp"
  port        = "7946"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm node discovery (UDP) from DB subnet"
}

rule {
  direction   = "in"
  protocol    = "udp"
  port        = "4789"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm VXLAN overlay from DB subnet"
}

Inside hcloud_firewall.db, from the app subnet to Swarm ports + overlay, and etcd/Patroni traffic inside the DB subnet:

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2377"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm control plane from app subnet"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "7946"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm node discovery (TCP) from app subnet"
}

rule {
  direction   = "in"
  protocol    = "udp"
  port        = "7946"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm node discovery (UDP) from app subnet"
}

rule {
  direction   = "in"
  protocol    = "udp"
  port        = "4789"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm VXLAN overlay from app subnet"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2379"
  source_ips  = [local.db_subnet_cidr]
  description = "etcd client port within DB subnet"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2379"
  source_ips  = [local.app_subnet_cidr]
  description = "etcd client port from app subnet (APISIX connects to Patroni etcd)"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2380"
  source_ips  = [local.db_subnet_cidr]
  description = "etcd peer port within DB subnet"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "8008"
  source_ips  = [local.db_subnet_cidr]
  description = "Patroni REST API within DB subnet"
}

cd terraform/hetzner/prod
terraform plan
terraform apply
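To confirm that no firewall rule is still missing, the plan can be run with Terraform's `-detailed-exitcode` flag, which exits with 0 when there are no changes, 1 on error, and 2 when changes are pending. The wrapper below is a small sketch; `plan_status` is a hypothetical helper name.

```shell
# Hypothetical helper: map the exit code of
# 'terraform plan -detailed-exitcode' to a readable message.
plan_status() {
  case "$1" in
    0) echo "firewall up to date" ;;
    2) echo "changes pending" ;;
    *) echo "plan failed" ;;
  esac
}

# Usage (in terraform/hetzner/prod):
#   terraform plan -detailed-exitcode >/dev/null; plan_status "$?"
```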

2. Add DB Nodes to Swarm

Get the worker join token on one of the Swarm managers (iklim-app-01):

docker swarm join-token worker

On each DB node (iklim-db-01, iklim-db-02, iklim-db-03):

docker swarm join --token <TOKEN> 10.20.10.11:2377

Label the nodes on iklim-app-01:

docker node update --label-add role=db --label-add db-index=01 iklim-db-01
docker node update --label-add role=db --label-add db-index=02 iklim-db-02
docker node update --label-add role=db --label-add db-index=03 iklim-db-03

docker node ls
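The `docker node ls` output can also be checked mechanically. This sketch assumes the three DB hostnames used in this guide; `db_nodes_ready` is a hypothetical helper that reads "hostname status availability" lines produced with the `--format` option.

```shell
# Hypothetical helper: succeed only if iklim-db-01, -02 and -03 all appear
# on stdin as "Ready Active".
db_nodes_ready() {
  local ready
  ready=$(grep -E '^iklim-db-0[123] Ready Active$' | sort -u | wc -l)
  [ "$ready" -eq 3 ]
}

# Usage (on iklim-app-01):
#   docker node ls --format '{{.Hostname}} {{.Status}} {{.Availability}}' \
#     | db_nodes_ready && echo "all DB nodes joined"
```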

3. StorageBox Directory Structure

On each DB node, where /mnt/storagebox must already be mounted:

# On iklim-db-01:
mkdir -p /mnt/storagebox/prod/db/mongodb-01/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-01/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-01/data

# On iklim-db-02:
mkdir -p /mnt/storagebox/prod/db/mongodb-02/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-02/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-02/data

# On iklim-db-03:
mkdir -p /mnt/storagebox/prod/db/mongodb-03/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-03/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-03/data
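Since the three command groups differ only in the node index, the directory list can be generated from it. A minimal sketch, with `db_dirs` as a hypothetical helper name and the base path taken from the commands above:

```shell
# Hypothetical helper: print the StorageBox directories for one DB node
# index (01, 02 or 03). The base path can be overridden as a second argument.
db_dirs() {
  local idx="$1" base="${2:-/mnt/storagebox/prod/db}" sub
  for sub in data log config; do echo "${base}/mongodb-${idx}/${sub}"; done
  for sub in data config; do echo "${base}/postgresql-${idx}/${sub}"; done
  echo "${base}/etcd-${idx}/data"
}

# Usage on iklim-db-02:
#   db_dirs 02 | xargs mkdir -p
```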

4. MongoDB Replica Set

mongod.conf

On each DB node, /mnt/storagebox/prod/db/mongodb-0X/config/mongod.conf:

net:
  port: 27017
storage:
  engine: "wiredTiger"
  dbPath: "/data/db"
  directoryPerDB: true
systemLog:
  verbosity: 0
  timeStampFormat: "iso8601-local"
  destination: file
  path: "/data/log/mongo.log"
  logAppend: true
  logRotate: rename
replication:
  replSetName: "rs0"
security:
  authorization: enabled
  keyFile: "/data/configdb/rs-auth.key"

Replica Set Auth Key

The same key file must exist on all DB nodes:

# Create on iklim-db-01:
openssl rand -base64 756 > /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key
chmod 400 /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key

# Copy the same content to the other nodes:
cat /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key \
  > /mnt/storagebox/prod/db/mongodb-02/config/rs-auth.key
cat /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key \
  > /mnt/storagebox/prod/db/mongodb-03/config/rs-auth.key

chmod 400 /mnt/storagebox/prod/db/mongodb-0{2,3}/config/rs-auth.key
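Replica set members reject each other if their key files differ, so it is worth comparing checksums after copying. The sketch below assumes `sha256sum` from coreutils; `same_checksum` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if every given file is readable and
# all of them have the same SHA-256 digest.
same_checksum() {
  local f unique
  for f in "$@"; do
    [ -r "$f" ] || return 1
  done
  unique=$(sha256sum "$@" | awk '{print $1}' | sort -u | wc -l)
  [ "$unique" -eq 1 ]
}

# Usage:
#   same_checksum /mnt/storagebox/prod/db/mongodb-0{1,2,3}/config/rs-auth.key \
#     && echo "key files match"
```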

Stack File — MongoDB

/opt/iklimco/stacks/prod-db-mongo.yml:

version: "3.8"

networks:
  iklimco-net:
    external: true

services:
  mongodb-01:
    image: mongo:8
    environment:
      MONGO_INITDB_ROOT_USERNAME: mongo-root
      MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
    volumes:
      - /mnt/storagebox/prod/db/mongodb-01/data:/data/db
      - /mnt/storagebox/prod/db/mongodb-01/log:/data/log
      - /mnt/storagebox/prod/db/mongodb-01/config:/data/configdb
    networks:
      - iklimco-net
    ports:
      - target: 27017
        published: 27017
        protocol: tcp
        mode: host
    command: ["--config", "/data/configdb/mongod.conf"]
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-01
      restart_policy:
        condition: on-failure

  mongodb-02:
    image: mongo:8
    environment:
      MONGO_INITDB_ROOT_USERNAME: mongo-root
      MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
    volumes:
      - /mnt/storagebox/prod/db/mongodb-02/data:/data/db
      - /mnt/storagebox/prod/db/mongodb-02/log:/data/log
      - /mnt/storagebox/prod/db/mongodb-02/config:/data/configdb
    networks:
      - iklimco-net
    ports:
      - target: 27017
        published: 27017
        protocol: tcp
        mode: host
    command: ["--config", "/data/configdb/mongod.conf"]
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-02
      restart_policy:
        condition: on-failure

  mongodb-03:
    image: mongo:8
    environment:
      MONGO_INITDB_ROOT_USERNAME: mongo-root
      MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
    volumes:
      - /mnt/storagebox/prod/db/mongodb-03/data:/data/db
      - /mnt/storagebox/prod/db/mongodb-03/log:/data/log
      - /mnt/storagebox/prod/db/mongodb-03/config:/data/configdb
    networks:
      - iklimco-net
    ports:
      - target: 27017
        published: 27017
        protocol: tcp
        mode: host
    command: ["--config", "/data/configdb/mongod.conf"]
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-03
      restart_policy:
        condition: on-failure

Replica Set Initialization

Run once after the stack is deployed:

# On iklim-db-01:
docker exec -it $(docker ps -q -f name=iklim-db_mongodb-01) mongosh \
  -u mongo-root -p "${MONGO_ROOT_PASSWORD}" --authenticationDatabase admin

# Inside mongosh:
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "10.20.20.11:27017", priority: 2 },
    { _id: 1, host: "10.20.20.12:27017", priority: 1 },
    { _id: 2, host: "10.20.20.13:27017", priority: 1 }
  ]
})

# Status check:
rs.status()

The replica set is ready when "stateStr": "PRIMARY" and two "SECONDARY" entries are visible.
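The readiness condition above can be scripted by printing one member state per line and counting them. This is a sketch only: `rs_healthy` is a hypothetical helper, and the usage line assumes `MONGO_ROOT_PASSWORD` is set in the shell on iklim-db-01.

```shell
# Hypothetical helper: read one replica set member state per line on stdin
# and succeed only with exactly 1 PRIMARY and 2 SECONDARY members.
rs_healthy() {
  local states primaries secondaries
  states=$(cat)
  primaries=$(printf '%s\n' "$states" | grep -c '^PRIMARY$' || true)
  secondaries=$(printf '%s\n' "$states" | grep -c '^SECONDARY$' || true)
  [ "$primaries" -eq 1 ] && [ "$secondaries" -eq 2 ]
}

# Usage on iklim-db-01:
#   docker exec "$(docker ps -q -f name=iklim-db_mongodb-01)" mongosh --quiet \
#     -u mongo-root -p "${MONGO_ROOT_PASSWORD}" --authenticationDatabase admin \
#     --eval 'rs.status().members.forEach(m => print(m.stateStr))' \
#     | rs_healthy && echo "rs0 ready"
```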

5. PostgreSQL — Patroni + etcd

Patroni coordinates PostgreSQL primary/standby roles through etcd. If the primary goes down, one of the other nodes automatically wins the election and becomes primary. The Swarm service restarts the container; Patroni continues from where it left off.

5.1 Custom Image (Patroni + PostGIS)

Patroni is installed on top of the postgis/postgis:17-3.5 image. This image is pushed to Harbor and used in the stack.

Environment_Infrastructure/docker/patroni-postgis/Dockerfile:

FROM postgis/postgis:17-3.5

USER root

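# On Debian 12+ based images pip refuses system-wide installs (PEP 668)
# unless --break-system-packages is passed. The psycopg2 extra installs the
# PostgreSQL driver Patroni needs; gcc and libpq-dev are only there to build it.
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3-pip \
    python3-dev \
    gcc \
    libpq-dev \
    && pip3 install --no-cache-dir --break-system-packages 'patroni[etcd3,psycopg2]' \
    && apt-get purge -y gcc python3-dev \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/*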

USER postgres

ENTRYPOINT ["patroni", "/etc/patroni/patroni.yml"]

Build and push the image with ops/push-harbor-custom-images.sh, or run the commands below:

cd Environment_Infrastructure/docker/patroni-postgis
docker build -t registry.tarla.io/iklimco/patroni-postgis:17-3.5 .
echo "$HARBOR_CI_TOKEN" | docker login registry.tarla.io -u robot-ci-push-iklimco --password-stdin
docker push registry.tarla.io/iklimco/patroni-postgis:17-3.5

5.2 etcd Cluster

Stack File — etcd

/opt/iklimco/stacks/prod-db-etcd.yml:

version: "3.8"

networks:
  iklimco-net:
    external: true

services:
  etcd-01:
    image: bitnami/etcd:3
    environment:
      ALLOW_NONE_AUTHENTICATION: "yes"
      ETCD_NAME: etcd-01
      ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.11:2380
      ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
      ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.11:2379
      ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
      ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
      ETCD_INITIAL_CLUSTER_STATE: new
      ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
    volumes:
      - /mnt/storagebox/prod/db/etcd-01/data:/bitnami/etcd/data
    networks:
      - iklimco-net
    ports:
      - target: 2379
        published: 2379
        protocol: tcp
        mode: host
      - target: 2380
        published: 2380
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-01
      restart_policy:
        condition: on-failure

  etcd-02:
    image: bitnami/etcd:3
    environment:
      ALLOW_NONE_AUTHENTICATION: "yes"
      ETCD_NAME: etcd-02
      ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.12:2380
      ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
      ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.12:2379
      ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
      ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
      ETCD_INITIAL_CLUSTER_STATE: new
      ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
    volumes:
      - /mnt/storagebox/prod/db/etcd-02/data:/bitnami/etcd/data
    networks:
      - iklimco-net
    ports:
      - target: 2379
        published: 2379
        protocol: tcp
        mode: host
      - target: 2380
        published: 2380
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-02
      restart_policy:
        condition: on-failure

  etcd-03:
    image: bitnami/etcd:3
    environment:
      ALLOW_NONE_AUTHENTICATION: "yes"
      ETCD_NAME: etcd-03
      ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.13:2380
      ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
      ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.13:2379
      ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
      ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
      ETCD_INITIAL_CLUSTER_STATE: new
      ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
    volumes:
      - /mnt/storagebox/prod/db/etcd-03/data:/bitnami/etcd/data
    networks:
      - iklimco-net
    ports:
      - target: 2379
        published: 2379
        protocol: tcp
        mode: host
      - target: 2380
        published: 2380
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-03
      restart_policy:
        condition: on-failure

APISIX etcd usage: In prod, APISIX shares this etcd cluster. Patroni stores its keys under the /db/ namespace (the namespace value in patroni.yml) and APISIX uses the /apisix/ prefix, so there is no collision. APISIX configuration is managed by the config.yaml file in the docker-stack-infra.prod.yml overlay; it connects to http://iklim-db-01:2379,http://iklim-db-02:2379,http://iklim-db-03:2379. The firewall rule that allows port 2379 from the app subnet to the DB nodes is therefore mandatory; it is part of Section 1.

Important: ETCD_INITIAL_CLUSTER_STATE must be new on the first deploy and existing on all later deploys. If the wrong value is left in place, the data directory is reset. The deploy steps in Section 6 below detect this automatically; no manual update is required.
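Because two applications share one etcd cluster, it can help to see how many keys each one owns. The sketch below counts key names per prefix; `count_by_prefix` is a hypothetical helper, and the prefixes in the usage line are assumptions taken from the `namespace: /db/` setting in patroni.yml and the /apisix/ prefix mentioned above.

```shell
# Hypothetical helper: read etcd key names on stdin and print, for each
# prefix given as an argument, how many keys start with it.
count_by_prefix() {
  local keys prefix count
  keys=$(cat)
  for prefix in "$@"; do
    count=$(printf '%s\n' "$keys" | grep -c "^${prefix}" || true)
    printf '%s %s\n' "$prefix" "$count"
  done
}

# Usage on iklim-db-01:
#   docker exec "$(docker ps -q -f name=iklim-etcd_etcd-01)" \
#     etcdctl get "" --prefix --keys-only | count_by_prefix /db/ /apisix/
```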

5.3 Patroni Configuration

A separate patroni.yml file is created for each node. The only differences are the name and connect_address fields.

Node 01 (/mnt/storagebox/prod/db/postgresql-01/config/patroni.yml):

scope: iklim-postgres
namespace: /db/
name: postgresql-01

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.20.20.11:8008

etcd3:
  hosts:
    - 10.20.20.11:2379
    - 10.20.20.12:2379
    - 10.20.20.13:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        wal_keep_size: 512
        max_wal_senders: 5
        max_replication_slots: 5
        shared_preload_libraries: 'pg_stat_statements'
        pg_stat_statements.track: 'all'

  initdb:
    - encoding: UTF8
    - data-checksums

  pg_hba:
    - host replication replicator 10.20.20.0/24 scram-sha-256
    - host all all 10.20.10.0/24 scram-sha-256
    - host all all 10.20.20.0/24 scram-sha-256

  users:
    postgres:
      password: "${POSTGRES_PASSWORD}"
      options:
        - superuser

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.20.20.11:5432
  data_dir: /var/lib/postgresql/data/pgdata
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: "${REPLICATOR_PASSWORD}"
    superuser:
      username: postgres
      password: "${POSTGRES_PASSWORD}"
  parameters:
    unix_socket_directories: "/var/run/postgresql"

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false

Node 02 (/mnt/storagebox/prod/db/postgresql-02/config/patroni.yml):

Same content as Node 01; only the following fields differ:

name: postgresql-02

restapi:
  connect_address: 10.20.20.12:8008

postgresql:
  connect_address: 10.20.20.12:5432
  data_dir: /var/lib/postgresql/data/pgdata

Node 03 (/mnt/storagebox/prod/db/postgresql-03/config/patroni.yml); as with Node 02, only the following fields differ:

name: postgresql-03

restapi:
  connect_address: 10.20.20.13:8008

postgresql:
  connect_address: 10.20.20.13:5432
  data_dir: /var/lib/postgresql/data/pgdata
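The node-specific values follow directly from the node index, which makes them easy to generate or double-check. A minimal sketch, with `patroni_node_fields` as a hypothetical helper name and the 10.20.20.1X addressing taken from the architecture section:

```shell
# Hypothetical helper: print the node-specific patroni.yml values for
# node index 01, 02 or 03.
patroni_node_fields() {
  local idx="$1" ip
  ip="10.20.20.1${idx#0}"   # 01 -> 10.20.20.11, 02 -> 10.20.20.12, 03 -> 10.20.20.13
  echo "name: postgresql-${idx}"
  echo "restapi.connect_address: ${ip}:8008"
  echo "postgresql.connect_address: ${ip}:5432"
}

# Usage:
#   patroni_node_fields 02
```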

5.4 Stack File — Patroni

/opt/iklimco/stacks/prod-db-patroni.yml:

version: "3.8"

networks:
  iklimco-net:
    external: true

services:
  patroni-01:
    image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
    environment:
      DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
      TZ: "Europe/Istanbul"
    volumes:
      - /mnt/storagebox/prod/db/postgresql-01/data:/var/lib/postgresql/data
      - /mnt/storagebox/prod/db/postgresql-01/config/patroni.yml:/etc/patroni/patroni.yml:ro
    networks:
      - iklimco-net
    ports:
      - target: 5432
        published: 5432
        protocol: tcp
        mode: host
      - target: 8008
        published: 8008
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-01
      restart_policy:
        condition: on-failure

  patroni-02:
    image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
    environment:
      DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
      TZ: "Europe/Istanbul"
    volumes:
      - /mnt/storagebox/prod/db/postgresql-02/data:/var/lib/postgresql/data
      - /mnt/storagebox/prod/db/postgresql-02/config/patroni.yml:/etc/patroni/patroni.yml:ro
    networks:
      - iklimco-net
    ports:
      - target: 5432
        published: 5432
        protocol: tcp
        mode: host
      - target: 8008
        published: 8008
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-02
      restart_policy:
        condition: on-failure

  patroni-03:
    image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
    environment:
      DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
      TZ: "Europe/Istanbul"
    volumes:
      - /mnt/storagebox/prod/db/postgresql-03/data:/var/lib/postgresql/data
      - /mnt/storagebox/prod/db/postgresql-03/config/patroni.yml:/etc/patroni/patroni.yml:ro
    networks:
      - iklimco-net
    ports:
      - target: 5432
        published: 5432
        protocol: tcp
        mode: host
      - target: 8008
        published: 8008
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-03
      restart_policy:
        condition: on-failure

5.5 Status Check

# On iklim-db-01 (the patroni-01 and etcd-01 containers run there):
docker exec -it $(docker ps -q -f name=iklim-patroni_patroni-01) \
  patronictl -c /etc/patroni/patroni.yml list

Expected output: one Leader row and two Replica rows, all with the State column set to running.

# etcd cluster health:
docker exec -it $(docker ps -q -f name=iklim-etcd_etcd-01) \
  etcdctl endpoint health \
  --endpoints=http://10.20.20.11:2379,http://10.20.20.12:2379,http://10.20.20.13:2379

# Find the current primary:
docker exec -it $(docker ps -q -f name=iklim-patroni_patroni-01) \
  patronictl -c /etc/patroni/patroni.yml topology

6. Deploy

Order matters: etcd first, then the MongoDB and Patroni stacks.

.env File

The /opt/iklimco/stacks/.env file is stored on StorageBox as prod/secrets/iklim.co/.env.stacks. When it is created the first time, it is filled with strong passwords and uploaded to StorageBox; later deploys fetch it from there:

# On iklim-app-01, once:
scp -P 23 STORAGEBOX_USER@STORAGEBOX_USER.your-storagebox.de:prod/secrets/iklim.co/.env.stacks \
  /opt/iklimco/stacks/.env
chmod 600 /opt/iklimco/stacks/.env

File content (/opt/iklimco/stacks/.env, not committed to the repo):

DATABASE_POSTGRES_ROOT_USER=postgres
POSTGRES_PASSWORD=<strong-password>
REPLICATOR_PASSWORD=<strong-password>
MONGO_ROOT_PASSWORD=<strong-password>
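Before deploying, it is cheap to verify that the fetched file defines all four variables. The following sketch uses a hypothetical helper, `missing_vars`, with the variable names taken from the file content above.

```shell
# Hypothetical helper: print the names of required variables that are
# missing or empty in the given env file.
missing_vars() {
  local file="$1" name
  for name in DATABASE_POSTGRES_ROOT_USER POSTGRES_PASSWORD \
              REPLICATOR_PASSWORD MONGO_ROOT_PASSWORD; do
    grep -Eq "^${name}=.+" "$file" || echo "$name"
  done
}

# Usage:
#   [ -z "$(missing_vars /opt/iklimco/stacks/.env)" ] || echo "incomplete .env"
```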

Deploy Steps

# On iklim-app-01 (Swarm manager):
export $(cat /opt/iklimco/stacks/.env | xargs)

# Automatic ETCD_INITIAL_CLUSTER_STATE detection — 'new' on first deploy, 'existing' afterwards
ETCD_STATE="new"
if docker service ls --filter name=iklim-etcd -q 2>/dev/null | grep -q .; then
  echo " etcd services exist, using 'existing' state..."
  ETCD_STATE="existing"
else
  echo " First deploy, using 'new' state..."
fi
sed -i \
  "s/ETCD_INITIAL_CLUSTER_STATE: new/ETCD_INITIAL_CLUSTER_STATE: ${ETCD_STATE}/g; \
   s/ETCD_INITIAL_CLUSTER_STATE: existing/ETCD_INITIAL_CLUSTER_STATE: ${ETCD_STATE}/g" \
  /opt/iklimco/stacks/prod-db-etcd.yml
echo "✅ ETCD_INITIAL_CLUSTER_STATE=${ETCD_STATE}"

# 1. etcd cluster:
docker stack deploy \
  --compose-file /opt/iklimco/stacks/prod-db-etcd.yml \
  --with-registry-auth \
  iklim-etcd

# Wait for the etcd cluster to be ready.
# The etcd containers run on the DB nodes, so 'docker exec' cannot reach them
# from iklim-app-01; poll each member's HTTP /health endpoint instead
# (assumes curl is installed on iklim-app-01).
echo "⏳ Waiting for etcd..."
for i in $(seq 1 18); do
  healthy=0
  for ip in 10.20.20.11 10.20.20.12 10.20.20.13; do
    curl -fsS --max-time 3 "http://${ip}:2379/health" 2>/dev/null \
      | grep -q '"health":"true"' && healthy=$((healthy + 1))
  done
  if [ "$healthy" -eq 3 ]; then
    echo "✅ etcd ready"
    break
  fi
  [ "$i" -eq 18 ] && echo "❌ etcd timeout" && exit 1
  echo "  attempt $i/18, waiting 10s..."
  sleep 10
done

# 2. MongoDB:
docker stack deploy \
  --compose-file /opt/iklimco/stacks/prod-db-mongo.yml \
  --with-registry-auth \
  iklim-db

# 3. Patroni (PostgreSQL):
docker stack deploy \
  --compose-file /opt/iklimco/stacks/prod-db-patroni.yml \
  --with-registry-auth \
  iklim-patroni

docker stack services iklim-etcd
docker stack services iklim-db
docker stack services iklim-patroni
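The three `docker stack services` listings can be reduced to a pass/fail check on the replica counts. A minimal sketch, assuming the `--format` output shown in the usage comment; `services_converged` is a hypothetical helper name.

```shell
# Hypothetical helper: read "service replicas" lines on stdin, report every
# service that is not at 1/1, and return non-zero if any was found.
services_converged() {
  awk '$2 != "1/1" { print "not ready: " $1; bad = 1 } END { exit bad }'
}

# Usage (on iklim-app-01):
#   for stack in iklim-etcd iklim-db iklim-patroni; do
#     docker stack services --format '{{.Name}} {{.Replicas}}' "$stack" \
#       | services_converged || echo "stack $stack has not converged yet"
#   done
```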

MongoDB Replica Set Initialization

Run once on iklim-db-01 after the MongoDB stack is deployed (the same procedure as in Section 4):

docker exec -it $(docker ps -q -f name=iklim-db_mongodb-01) mongosh \
  -u mongo-root -p "${MONGO_ROOT_PASSWORD}" --authenticationDatabase admin

# Inside mongosh:
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "10.20.20.11:27017", priority: 2 },
    { _id: 1, host: "10.20.20.12:27017", priority: 1 },
    { _id: 2, host: "10.20.20.13:27017", priority: 1 }
  ]
})

7. Access from App Services

App containers connect to DB services through the iklimco-net overlay network by Swarm DNS name. Because the MongoDB stack (iklim-db) and Patroni stack (iklim-patroni) share the iklimco-net external network, service names are resolved through overlay DNS.

MongoDB Replica Set Connection String

Variables in env-prod/.env:

DATABASE_MONGODB_HOST=mongodb-01:27017,mongodb-02:27017,mongodb-03:27017
DATABASE_MONGODB_PARAMS=replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin

Microservice URI through overlay DNS:

mongodb://<user>:<password>@mongodb-01:27017,mongodb-02:27017,mongodb-03:27017/<db>?replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin

For direct testing, from outside the overlay with private IP: mongodb://mongo-root:<PASSWORD>@10.20.20.11:27017,10.20.20.12:27017,10.20.20.13:27017/admin?replicaSet=rs0&authSource=admin
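The microservice URI is simply the two env-prod/.env variables joined around the credentials and database name. The sketch below shows that composition; `mongo_uri` is a hypothetical helper, and the user, password, and database values are placeholders. Passwords containing characters such as `@`, `:` or `/` must be percent-encoded before they are placed in a URI.

```shell
# Hypothetical helper: build the replica set URI from DATABASE_MONGODB_HOST
# and DATABASE_MONGODB_PARAMS. Arguments: user, password, database.
mongo_uri() {
  local user="$1" pass="$2" db="$3"
  echo "mongodb://${user}:${pass}@${DATABASE_MONGODB_HOST}/${db}?${DATABASE_MONGODB_PARAMS}"
}

# Usage (placeholder credentials):
#   DATABASE_MONGODB_HOST=mongodb-01:27017,mongodb-02:27017,mongodb-03:27017
#   DATABASE_MONGODB_PARAMS='replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin'
#   mongo_uri app-user 'app-password' appdb
```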

PostgreSQL — Patroni

Variables in env-prod/.env:

DATABASE_POSTGRES_HOST=patroni-01:5432,patroni-02:5432,patroni-03:5432
DATABASE_POSTGRES_PARAMS=targetServerType=preferSecondary&loadBalanceHosts=true

Patroni decides which node is primary at any moment. The client driver picks the right node from the multi-host list: pgJDBC through the targetServerType parameter, and libpq-based clients (psql, for example) through target_session_attrs:

# Write, goes to primary (pgJDBC URL):
jdbc:postgresql://patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetServerType=primary

# Read, load-balanced across standbys (pgJDBC URL):
jdbc:postgresql://patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetServerType=preferSecondary&loadBalanceHosts=true

# Write, goes to primary (libpq URI):
postgresql://<user>@patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?target_session_attrs=read-write

For direct testing from outside the overlay with private IPs (libpq URI, for example with psql): postgresql://postgres@10.20.20.11:5432,10.20.20.12:5432,10.20.20.13:5432/postgres?target_session_attrs=read-write

With targetServerType=primary (pgJDBC) or target_session_attrs=read-write (libpq), the driver tries the listed hosts until it finds the primary.

Patroni REST API

Patroni exposes an HTTP endpoint on port 8008. This endpoint can be used with HAProxy or a similar load balancer to route to the primary automatically:

# Primary check (HTTP 200 = primary, HTTP 503 = replica):
curl -s http://10.20.20.11:8008/primary
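The same check can be looped over all three nodes to see which one currently holds the primary role. This is a sketch under the status-code convention stated above (200 = primary, 503 = replica); `patroni_role` is a hypothetical helper, and any other code, including curl's 000 for an unreachable host, is reported as unknown.

```shell
# Hypothetical helper: translate the HTTP status of GET /primary into a role.
patroni_role() {
  case "$1" in
    200) echo "primary" ;;
    503) echo "replica" ;;
    *)   echo "unknown" ;;
  esac
}

# Usage (from a host that can reach port 8008 on the DB subnet):
#   for ip in 10.20.20.11 10.20.20.12 10.20.20.13; do
#     code=$(curl -s -o /dev/null -w '%{http_code}' "http://${ip}:8008/primary")
#     echo "${ip} $(patroni_role "$code")"
#   done
```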

8. Developer and Office Access (Production)

The prod cluster does not use pg-proxy or mongo-proxy. Access from an office computer targets the DB subnet directly.

WireGuard Setting

AllowedIPs must be updated in the .conf file on the office computer: AllowedIPs = 10.8.0.1/32, 10.20.20.0/24

Connection Parameters (Multi-Host)

Modern database tools (DBeaver, Compass, etc.) must connect in a cluster-aware way:

| Database | Host List | Port | Critical Parameter |
|---|---|---|---|
| PostgreSQL | 10.20.20.11, 10.20.20.12, 10.20.20.13 | 5432 | targetServerType=primary |
| MongoDB | 10.20.20.11, 10.20.20.12, 10.20.20.13 | 27017 | replicaSet=rs0 |

Acceptance Criteria

  • docker stack services iklim-etcd — three services 1/1
  • docker stack services iklim-db — three MongoDB services 1/1
  • docker stack services iklim-patroni — three Patroni services 1/1
  • In the output of docker service ps iklim-patroni_patroni-01, patroni-02, and patroni-03, every task runs on its iklim-db-* node through the node.hostname placement constraint.
  • In the output of docker service ps iklim-db_mongodb-01, mongodb-02, and mongodb-03, every task runs on an iklim-db-* node.
  • In the output of docker service ps iklim-etcd_etcd-01, etcd-02, and etcd-03, every task runs on an iklim-db-* node.
  • patronictl list — 1 Leader, 2 Replica, all running
  • etcdctl endpoint health — three endpoints healthy
  • rs.status() — 1 PRIMARY, 2 SECONDARY
  • MongoDB and PostgreSQL are reachable from app nodes.
  • Ports 5432, 27017, 2379, 2380, and 8008 are closed from the public internet.
  • When the DB node hosting the current primary is restarted, Patroni performs an automatic election and a new primary is selected.
  • During Patroni primary transition, the old primary rejoins as standby; there is no split-brain.