08 - Prod DB Cluster Setup (Swarm)
The purpose of this phase is to add the three DB nodes to Docker Swarm as workers and configure the MongoDB replica set and the PostgreSQL high-availability setup managed with Patroni + etcd.
07-prod-ansible-bootstrap.md must be completed on all DB nodes.
Architecture
iklim-app-01/02/03 (Swarm managers, 10.20.10.11/12/13)
|
|-- iklimco-net (overlay)
|
iklim-db-01 (Swarm worker, 10.20.20.11)
mongodb-01 [rs0 member 0 — preferred primary]
etcd-01 [etcd cluster member]
patroni-01 [Patroni + PostgreSQL — first primary candidate]
iklim-db-02 (Swarm worker, 10.20.20.12)
mongodb-02 [rs0 member 1]
etcd-02 [etcd cluster member]
patroni-02 [Patroni + PostgreSQL — standby]
iklim-db-03 (Swarm worker, 10.20.20.13)
mongodb-03 [rs0 member 2]
etcd-03 [etcd cluster member]
patroni-03 [Patroni + PostgreSQL — standby]
DB containers discover each other through overlay DNS aliases (mongodb-01, etcd-01, patroni-01, etc.) on the shared iklimco-net overlay network. Each service publishes its port in host mode so replication traffic goes directly through the Hetzner private network while the overlay DNS resolves service names correctly. All containers are defined in the single docker-stack-db.prod.yml stack file at the repo root.
1. Firewall Update
Verify that the following rules exist in terraform/hetzner/prod/firewall.tf; if any are missing, add them and run terraform apply.
Inside hcloud_firewall.app, from the DB subnet to Swarm ports:
rule {
direction = "in"
protocol = "tcp"
port = "2377"
source_ips = [local.db_subnet_cidr]
description = "Docker Swarm control plane from DB subnet"
}
rule {
direction = "in"
protocol = "tcp"
port = "7946"
source_ips = [local.db_subnet_cidr]
description = "Docker Swarm node discovery (TCP) from DB subnet"
}
rule {
direction = "in"
protocol = "udp"
port = "7946"
source_ips = [local.db_subnet_cidr]
description = "Docker Swarm node discovery (UDP) from DB subnet"
}
rule {
direction = "in"
protocol = "udp"
port = "4789"
source_ips = [local.db_subnet_cidr]
description = "Docker Swarm VXLAN overlay from DB subnet"
}
Inside hcloud_firewall.db, from the app subnet to Swarm ports + overlay, and etcd/Patroni traffic inside the DB subnet:
rule {
direction = "in"
protocol = "tcp"
port = "2377"
source_ips = [local.app_subnet_cidr]
description = "Docker Swarm control plane from app subnet"
}
rule {
direction = "in"
protocol = "tcp"
port = "7946"
source_ips = [local.app_subnet_cidr]
description = "Docker Swarm node discovery (TCP) from app subnet"
}
rule {
direction = "in"
protocol = "udp"
port = "7946"
source_ips = [local.app_subnet_cidr]
description = "Docker Swarm node discovery (UDP) from app subnet"
}
rule {
direction = "in"
protocol = "udp"
port = "4789"
source_ips = [local.app_subnet_cidr]
description = "Docker Swarm VXLAN overlay from app subnet"
}
rule {
direction = "in"
protocol = "tcp"
port = "2379"
source_ips = [local.db_subnet_cidr]
description = "etcd client port within DB subnet"
}
rule {
direction = "in"
protocol = "tcp"
port = "2379"
source_ips = [local.app_subnet_cidr]
description = "etcd client port from app subnet (APISIX connects to Patroni etcd)"
}
rule {
direction = "in"
protocol = "tcp"
port = "2380"
source_ips = [local.db_subnet_cidr]
description = "etcd peer port within DB subnet"
}
rule {
direction = "in"
protocol = "tcp"
port = "8008"
source_ips = [local.db_subnet_cidr]
description = "Patroni REST API within DB subnet"
}
cd terraform/hetzner/prod
terraform plan
terraform apply
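Before running terraform plan, a quick grep can show which of the ports listed above are not yet present in firewall.tf. This is a minimal sketch, not part of the repo: the function name missing_firewall_ports is hypothetical, and it only checks that a port = "N" line exists somewhere in the file, not which firewall resource or source CIDR the rule belongs to.

```shell
# Hypothetical helper: print every port from the argument list that has no
# `port = "<n>"` line in the given Terraform file.
missing_firewall_ports() {
  local file="$1"; shift
  local port
  for port in "$@"; do
    # Tolerate variable spacing around '='; the closing quote prevents
    # 2377 from matching a longer number such as 23770.
    if ! grep -Eq "port[[:space:]]*=[[:space:]]*\"${port}\"" "$file"; then
      echo "$port"
    fi
  done
}

# Usage (path taken from this section):
#   missing_firewall_ports terraform/hetzner/prod/firewall.tf 2377 7946 4789 2379 2380 8008
```

An empty result only means each port appears at least once; the rule's direction, protocol, and source_ips still need a manual review.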
2. Add DB Nodes to Swarm
Get the worker join token on one of the Swarm managers (iklim-app-01):
docker swarm join-token worker
On each DB node (iklim-db-01, iklim-db-02, iklim-db-03):
docker swarm join --token <TOKEN> 10.20.10.11:2377
Label the nodes on iklim-app-01:
docker node update --label-add role=db --label-add db-index=01 iklim-db-01
docker node update --label-add role=db --label-add db-index=02 iklim-db-02
docker node update --label-add role=db --label-add db-index=03 iklim-db-03
docker node ls
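The three label commands follow one pattern, so they can be generated in a loop. A minimal sketch: the function name db_label_commands is an assumption for illustration, and it only prints the commands so they can be reviewed before being executed on iklim-app-01.

```shell
# Hypothetical helper: print the label command for each DB node index.
db_label_commands() {
  local idx
  for idx in 01 02 03; do
    echo "docker node update --label-add role=db --label-add db-index=${idx} iklim-db-${idx}"
  done
}

# Review the output first, then run on iklim-app-01:
#   db_label_commands | sh
```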
3. StorageBox Directory Structure
DB data and logs are stored on local Docker named volumes (performance, WAL/compaction requirements). Only config files are placed on StorageBox. On each DB node, where /mnt/storagebox must already be mounted:
# On iklim-db-01:
mkdir -p /mnt/storagebox/db/mongodb-01/config
mkdir -p /mnt/storagebox/db/postgresql-01/config
# On iklim-db-02:
mkdir -p /mnt/storagebox/db/mongodb-02/config
mkdir -p /mnt/storagebox/db/postgresql-02/config
# On iklim-db-03:
mkdir -p /mnt/storagebox/db/mongodb-03/config
mkdir -p /mnt/storagebox/db/postgresql-03/config
Config files (mongod.conf, patroni.yml) are deployed by the Ansible db_stack role into these directories. Named Docker volumes (mongodb-01-data, etcd-01-data, postgresql-01-data, etc.) are created automatically by the stack deploy.
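The per-node mkdir commands differ only in the node index, so one function can cover all three nodes. This is a sketch under assumptions: create_db_config_dirs is a hypothetical name, and the optional second argument (mount root) exists only so the function can be tried against a scratch directory.

```shell
# Hypothetical helper: create the config directories for one DB node.
# $1 = node index (01, 02, 03); $2 = mount root, defaulting to /mnt/storagebox.
create_db_config_dirs() {
  local idx="$1" root="${2:-/mnt/storagebox}"
  mkdir -p "${root}/db/mongodb-${idx}/config" \
           "${root}/db/postgresql-${idx}/config"
}

# Usage on a DB node, deriving the index from the hostname suffix
# (iklim-db-02 -> 02) and refusing to run if the mount is missing:
#   mountpoint -q /mnt/storagebox && create_db_config_dirs "$(hostname | sed 's/.*-//')"
```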
4. MongoDB Replica Set
mongod.conf
On each DB node, /mnt/storagebox/db/mongodb-0X/config/mongod.conf (deployed by the Ansible db_stack role):
net:
  port: 27017
storage:
  engine: "wiredTiger"
  dbPath: "/data/db"
  directoryPerDB: true
systemLog:
  verbosity: 0
  timeStampFormat: "iso8601-local"
  destination: file
  path: "/data/log/mongo.log"
  logAppend: true
  logRotate: rename
replication:
  replSetName: "rs0"
security:
  authorization: enabled
  keyFile: "/data/configdb/rs-auth.key"
Replica Set Auth Key
The same key file must exist on all DB nodes:
# Create on iklim-db-01:
openssl rand -base64 756 > /mnt/storagebox/db/mongodb-01/config/rs-auth.key
chmod 400 /mnt/storagebox/db/mongodb-01/config/rs-auth.key
# Copy the same content to the other nodes:
cat /mnt/storagebox/db/mongodb-01/config/rs-auth.key \
> /mnt/storagebox/db/mongodb-02/config/rs-auth.key
cat /mnt/storagebox/db/mongodb-01/config/rs-auth.key \
> /mnt/storagebox/db/mongodb-03/config/rs-auth.key
chmod 400 /mnt/storagebox/db/mongodb-0{2,3}/config/rs-auth.key
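Because every replica set member must present the same key, it can help to confirm that the three copies are byte-identical after copying. A minimal sketch, assuming sha256sum is available and the caller can read the files; same_content is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if all given files exist, are readable,
# and have identical content.
same_content() {
  local f distinct
  for f in "$@"; do
    [ -r "$f" ] || { echo "unreadable: $f" >&2; return 1; }
  done
  # sha256sum prints "<hash>  <file>"; keep the hash and count distinct values.
  distinct=$(sha256sum "$@" | awk '{print $1}' | sort -u | wc -l)
  [ "$distinct" -eq 1 ]
}

# Usage:
#   same_content /mnt/storagebox/db/mongodb-01/config/rs-auth.key \
#                /mnt/storagebox/db/mongodb-02/config/rs-auth.key \
#                /mnt/storagebox/db/mongodb-03/config/rs-auth.key && echo "keys match"
```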
Stack File — MongoDB
MongoDB services are defined in docker-stack-db.prod.yml (repo root). Each service uses a named Docker volume for data and log, and a StorageBox bind mount for config:
mongodb-01:
  image: mongo:8.3.2
  volumes:
    - mongodb-01-data:/data/db
    - mongodb-01-log:/data/log
    - /mnt/storagebox/db/mongodb-01/config:/data/configdb
  networks:
    iklimco-net:
      aliases:
        - mongodb-01
  ports:
    - target: 27017
      published: 27017
      protocol: tcp
      mode: host
  deploy:
    replicas: 1
    placement:
      max_replicas_per_node: 1
      constraints:
        - node.hostname == iklim-db-01
Volumes mongodb-01-data, mongodb-01-log, etc. are declared at the bottom of docker-stack-db.prod.yml and are created automatically on first deploy.
Replica Set Initialization
Run once after the stack is deployed:
# On iklim-app-01 (for overlay network access):
docker run --rm -it --network iklimco-net mongo:8.3.2 \
mongosh "mongodb://mongo-root:${DATABASE_MONGODB_ROOT_PASSWD}@mongodb-01/admin"
# Inside mongosh:
rs.initiate({
_id: "rs0",
members: [
{ _id: 0, host: "mongodb-01:27017", priority: 2 },
{ _id: 1, host: "mongodb-02:27017", priority: 1 },
{ _id: 2, host: "mongodb-03:27017", priority: 1 }
]
})
# Status check:
rs.status()
The replica set is ready when "stateStr": "PRIMARY" and two "SECONDARY" entries are visible.
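The readiness condition above (one PRIMARY, two SECONDARY) can be checked non-interactively. The sketch below is an illustration, not the repo's tooling: rs_is_ready is a hypothetical helper that takes a comma-separated list of member states, and the usage comment assumes mongosh prints the result of the --eval expression.

```shell
# Hypothetical helper: succeed only for exactly one PRIMARY and two
# SECONDARY members in a comma-separated list of member states.
rs_is_ready() {
  local states="$1" primaries secondaries
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  primaries=$(printf '%s\n' "$states" | tr ',' '\n' | grep -cx 'PRIMARY' || true)
  secondaries=$(printf '%s\n' "$states" | tr ',' '\n' | grep -cx 'SECONDARY' || true)
  [ "$primaries" -eq 1 ] && [ "$secondaries" -eq 2 ]
}

# Usage from iklim-app-01:
#   states=$(docker run --rm --network iklimco-net mongo:8.3.2 mongosh --quiet \
#     "mongodb://mongo-root:${DATABASE_MONGODB_ROOT_PASSWD}@mongodb-01/admin" \
#     --eval 'rs.status().members.map(m => m.stateStr).join(",")')
#   rs_is_ready "$states" && echo "rs0 ready"
```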
5. PostgreSQL — Patroni + etcd
Patroni coordinates PostgreSQL primary/standby roles through etcd. If the primary goes down, one of the remaining nodes wins the leader election and is promoted to primary automatically. Swarm restarts the failed container, and Patroni brings it back into the cluster as a standby.
5.1 Custom Image (Patroni + PostGIS)
Patroni is installed on top of the postgis/postgis:18-3.6 image. This image is pushed to Harbor and used in the stack.
build/patroni-postgis/Dockerfile:
FROM postgis/postgis:18-3.6
USER root
RUN apt-get update && apt-get install -y --no-install-recommends \
python3-pip \
python3-dev \
gcc \
libpq-dev \
&& pip3 install --no-cache-dir 'patroni[etcd3]' \
&& apt-get purge -y gcc python3-dev \
&& apt-get autoremove -y \
&& rm -rf /var/lib/apt/lists/*
USER postgres
ENTRYPOINT ["patroni", "/etc/patroni/patroni.yml"]
The image is built and pushed with ops/push-harbor-custom-images.sh:
cd /path/to/repo
bash ops/push-harbor-custom-images.sh
Or manually:
cd build/patroni-postgis
docker build -t registry.tarla.io/iklimco/custom-patroni-postgis:18-3.6 .
echo "$HARBOR_CI_TOKEN" | docker login registry.tarla.io -u robot-ci-push-iklimco --password-stdin
docker push registry.tarla.io/iklimco/custom-patroni-postgis:18-3.6
5.2 etcd Cluster
etcd services are defined in docker-stack-db.prod.yml. Each service uses a named Docker volume for data and has an overlay DNS alias. Environment variables reference peer URLs by alias, not by hardcoded IP:
etcd-01:
  image: bitnami/etcd:3
  environment:
    ALLOW_NONE_AUTHENTICATION: "yes"
    ETCD_NAME: etcd-01
    ETCD_INITIAL_ADVERTISE_PEER_URLS: http://etcd-01:2380
    ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
    ETCD_ADVERTISE_CLIENT_URLS: http://etcd-01:2379
    ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
    ETCD_INITIAL_CLUSTER: "etcd-01=http://etcd-01:2380,etcd-02=http://etcd-02:2380,etcd-03=http://etcd-03:2380"
    ETCD_INITIAL_CLUSTER_STATE: new
    ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
  volumes:
    - etcd-01-data:/bitnami/etcd/data
  networks:
    iklimco-net:
      aliases:
        - etcd-01
  deploy:
    replicas: 1
    placement:
      max_replicas_per_node: 1
      constraints:
        - node.hostname == iklim-db-01
APISIX etcd usage: In prod, APISIX shares this etcd cluster under the /apisix prefix. Patroni keys live under the namespace set in patroni.yml (/db/ in Section 5.3), so the two do not collide. The overlay DNS names (etcd-01:2379, etcd-02:2379, etcd-03:2379) are reachable from app nodes via the iklimco-net overlay. Therefore, the app subnet → DB nodes port 2379 firewall rule is mandatory; it was added in Section 1.
Important: ETCD_INITIAL_CLUSTER_STATE must be new on the first deploy and existing on all later deploys. The deploy steps in Section 6 detect this automatically; no manual update is required.
5.3 Patroni Configuration
patroni.yml is generated per-node by the Ansible db_stack role from templates/patroni.yml.j2 using inventory_hostname (e.g., iklim-db-01). The generated file uses overlay DNS aliases for all addresses.
Generated output — Node 01 (/mnt/storagebox/db/postgresql-01/config/patroni.yml):
scope: iklim-postgres
namespace: /db/
name: postgresql-01
restapi:
  listen: 0.0.0.0:8008
  connect_address: patroni-01:8008
etcd3:
  hosts:
    - etcd-01:2379
    - etcd-02:2379
    - etcd-03:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        wal_keep_size: 512
        max_wal_senders: 5
        max_replication_slots: 5
        shared_preload_libraries: 'pg_stat_statements'
        pg_stat_statements.track: 'all'
  initdb:
    - encoding: UTF8
    - data-checksums
  pg_hba:
    - host replication replicator 10.20.20.0/24 scram-sha-256
    - host all all 10.20.10.0/24 scram-sha-256
    - host all all 10.20.20.0/24 scram-sha-256
  users:
    postgres:
      password: "${DATABASE_POSTGRES_ROOT_PASSWD}"
      options:
        - superuser
postgresql:
  listen: 0.0.0.0:5432
  connect_address: patroni-01:5432
  data_dir: /var/lib/postgresql/data/pgdata
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: "${DATABASE_POSTGRES_REPLICATOR_PASSWORD}"
    superuser:
      username: postgres
      password: "${DATABASE_POSTGRES_ROOT_PASSWD}"
  parameters:
    unix_socket_directories: "/var/run/postgresql"
tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
For Node 02 and 03, only name, restapi.connect_address, and postgresql.connect_address differ (postgresql-02/patroni-02:8008/patroni-02:5432, etc.).
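Since only three values vary per node, a generated patroni.yml can be spot-checked against its node index before deploying. A minimal sketch: check_patroni_config is a hypothetical helper, and it matches the values as fixed strings anywhere in the file rather than parsing the YAML.

```shell
# Hypothetical helper: verify that a generated patroni.yml carries the
# per-node values expected for the given index (01, 02, 03).
check_patroni_config() {
  local file="$1" idx="$2" expected
  for expected in \
    "name: postgresql-${idx}" \
    "connect_address: patroni-${idx}:8008" \
    "connect_address: patroni-${idx}:5432"; do
    # -F: fixed-string match, so dots and colons are taken literally.
    grep -qF "$expected" "$file" || { echo "missing: $expected" >&2; return 1; }
  done
}

# Usage on iklim-db-02:
#   check_patroni_config /mnt/storagebox/db/postgresql-02/config/patroni.yml 02
```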
5.4 Stack File — Patroni
Patroni services are defined in docker-stack-db.prod.yml. Each service uses the custom image, a named Docker volume for data, a StorageBox bind mount for the config file, and overlay DNS aliases:
patroni-01:
  image: registry.tarla.io/iklimco/custom-patroni-postgis:18-3.6
  environment:
    DATABASE_POSTGRES_ROOT_PASSWD: "${DATABASE_POSTGRES_ROOT_PASSWD}"
    DATABASE_POSTGRES_REPLICATOR_PASSWORD: "${DATABASE_POSTGRES_REPLICATOR_PASSWORD}"
    TZ: "Europe/Istanbul"
  volumes:
    - postgresql-01-data:/var/lib/postgresql/data
    - /mnt/storagebox/db/postgresql-01/config/patroni.yml:/etc/patroni/patroni.yml:ro
  networks:
    iklimco-net:
      aliases:
        - patroni-01
  ports:
    - target: 5432
      published: 5432
      protocol: tcp
      mode: host
    - target: 8008
      published: 8008
      protocol: tcp
      mode: host
  deploy:
    replicas: 1
    placement:
      max_replicas_per_node: 1
      constraints:
        - node.hostname == iklim-db-01
Volumes postgresql-01-data, postgresql-02-data, postgresql-03-data are declared at the bottom of docker-stack-db.prod.yml and created automatically on first deploy.
5.5 Status Check
# On iklim-db-01 (docker ps only lists local containers, and patroni-01 is pinned to this node), Patroni cluster status:
docker exec -it $(docker ps -q -f name=iklim-db_patroni-01 | head -1) \
patronictl -c /etc/patroni/patroni.yml list
Expected output: one Leader row and two Replica rows. The Leader shows running in the State column; replicas show running or, on recent Patroni versions, streaming.
# etcd cluster health (from app node via overlay):
docker run --rm --network iklimco-net alpine \
sh -c "wget -qO- http://etcd-01:2379/health && \
wget -qO- http://etcd-02:2379/health && \
wget -qO- http://etcd-03:2379/health"
# Find the current primary:
docker exec -it $(docker ps -q -f name=iklim-db_patroni-01 | head -1) \
patronictl -c /etc/patroni/patroni.yml topology
6. Deploy
All DB services (etcd, MongoDB, Patroni) are in the single docker-stack-db.prod.yml stack. Deploy from iklim-app-01 in the repo working directory.
.env File
DB stack password variables (DATABASE_POSTGRES_ROOT_PASSWD, DATABASE_POSTGRES_REPLICATOR_PASSWORD, DATABASE_MONGODB_ROOT_PASSWD) are stored in prod/secrets/iklim.co/.env.secrets.shared on StorageBox. Fetch it to iklim-app-01 before deploy:
scp -P 23 STORAGEBOX_USER@STORAGEBOX_USER.your-storagebox.de:prod/secrets/iklim.co/.env.secrets.shared \
/tmp/.env.secrets.shared
chmod 600 /tmp/.env.secrets.shared
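Before sourcing the fetched file, it is worth confirming that it actually defines the three variables the stack interpolates. The sketch below assumes the file uses plain NAME=value lines (no export prefix); missing_env_vars is a hypothetical helper name.

```shell
# Hypothetical helper: print each variable name that has no NAME= line
# in the given env file.
missing_env_vars() {
  local file="$1" name; shift
  for name in "$@"; do
    grep -q "^${name}=" "$file" || echo "$name"
  done
}

# Usage:
#   missing_env_vars /tmp/.env.secrets.shared \
#     DATABASE_POSTGRES_ROOT_PASSWD \
#     DATABASE_POSTGRES_REPLICATOR_PASSWORD \
#     DATABASE_MONGODB_ROOT_PASSWD
```

No output means all three names are present; the values themselves are not validated.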
Deploy Steps
# On iklim-app-01, in the repo working directory:
set -a; . /tmp/.env.secrets.shared; set +a
# Automatic ETCD_INITIAL_CLUSTER_STATE detection:
DEPLOY_FILE="docker-stack-db.prod.yml"
if docker service ls --filter name=iklim-db_etcd-01 -q 2>/dev/null | grep -q .; then
echo "ℹ️ etcd services already exist, deploying with 'existing' state..."
DEPLOY_FILE=$(mktemp /tmp/docker-stack-db.XXXXXX.yml)
sed "s/ETCD_INITIAL_CLUSTER_STATE: new/ETCD_INITIAL_CLUSTER_STATE: existing/g" \
docker-stack-db.prod.yml > "$DEPLOY_FILE"
else
echo "ℹ️ First deploy, using 'new' state..."
fi
docker stack deploy \
--with-registry-auth \
-c "$DEPLOY_FILE" \
iklim-db
[ "$DEPLOY_FILE" != "docker-stack-db.prod.yml" ] && rm -f "$DEPLOY_FILE"
# Wait for etcd cluster to be ready:
echo "⏳ Waiting for etcd..."
for i in $(seq 1 18); do
if docker run --rm --network iklimco-net alpine \
sh -c "wget -qO- http://etcd-01:2379/health 2>/dev/null | grep -q '\"health\":\"true\"'"; then
echo "✅ etcd ready"
break
fi
[ "$i" -eq 18 ] && echo "❌ etcd timeout" && exit 1
echo " attempt $i/18, waiting 10s..."
sleep 10
done
docker stack services iklim-db
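The etcd wait loop above is a fixed-attempt poll, and the same pattern applies to other readiness checks (Patroni, MongoDB). One way to factor it out is sketched here; retry and etcd_healthy are hypothetical names, not functions from the repo.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between failed attempts. Returns 0 on the first success
# and 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for i in $(seq 1 "$attempts"); do
    "$@" && return 0
    # Skip the sleep after the final attempt.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Usage, equivalent to the 18 x 10s loop above:
#   etcd_healthy() {
#     docker run --rm --network iklimco-net alpine sh -c \
#       "wget -qO- http://etcd-01:2379/health 2>/dev/null | grep -q '\"health\":\"true\"'"
#   }
#   retry 18 10 etcd_healthy || { echo "etcd timeout"; exit 1; }
```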
DB Node Placement Check
docker service ps iklim-db_etcd-01
docker service ps iklim-db_mongodb-01
docker service ps iklim-db_patroni-01
All tasks must run on the expected iklim-db-* nodes.
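The placement rule is mechanical: a service ending in -0N should run on iklim-db-0N. A minimal sketch of checking all nine services at once follows; expected_node is a hypothetical helper, and the usage comment relies on the Name and Node placeholders of docker's --format option.

```shell
# Hypothetical helper: derive the expected node from a service name by
# reusing its numeric suffix, e.g. iklim-db_patroni-02 -> iklim-db-02.
expected_node() {
  echo "iklim-db-${1##*-}"
}

# Usage on a Swarm manager:
#   for svc in $(docker stack services iklim-db --format '{{.Name}}'); do
#     actual=$(docker service ps "$svc" --filter desired-state=running --format '{{.Node}}')
#     [ "$actual" = "$(expected_node "$svc")" ] || echo "MISPLACED: $svc on $actual"
#   done
```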
MongoDB Replica Set Initialization
Run the rs.initiate() procedure from Section 4 (Replica Set Initialization) once after the stack is deployed.
7. Access from App Services
App containers connect to DB services through the iklimco-net overlay network by overlay DNS name. Because the iklim-db stack shares the iklimco-net external network, service names and aliases are resolved through overlay DNS.
MongoDB Replica Set Connection String
Variables in env-prod/.env:
DATABASE_MONGODB_HOST=mongodb-01:27017,mongodb-02:27017,mongodb-03:27017
DATABASE_MONGODB_PARAMS=replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin
Microservice URI through overlay DNS:
mongodb://<user>:<password>@mongodb-01:27017,mongodb-02:27017,mongodb-03:27017/<db>?replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin
For direct testing, from outside the overlay with private IP:
mongodb://mongo-root:<PASSWORD>@10.20.20.11:27017,10.20.20.12:27017,10.20.20.13:27017/admin?replicaSet=rs0&authSource=admin
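To show how the two env variables combine into the microservice URI, here is a small sketch. mongodb_uri is a hypothetical helper; the actual microservices may assemble the URI differently. It assumes DATABASE_MONGODB_HOST and DATABASE_MONGODB_PARAMS are already set and that the password contains no characters that need percent-encoding.

```shell
# Hypothetical helper: assemble a replica set URI from the env-prod/.env
# variables. $1 = user, $2 = password, $3 = database name.
mongodb_uri() {
  local user="$1" password="$2" db="$3"
  echo "mongodb://${user}:${password}@${DATABASE_MONGODB_HOST}/${db}?${DATABASE_MONGODB_PARAMS}"
}

# Usage:
#   DATABASE_MONGODB_HOST=mongodb-01:27017,mongodb-02:27017,mongodb-03:27017
#   DATABASE_MONGODB_PARAMS='replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin'
#   mongodb_uri app_user "$APP_PASSWORD" app_db
```

Passwords containing characters such as @, :, or / must be percent-encoded before being placed in a MongoDB URI.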
PostgreSQL — Patroni
Variables in env-prod/.env:
DATABASE_POSTGRES_HOST=patroni-01:5432,patroni-02:5432,patroni-03:5432
DATABASE_POSTGRES_PARAMS=targetServerType=preferSecondary&loadBalanceHosts=true
Patroni decides which node is primary at any moment. targetServerType and loadBalanceHosts are PostgreSQL JDBC (pgJDBC) driver parameters: the driver uses them to pick the primary or a standby from the multi-host list. libpq-based clients (psql and drivers built on libpq) do not accept these names and use target_session_attrs instead.
# Write: goes to primary (JDBC URL):
jdbc:postgresql://patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetServerType=primary
# Read with load balancing (JDBC URL):
jdbc:postgresql://patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetServerType=preferSecondary&loadBalanceHosts=true
For direct testing with psql (libpq URI), from outside the overlay with private IP:
postgresql://postgres@10.20.20.11:5432,10.20.20.12:5432,10.20.20.13:5432/postgres?target_session_attrs=read-write
Patroni REST API
Patroni exposes an HTTP endpoint on port 8008. This endpoint can be used with HAProxy or a similar load balancer to route to the primary automatically:
# Primary check (HTTP 200 = primary, HTTP 503 = replica):
curl -s http://patroni-01:8008/primary
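The status codes above are enough to locate the current primary without patronictl. A minimal sketch: role_from_status is a hypothetical helper, and the usage loop assumes it runs in a container attached to iklimco-net with curl installed.

```shell
# Hypothetical helper: map the HTTP status of GET /primary to a role label.
role_from_status() {
  case "$1" in
    200) echo "primary" ;;
    503) echo "replica" ;;
    *)   echo "unreachable" ;;
  esac
}

# Usage:
#   for n in 01 02 03; do
#     code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 "http://patroni-${n}:8008/primary")
#     echo "patroni-${n}: $(role_from_status "$code")"
#   done
```

curl reports 000 when the connection fails, which falls through to the unreachable branch.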
8. Developer and Office Access (Production)
pg-proxy and mongo-proxy are not used in the prod cluster setup. Access from an office computer targets the DB subnet directly.
WireGuard Settings
AllowedIPs must be updated in the .conf file on the office computer:
AllowedIPs = 10.8.0.1/32, 10.20.20.0/24
Connection Parameters (Multi-Host)
Modern database tools (DBeaver, Compass, etc.) should connect in a cluster-aware way:
| Database | Host List | Port | Critical Parameter |
|---|---|---|---|
| PostgreSQL | 10.20.20.11, 10.20.20.12, 10.20.20.13 | 5432 | targetServerType=primary |
| MongoDB | 10.20.20.11, 10.20.20.12, 10.20.20.13 | 27017 | replicaSet=rs0 |
Acceptance Criteria
- docker stack services iklim-db: 9 services visible (etcd-01/02/03, mongodb-01/02/03, patroni-01/02/03), all 1/1
- docker service ps iklim-db_patroni-01/02/03: each task runs on its expected iklim-db-* node
- docker service ps iklim-db_mongodb-01/02/03: each task runs on its expected iklim-db-* node
- docker service ps iklim-db_etcd-01/02/03: each task runs on its expected iklim-db-* node
- patronictl list: 1 Leader, 2 Replica, all running
- etcd health endpoint returns "health":"true" on all three nodes via overlay
- rs.status(): 1 PRIMARY, 2 SECONDARY
- MongoDB and PostgreSQL are reachable from app nodes.
- Ports 5432, 27017, 2379, 2380, and 8008 are closed from the public internet.
- When a DB node is restarted, Patroni performs automatic election and a new primary is selected.
- During Patroni primary transition, the old primary rejoins as standby; there is no split-brain.