Environment_Infrastructure/setup/08-prod-db-cluster-kurulum.md
Murat ÖZDEMİR 8780c7c05e docs(db): implement direct cluster access strategy for production
- Updated roadmap (03-infra-stack-changes.md) to deprecate database proxies in prod.
- Detailed direct subnet access via WireGuard for production developers.
- Provided multi-host connection parameters for Patroni and MongoDB Replica Sets in setup guide (08-prod-db-cluster-kurulum.md).
- Added environment comparison table to developer access guide.
2026-05-18 14:25:26 +03:00

# 08 - Prod DB Cluster Setup (Swarm)
The purpose of this phase is to add the three DB nodes to Docker Swarm as workers and configure the MongoDB replica set and the PostgreSQL high-availability setup managed with Patroni + etcd.
`07-prod-ansible-bootstrap.md` must be completed on all DB nodes.
## Architecture
```
iklim-app-01/02/03  (Swarm managers, 10.20.10.11/12/13)
  |
  |-- iklimco-net (overlay)
  |
iklim-db-01  (Swarm worker, 10.20.20.11)
  mongodb-01   [rs0 member 0, preferred primary]
  etcd-01      [etcd cluster member]
  patroni-01   [Patroni + PostgreSQL, first primary candidate]
iklim-db-02  (Swarm worker, 10.20.20.12)
  mongodb-02   [rs0 member 1]
  etcd-02      [etcd cluster member]
  patroni-02   [Patroni + PostgreSQL, standby]
iklim-db-03  (Swarm worker, 10.20.20.13)
  mongodb-03   [rs0 member 2]
  etcd-03      [etcd cluster member]
  patroni-03   [Patroni + PostgreSQL, standby]
```
DB containers discover each other through **Hetzner private IPs**, not overlay DNS names. Therefore, each service publishes its port in `host` mode; replication and etcd traffic goes directly through the private network. The Hetzner Cloud firewall and the prod `db` firewall already allow these ports.
## 1. Firewall Update
Verify that the following rules exist in `terraform/hetzner/prod/firewall.tf`; if any are missing, add them and run `terraform apply`.
Inside `hcloud_firewall.swarm`, from the DB subnet to Swarm ports:
```hcl
rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2377"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm control plane from DB subnet"
}
rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "7946"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm node discovery (TCP) from DB subnet"
}
rule {
  direction   = "in"
  protocol    = "udp"
  port        = "7946"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm node discovery (UDP) from DB subnet"
}
rule {
  direction   = "in"
  protocol    = "udp"
  port        = "4789"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm VXLAN overlay from DB subnet"
}
```
Inside `hcloud_firewall.db`, from the app subnet to Swarm ports + overlay, and etcd/Patroni traffic inside the DB subnet:
```hcl
rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2377"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm control plane from app subnet"
}
rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "7946"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm node discovery (TCP) from app subnet"
}
rule {
  direction   = "in"
  protocol    = "udp"
  port        = "7946"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm node discovery (UDP) from app subnet"
}
rule {
  direction   = "in"
  protocol    = "udp"
  port        = "4789"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm VXLAN overlay from app subnet"
}
rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2379"
  source_ips  = [local.db_subnet_cidr]
  description = "etcd client port within DB subnet"
}
rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2379"
  source_ips  = [local.app_subnet_cidr]
  description = "etcd client port from app subnet (APISIX connects to Patroni etcd)"
}
rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2380"
  source_ips  = [local.db_subnet_cidr]
  description = "etcd peer port within DB subnet"
}
rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "8008"
  source_ips  = [local.db_subnet_cidr]
  description = "Patroni REST API within DB subnet"
}
```
```bash
cd terraform/hetzner/prod
terraform plan
terraform apply
```
## 2. Add DB Nodes to Swarm
Get the worker join token **on one of the Swarm managers** (iklim-app-01):
```bash
docker swarm join-token worker
```
**On each DB node** (iklim-db-01, iklim-db-02, iklim-db-03):
```bash
docker swarm join --token <TOKEN> 10.20.10.11:2377
```
Label the nodes **on iklim-app-01**:
```bash
docker node update --label-add role=db --label-add db-index=01 iklim-db-01
docker node update --label-add role=db --label-add db-index=02 iklim-db-02
docker node update --label-add role=db --label-add db-index=03 iklim-db-03
docker node ls
```
## 3. StorageBox Directory Structure
On each DB node (`/mnt/storagebox` must already be mounted):
```bash
# On iklim-db-01:
mkdir -p /mnt/storagebox/prod/db/mongodb-01/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-01/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-01/data
# On iklim-db-02:
mkdir -p /mnt/storagebox/prod/db/mongodb-02/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-02/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-02/data
# On iklim-db-03:
mkdir -p /mnt/storagebox/prod/db/mongodb-03/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-03/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-03/data
```
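The per-node commands above differ only in the node index, so they can be generated from it. A minimal sketch, assuming a helper named `make_db_dirs` (hypothetical, not part of the repo) that takes the base path and the two-digit node index:

```bash
#!/usr/bin/env bash
# Hypothetical helper: create the MongoDB, PostgreSQL and etcd directories
# for one DB node. BASE is the StorageBox prod DB root, INDEX is 01, 02 or 03.
make_db_dirs() {
  local base="$1" index="$2"
  mkdir -p \
    "${base}/mongodb-${index}/data" \
    "${base}/mongodb-${index}/log" \
    "${base}/mongodb-${index}/config" \
    "${base}/postgresql-${index}/data" \
    "${base}/postgresql-${index}/config" \
    "${base}/etcd-${index}/data"
}
```

On iklim-db-01 this would be called as `make_db_dirs /mnt/storagebox/prod/db 01`; the result is the same tree as the explicit `mkdir -p` lines.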
## 4. MongoDB Replica Set
### mongod.conf
On each DB node, `/mnt/storagebox/prod/db/mongodb-0X/config/mongod.conf`:
```yaml
net:
  port: 27017
storage:
  engine: "wiredTiger"
  dbPath: "/data/db"
  directoryPerDB: true
systemLog:
  verbosity: 0
  timeStampFormat: "iso8601-local"
  destination: file
  path: "/data/log/mongo.log"
  logAppend: true
  logRotate: rename
replication:
  replSetName: "rs0"
security:
  authorization: enabled
  keyFile: "/data/configdb/rs-auth.key"
```
### Replica Set Auth Key
The **same** key file must exist on all DB nodes:
```bash
# Create on iklim-db-01:
openssl rand -base64 756 > /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key
chmod 400 /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key
# Copy the same content to the other nodes:
cat /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key \
> /mnt/storagebox/prod/db/mongodb-02/config/rs-auth.key
cat /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key \
> /mnt/storagebox/prod/db/mongodb-03/config/rs-auth.key
chmod 400 /mnt/storagebox/prod/db/mongodb-0{2,3}/config/rs-auth.key
```
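MongoDB refuses to start when the key file is too permissive or malformed, so it can help to check the copies before deploying. A minimal sketch, assuming GNU `stat` and a helper named `check_keyfile` (hypothetical); the 6 to 1024 character limit is the documented MongoDB key file constraint:

```bash
#!/usr/bin/env bash
# Hypothetical helper: sanity-check a MongoDB replica set key file.
# Accepts only owner-only permissions (400 or 600) and 6 to 1024 base64 characters.
check_keyfile() {
  local file="$1" mode chars
  [ -f "$file" ] || { echo "missing: $file"; return 1; }
  mode="$(stat -c '%a' "$file")"   # GNU stat; prints the octal mode, e.g. 400
  case "$mode" in
    400|600) ;;
    *) echo "bad mode $mode: $file"; return 1 ;;
  esac
  # Count only base64 characters; line breaks are not part of the key.
  chars="$(tr -cd 'A-Za-z0-9+/=' < "$file" | wc -c)"
  if [ "$chars" -lt 6 ] || [ "$chars" -gt 1024 ]; then
    echo "bad length $chars: $file"
    return 1
  fi
  echo "ok: $file"
}
```

Running `check_keyfile` against each of the three `rs-auth.key` paths, followed by `sha256sum` on the same files, would confirm that the copies are valid and identical.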
### Stack File — MongoDB
`/opt/iklimco/stacks/prod-db-mongo.yml`:
```yaml
version: "3.8"
networks:
  iklimco-net:
    external: true
services:
  mongodb-01:
    image: mongo:8
    environment:
      MONGO_INITDB_ROOT_USERNAME: mongo-root
      MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
    volumes:
      - /mnt/storagebox/prod/db/mongodb-01/data:/data/db
      - /mnt/storagebox/prod/db/mongodb-01/log:/data/log
      - /mnt/storagebox/prod/db/mongodb-01/config:/data/configdb
    networks:
      - iklimco-net
    ports:
      - target: 27017
        published: 27017
        protocol: tcp
        mode: host
    command: ["--config", "/data/configdb/mongod.conf"]
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-01
      restart_policy:
        condition: on-failure
  mongodb-02:
    image: mongo:8
    environment:
      MONGO_INITDB_ROOT_USERNAME: mongo-root
      MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
    volumes:
      - /mnt/storagebox/prod/db/mongodb-02/data:/data/db
      - /mnt/storagebox/prod/db/mongodb-02/log:/data/log
      - /mnt/storagebox/prod/db/mongodb-02/config:/data/configdb
    networks:
      - iklimco-net
    ports:
      - target: 27017
        published: 27017
        protocol: tcp
        mode: host
    command: ["--config", "/data/configdb/mongod.conf"]
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-02
      restart_policy:
        condition: on-failure
  mongodb-03:
    image: mongo:8
    environment:
      MONGO_INITDB_ROOT_USERNAME: mongo-root
      MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
    volumes:
      - /mnt/storagebox/prod/db/mongodb-03/data:/data/db
      - /mnt/storagebox/prod/db/mongodb-03/log:/data/log
      - /mnt/storagebox/prod/db/mongodb-03/config:/data/configdb
    networks:
      - iklimco-net
    ports:
      - target: 27017
        published: 27017
        protocol: tcp
        mode: host
    command: ["--config", "/data/configdb/mongod.conf"]
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-03
      restart_policy:
        condition: on-failure
```
### Replica Set Initialization
Run **once** after the stack is deployed:
```bash
# On iklim-db-01:
docker exec -it $(docker ps -q -f name=iklim-db_mongodb-01) mongosh \
  -u mongo-root -p "${MONGO_ROOT_PASSWORD}" --authenticationDatabase admin
# Inside mongosh:
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "10.20.20.11:27017", priority: 2 },
    { _id: 1, host: "10.20.20.12:27017", priority: 1 },
    { _id: 2, host: "10.20.20.13:27017", priority: 1 }
  ]
})
# Status check:
rs.status()
```
The replica set is ready when `rs.status()` lists one member with `stateStr` set to `PRIMARY` and two members set to `SECONDARY`.
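The readiness rule can also be checked mechanically. A minimal sketch, assuming a helper named `rs_ready` (hypothetical) that reads `rs.status()` output on stdin; the pattern is written to accept both JSON-style and mongosh-style output, which is an assumption about how the status is printed:

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only when the rs.status() text on stdin shows
# exactly one PRIMARY and two SECONDARY members.
# Matches both "stateStr": "PRIMARY" (JSON) and stateStr: 'PRIMARY' (mongosh).
rs_ready() {
  local status primaries secondaries
  status="$(cat)"
  primaries="$(printf '%s\n' "$status" \
    | { grep -Eo "stateStr[\"']?: *[\"']PRIMARY" || true; } | wc -l)"
  secondaries="$(printf '%s\n' "$status" \
    | { grep -Eo "stateStr[\"']?: *[\"']SECONDARY" || true; } | wc -l)"
  [ "$primaries" -eq 1 ] && [ "$secondaries" -eq 2 ]
}
```

It could be fed from the same container, for example by appending `--quiet --eval 'rs.status()'` to the `mongosh` command above and piping the result into `rs_ready`.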
## 5. PostgreSQL — Patroni + etcd
Patroni coordinates PostgreSQL primary/standby roles through etcd. If the primary goes down, one of the standbys automatically wins the leader election and is promoted. Swarm restarts the failed container, and Patroni brings that node back into the cluster.
### 5.1 Custom Image (Patroni + PostGIS)
Patroni is installed on top of the `postgis/postgis:17-3.5` image. This image is pushed to Harbor and used in the stack.
`Environment_Infrastructure/docker/patroni-postgis/Dockerfile`:
```dockerfile
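FROM postgis/postgis:17-3.5
USER root
# Build tools are needed only while pip compiles psycopg2; they are purged afterwards.
# psycopg2 is listed explicitly because Patroni does not pull in a PostgreSQL driver by default.
# --break-system-packages is assumed to be required: the Debian base image marks
# its system Python as externally managed (PEP 668).
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3-pip \
        python3-dev \
        gcc \
        libpq-dev \
    && pip3 install --no-cache-dir --break-system-packages 'patroni[etcd3]' psycopg2 \
    && apt-get purge -y gcc python3-dev \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/*
USER postgres
ENTRYPOINT ["patroni", "/etc/patroni/patroni.yml"]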
```
Build and push the image with `ops/push-harbor-custom-images.sh`, or run the commands below manually:
```bash
cd Environment_Infrastructure/docker/patroni-postgis
docker build -t registry.tarla.io/iklimco/patroni-postgis:17-3.5 .
echo "$HARBOR_CI_TOKEN" | docker login registry.tarla.io -u robot-ci-push-iklimco --password-stdin
docker push registry.tarla.io/iklimco/patroni-postgis:17-3.5
```
### 5.2 etcd Cluster
#### Stack File — etcd
`/opt/iklimco/stacks/prod-db-etcd.yml`:
```yaml
version: "3.8"
networks:
  iklimco-net:
    external: true
services:
  etcd-01:
    image: bitnami/etcd:3
    environment:
      ALLOW_NONE_AUTHENTICATION: "yes"
      ETCD_NAME: etcd-01
      ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.11:2380
      ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
      ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.11:2379
      ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
      ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
      ETCD_INITIAL_CLUSTER_STATE: new
      ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
    volumes:
      - /mnt/storagebox/prod/db/etcd-01/data:/bitnami/etcd/data
    networks:
      - iklimco-net
    ports:
      - target: 2379
        published: 2379
        protocol: tcp
        mode: host
      - target: 2380
        published: 2380
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-01
      restart_policy:
        condition: on-failure
  etcd-02:
    image: bitnami/etcd:3
    environment:
      ALLOW_NONE_AUTHENTICATION: "yes"
      ETCD_NAME: etcd-02
      ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.12:2380
      ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
      ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.12:2379
      ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
      ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
      ETCD_INITIAL_CLUSTER_STATE: new
      ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
    volumes:
      - /mnt/storagebox/prod/db/etcd-02/data:/bitnami/etcd/data
    networks:
      - iklimco-net
    ports:
      - target: 2379
        published: 2379
        protocol: tcp
        mode: host
      - target: 2380
        published: 2380
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-02
      restart_policy:
        condition: on-failure
  etcd-03:
    image: bitnami/etcd:3
    environment:
      ALLOW_NONE_AUTHENTICATION: "yes"
      ETCD_NAME: etcd-03
      ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.13:2380
      ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
      ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.13:2379
      ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
      ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
      ETCD_INITIAL_CLUSTER_STATE: new
      ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
    volumes:
      - /mnt/storagebox/prod/db/etcd-03/data:/bitnami/etcd/data
    networks:
      - iklimco-net
    ports:
      - target: 2379
        published: 2379
        protocol: tcp
        mode: host
      - target: 2380
        published: 2380
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-03
      restart_policy:
        condition: on-failure
```
**APISIX etcd usage:** In prod, APISIX shares this etcd cluster. Patroni stores its keys under the `/db/` prefix (the `namespace` value in `patroni.yml`) and APISIX under the `/apisix/` prefix, so there is no collision. APISIX configuration is managed by the `config.yaml` file in the `docker-stack-infra.prod.yml` overlay; the connection is made to `http://iklim-db-01:2379,http://iklim-db-02:2379,http://iklim-db-03:2379`. The firewall rule that opens port 2379 from the app subnet to the DB nodes is therefore mandatory; it is part of Section 1.
**Important:** `ETCD_INITIAL_CLUSTER_STATE` must be `new` on the first deploy and `existing` on all later deploys. If the wrong value is left in place, the data directory is reset. The deploy steps in Section 6 below detect this automatically; no manual update is required.
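All three etcd services in the stack file must carry an identical `ETCD_INITIAL_CLUSTER` value, and a typo in one copy is enough to keep the cluster from forming. A minimal sketch, assuming a helper named `etcd_initial_cluster` (hypothetical) that derives the value from the node IPs in member order:

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the ETCD_INITIAL_CLUSTER value from the node IPs,
# naming members etcd-01, etcd-02, ... in argument order (peer port 2380).
etcd_initial_cluster() {
  local i=1 out="" ip
  for ip in "$@"; do
    out="${out:+${out},}$(printf 'etcd-%02d=http://%s:2380' "$i" "$ip")"
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}
```

Calling `etcd_initial_cluster 10.20.20.11 10.20.20.12 10.20.20.13` prints the string used in the stack file, which makes it easy to compare against each service's environment block.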
### 5.3 Patroni Configuration
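A separate `patroni.yml` file is created for each node. The only differences are the `name` field and the two `connect_address` fields (`restapi` and `postgresql`). Patroni loads this file as plain YAML and does not expand shell-style `${...}` placeholders, so `${POSTGRES_PASSWORD}` and `${REPLICATOR_PASSWORD}` must be replaced with the real values when the file is written to StorageBox.
**Node 01** `/mnt/storagebox/prod/db/postgresql-01/config/patroni.yml`: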
```yaml
scope: iklim-postgres
namespace: /db/
name: postgresql-01
restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.20.20.11:8008
etcd3:
  hosts:
    - 10.20.20.11:2379
    - 10.20.20.12:2379
    - 10.20.20.13:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        wal_keep_size: 512
        max_wal_senders: 5
        max_replication_slots: 5
        shared_preload_libraries: 'pg_stat_statements'
        pg_stat_statements.track: 'all'
  initdb:
    - encoding: UTF8
    - data-checksums
  pg_hba:
    - host replication replicator 10.20.20.0/24 scram-sha-256
    - host all all 10.20.10.0/24 scram-sha-256
    - host all all 10.20.20.0/24 scram-sha-256
  users:
    postgres:
      password: "${POSTGRES_PASSWORD}"
      options:
        - superuser
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.20.20.11:5432
  data_dir: /var/lib/postgresql/data/pgdata
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: "${REPLICATOR_PASSWORD}"
    superuser:
      username: postgres
      password: "${POSTGRES_PASSWORD}"
  parameters:
    unix_socket_directories: "/var/run/postgresql"
tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
```
**Node 02** `/mnt/storagebox/prod/db/postgresql-02/config/patroni.yml`:
Same content as Node 01; only the following fields differ:
```yaml
name: postgresql-02
restapi:
  connect_address: 10.20.20.12:8008
postgresql:
  connect_address: 10.20.20.12:5432
  data_dir: /var/lib/postgresql/data/pgdata
```
**Node 03** `/mnt/storagebox/prod/db/postgresql-03/config/patroni.yml`:
```yaml
name: postgresql-03
restapi:
  connect_address: 10.20.20.13:8008
postgresql:
  connect_address: 10.20.20.13:5432
  data_dir: /var/lib/postgresql/data/pgdata
```
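Because the Node 02 and Node 03 files differ from Node 01 only in the member name and the two `connect_address` values, they can be derived rather than edited by hand. A minimal sketch, assuming a helper named `render_patroni_config` (hypothetical) and the IP scheme `10.20.20.1X` used throughout this guide:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the patroni.yml for node INDEX (02 or 03), derived
# from the Node 01 file. Only the `name` line and the `connect_address` lines are
# rewritten; the etcd3 host list is left untouched.
render_patroni_config() {
  local src="$1" index="$2"
  # 10#... forces base 10 so that an index such as 08 is not read as octal.
  local ip="10.20.20.$((10 + 10#$index))"
  sed -E \
    -e "s/^([[:space:]]*name:[[:space:]]*)postgresql-01\$/\1postgresql-${index}/" \
    -e "s/^([[:space:]]*connect_address:[[:space:]]*)10\.20\.20\.11:/\1${ip}:/" \
    "$src"
}
```

For example, `render_patroni_config /mnt/storagebox/prod/db/postgresql-01/config/patroni.yml 02` writes the Node 02 variant to stdout, which can then be redirected to the Node 02 path.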
### 5.4 Stack File — Patroni
`/opt/iklimco/stacks/prod-db-patroni.yml`:
```yaml
version: "3.8"
networks:
  iklimco-net:
    external: true
services:
  patroni-01:
    image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
    environment:
      DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
      TZ: "Europe/Istanbul"
    volumes:
      - /mnt/storagebox/prod/db/postgresql-01/data:/var/lib/postgresql/data
      - /mnt/storagebox/prod/db/postgresql-01/config/patroni.yml:/etc/patroni/patroni.yml:ro
    networks:
      - iklimco-net
    ports:
      - target: 5432
        published: 5432
        protocol: tcp
        mode: host
      - target: 8008
        published: 8008
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-01
      restart_policy:
        condition: on-failure
  patroni-02:
    image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
    environment:
      DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
      TZ: "Europe/Istanbul"
    volumes:
      - /mnt/storagebox/prod/db/postgresql-02/data:/var/lib/postgresql/data
      - /mnt/storagebox/prod/db/postgresql-02/config/patroni.yml:/etc/patroni/patroni.yml:ro
    networks:
      - iklimco-net
    ports:
      - target: 5432
        published: 5432
        protocol: tcp
        mode: host
      - target: 8008
        published: 8008
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-02
      restart_policy:
        condition: on-failure
  patroni-03:
    image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
    environment:
      DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
      TZ: "Europe/Istanbul"
    volumes:
      - /mnt/storagebox/prod/db/postgresql-03/data:/var/lib/postgresql/data
      - /mnt/storagebox/prod/db/postgresql-03/config/patroni.yml:/etc/patroni/patroni.yml:ro
    networks:
      - iklimco-net
    ports:
      - target: 5432
        published: 5432
        protocol: tcp
        mode: host
      - target: 8008
        published: 8008
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-03
      restart_policy:
        condition: on-failure
```
### 5.5 Status Check
```bash
# On iklim-db-01 (docker ps only lists local containers, and patroni-01 runs there):
docker exec -it $(docker ps -q -f name=iklim-patroni_patroni-01) \
  patronictl -c /etc/patroni/patroni.yml list
```
Expected output: one `Leader` row and two `Replica` rows. The `State` column shows `running` for the leader; replicas show `running` or, on recent Patroni versions, `streaming`.
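The expected output can be verified without reading the table by eye. A minimal sketch, assuming a helper named `patroni_cluster_ok` (hypothetical) and the default `patronictl list` table layout, in which the role and state are the third and fourth columns between `|` separators:

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `patronictl list` table output on stdin and succeed
# when it shows exactly one running Leader and two healthy Replica rows.
# Column positions are an assumption based on the default pretty-printed table.
patroni_cluster_ok() {
  awk -F'|' '
    $4 ~ /Leader/  && $5 ~ /running/           { leaders++ }
    $4 ~ /Replica/ && $5 ~ /running|streaming/ { replicas++ }
    END { exit !(leaders == 1 && replicas == 2) }
  '
}
```

Piping the output of the `patronictl ... list` command above into `patroni_cluster_ok` returns exit status 0 only for a healthy three-node cluster, which makes it usable in scripts.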
```bash
# etcd cluster health (run on iklim-db-01, where the etcd-01 container lives):
docker exec -it $(docker ps -q -f name=iklim-etcd_etcd-01) \
  etcdctl endpoint health \
  --endpoints=http://10.20.20.11:2379,http://10.20.20.12:2379,http://10.20.20.13:2379
```
```bash
# Find the current primary (run on iklim-db-01):
docker exec -it $(docker ps -q -f name=iklim-patroni_patroni-01) \
  patronictl -c /etc/patroni/patroni.yml topology
```
## 6. Deploy
Order matters: etcd first, then the MongoDB and Patroni stacks.
### .env File
The `/opt/iklimco/stacks/.env` file is stored on StorageBox as `prod/secrets/iklim.co/.env.stacks`. When it is created the first time, it is filled with strong passwords and uploaded to StorageBox; later deploys fetch it from there:
```bash
# On iklim-app-01, once:
scp -P 23 STORAGEBOX_USER@STORAGEBOX_USER.your-storagebox.de:prod/secrets/iklim.co/.env.stacks \
/opt/iklimco/stacks/.env
chmod 600 /opt/iklimco/stacks/.env
```
File content (`/opt/iklimco/stacks/.env`, not committed to the repo):
```env
DATABASE_POSTGRES_ROOT_USER=postgres
POSTGRES_PASSWORD=<strong-password>
REPLICATOR_PASSWORD=<strong-password>
MONGO_ROOT_PASSWORD=<strong-password>
```
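A stack deployed with an empty or placeholder password fails late and in confusing ways, so the file is worth checking before the first deploy. A minimal sketch, assuming a helper named `check_stack_env` (hypothetical) and the four variable names listed above:

```bash
#!/usr/bin/env bash
# Hypothetical helper: verify that an env file defines every variable the
# stacks expect, with a non-empty value that is not a <placeholder>.
check_stack_env() {
  local file="$1" key value rc=0
  for key in DATABASE_POSTGRES_ROOT_USER POSTGRES_PASSWORD \
             REPLICATOR_PASSWORD MONGO_ROOT_PASSWORD; do
    # Text after the first '=' on the last line that defines the key.
    value="$(sed -n "s/^${key}=//p" "$file" | tail -n 1)"
    case "$value" in
      ""|"<"*">") echo "missing or placeholder: ${key}"; rc=1 ;;
    esac
  done
  return "$rc"
}
```

Running `check_stack_env /opt/iklimco/stacks/.env` prints one line per problem and returns a non-zero status if anything is missing.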
### Deploy Steps
```bash
# On iklim-app-01 (Swarm manager):
export $(cat /opt/iklimco/stacks/.env | xargs)
# Automatic ETCD_INITIAL_CLUSTER_STATE detection: 'new' on first deploy, 'existing' afterwards
ETCD_STATE="new"
if docker service ls --filter name=iklim-etcd -q 2>/dev/null | grep -q .; then
  echo "  etcd services exist, using 'existing' state..."
  ETCD_STATE="existing"
else
  echo "  First deploy, using 'new' state..."
fi
# Rewrite whichever value is currently in the stack file:
sed -i -E \
  "s/(ETCD_INITIAL_CLUSTER_STATE:) (new|existing)/\1 ${ETCD_STATE}/" \
  /opt/iklimco/stacks/prod-db-etcd.yml
echo "✅ ETCD_INITIAL_CLUSTER_STATE=${ETCD_STATE}"
# 1. etcd cluster:
docker stack deploy \
  --compose-file /opt/iklimco/stacks/prod-db-etcd.yml \
  --with-registry-auth \
  iklim-etcd
# Wait for the etcd cluster to be ready.
# The etcd containers run on the DB nodes, so `docker exec` cannot reach them from
# iklim-app-01; each member's /health endpoint is probed over the private network instead
# (assumes curl is installed on the manager).
echo "⏳ Waiting for etcd..."
for i in $(seq 1 18); do
  healthy=0
  for ip in 10.20.20.11 10.20.20.12 10.20.20.13; do
    curl -fs --max-time 3 "http://${ip}:2379/health" 2>/dev/null \
      | grep -q '"health":"true"' && healthy=$((healthy + 1))
  done
  if [ "$healthy" -eq 3 ]; then
    echo "✅ etcd ready"
    break
  fi
  [ "$i" -eq 18 ] && echo "❌ etcd timeout" && exit 1
  echo "  attempt $i/18, waiting 10s..."
  sleep 10
done
# 2. MongoDB:
docker stack deploy \
  --compose-file /opt/iklimco/stacks/prod-db-mongo.yml \
  --with-registry-auth \
  iklim-db
# 3. Patroni (PostgreSQL):
docker stack deploy \
  --compose-file /opt/iklimco/stacks/prod-db-patroni.yml \
  --with-registry-auth \
  iklim-patroni
docker stack services iklim-etcd
docker stack services iklim-db
docker stack services iklim-patroni
```
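The wait loop in the deploy steps follows a generic retry pattern that is also useful for the MongoDB and Patroni checks. A minimal sketch, assuming a helper named `retry` (hypothetical) that takes an attempt count, a delay in seconds, and the command to run:

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping DELAY seconds
# between tries. Returns 0 on the first success and 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for i in $(seq 1 "$attempts"); do
    "$@" && return 0
    # Sleep only when another attempt will follow.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done
  return 1
}
```

With it, a three-minute wait for one etcd member could be written as `retry 18 10 curl -fs http://10.20.20.11:2379/health`, keeping the timeout policy in one place.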
### MongoDB Replica Set Initialization
Run the `rs.initiate(...)` command from Section 4 (Replica Set Initialization) once on iklim-db-01 after the MongoDB stack is deployed, then confirm the result with `rs.status()`.
## 7. Access from App Services
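App containers reach the DB services over the `iklimco-net` overlay network **by Swarm DNS name**. The MongoDB stack (`iklim-db`) and the Patroni stack (`iklim-patroni`) both attach to the external `iklimco-net` network, so their service names resolve through overlay DNS. For MongoDB the service names act only as a seed list: after discovery, drivers connect to the member addresses registered in `rs.initiate` (`10.20.20.x:27017`), so app nodes must also be able to reach those private IPs.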
### MongoDB Replica Set Connection String
Variables in `env-prod/.env`:
```bash
DATABASE_MONGODB_HOST=mongodb-01:27017,mongodb-02:27017,mongodb-03:27017
DATABASE_MONGODB_PARAMS=replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin
```
Microservice URI through overlay DNS:
```
mongodb://<user>:<password>@mongodb-01:27017,mongodb-02:27017,mongodb-03:27017/<db>?replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin
```
> For direct testing, from outside the overlay with private IP:
> `mongodb://mongo-root:<PASSWORD>@10.20.20.11:27017,10.20.20.12:27017,10.20.20.13:27017/admin?replicaSet=rs0&authSource=admin`
### PostgreSQL — Patroni
Variables in `env-prod/.env`:
```bash
DATABASE_POSTGRES_HOST=patroni-01:5432,patroni-02:5432,patroni-03:5432
DATABASE_POSTGRES_PARAMS=targetServerType=preferSecondary&loadBalanceHosts=true
```
Patroni decides which node is primary at any moment. Given a multi-host list, the PostgreSQL JDBC driver (pgJDBC) selects a suitable node through the `targetServerType` parameter. libpq-based clients such as `psql` do not recognize that parameter and use `target_session_attrs` instead:
```
# Write: always goes to the primary (pgJDBC URL):
jdbc:postgresql://patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetServerType=primary
# Read, load-balanced with a preference for standbys (pgJDBC URL):
jdbc:postgresql://patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetServerType=preferSecondary&loadBalanceHosts=true
# Write through a libpq client (libpq URI):
postgresql://<user>@patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?target_session_attrs=read-write
```
> For direct testing, from outside the overlay with private IP (libpq URI):
> `postgresql://postgres@10.20.20.11:5432,10.20.20.12:5432,10.20.20.13:5432/postgres?target_session_attrs=read-write`
The driver tries the listed nodes in turn and keeps the first connection that matches the requested server type, so clients find the primary without a proxy.
### Patroni REST API
Patroni exposes an HTTP endpoint on port 8008. This endpoint can be used with HAProxy or a similar load balancer to route to the primary automatically:
```bash
# Primary check (HTTP 200 = primary, HTTP 503 = replica); prints only the status code:
curl -s -o /dev/null -w '%{http_code}\n' http://10.20.20.11:8008/primary
```
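The status-code convention of the `/primary` endpoint can be wrapped in a small function for use in scripts. A minimal sketch, assuming a helper named `patroni_role` (hypothetical) that receives the HTTP status code as its argument:

```bash
#!/usr/bin/env bash
# Hypothetical helper: translate the HTTP status of GET /primary into a role name.
# Any other code (for example 000 when the node is unreachable) is reported as
# unknown with a non-zero exit status.
patroni_role() {
  case "$1" in
    200) echo "primary" ;;
    503) echo "replica" ;;
    *)   echo "unknown"; return 1 ;;
  esac
}
```

A typical call would be `patroni_role "$(curl -s -o /dev/null -w '%{http_code}' http://10.20.20.11:8008/primary)"`, repeated for each of the three node IPs. Per the firewall rules in Section 1, port 8008 is reachable only from inside the DB subnet.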
## 8. Developer and Office Access (Production)
In the prod cluster setup, `pg-proxy` and `mongo-proxy` are **not used**. Access from an office machine targets the DB subnet directly.
### WireGuard Configuration
Update `AllowedIPs` in the `.conf` file on the office machine:
`AllowedIPs = 10.8.0.1/32, 10.20.20.0/24`
### Connection Parameters (Multi-Host)
Modern database tools (DBeaver, Compass, etc.) should connect in a cluster-aware way:
| Database | Host List | Port | Key Parameter |
| :--- | :--- | :--- | :--- |
| **PostgreSQL** | `10.20.20.11, 10.20.20.12, 10.20.20.13` | `5432` | `targetServerType=primary` |
| **MongoDB** | `10.20.20.11, 10.20.20.12, 10.20.20.13` | `27017` | `replicaSet=rs0` |
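The table rows translate directly into multi-host connection strings. A minimal sketch, assuming helpers named `join_hosts`, `mongo_uri`, and `jdbc_primary_url` (all hypothetical); credentials are deliberately left out and would be supplied by the client tool:

```bash
#!/usr/bin/env bash
# Hypothetical helpers that build multi-host connection strings from node IPs.

# join_hosts PORT HOST... -> host1:PORT,host2:PORT,...
join_hosts() {
  local port="$1" out="" h
  shift
  for h in "$@"; do
    out="${out:+${out},}${h}:${port}"
  done
  printf '%s\n' "$out"
}

# mongo_uri HOST... -> MongoDB URI for the rs0 replica set, admin database.
mongo_uri() {
  printf 'mongodb://%s/admin?replicaSet=rs0&authSource=admin\n' \
    "$(join_hosts 27017 "$@")"
}

# jdbc_primary_url DB HOST... -> pgJDBC URL that always selects the primary.
jdbc_primary_url() {
  local db="$1"
  shift
  printf 'jdbc:postgresql://%s/%s?targetServerType=primary\n' \
    "$(join_hosts 5432 "$@")" "$db"
}
```

For example, `jdbc_primary_url postgres 10.20.20.11 10.20.20.12 10.20.20.13` prints a URL that can be pasted into DBeaver, and `mongo_uri` with the same IPs gives the Compass equivalent.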
## Acceptance Criteria
- `docker stack services iklim-etcd` — three services `1/1`
- `docker stack services iklim-db` — three MongoDB services `1/1`
- `docker stack services iklim-patroni` — three Patroni services `1/1`
- In the output of `docker service ps iklim-patroni_patroni-01`, `patroni-02`, and `patroni-03`, every task runs on its matching `iklim-db-*` node, as enforced by the `node.hostname` placement constraint.
- In the output of `docker service ps iklim-db_mongodb-01`, `mongodb-02`, and `mongodb-03`, every task runs on an `iklim-db-*` node.
- In the output of `docker service ps iklim-etcd_etcd-01`, `etcd-02`, and `etcd-03`, every task runs on an `iklim-db-*` node.
- `patronictl list` — 1 `Leader`, 2 `Replica`, all `running`
- `etcdctl endpoint health` — three endpoints `healthy`
- `rs.status()` — 1 PRIMARY, 2 SECONDARY
- MongoDB and PostgreSQL are reachable from app nodes.
- Ports `5432`, `27017`, `2379`, `2380`, and `8008` are closed from the public internet.
- When the DB node that hosts the current primary is restarted, Patroni runs an automatic election and a new primary is selected.
- During Patroni primary transition, the old primary rejoins as standby; there is no split-brain.