# 08 - Prod DB Cluster Setup (Swarm)

The purpose of this phase is to add the three DB nodes to Docker Swarm as workers and configure the MongoDB replica set and the PostgreSQL high-availability setup managed with Patroni + etcd.

`07-prod-ansible-bootstrap.md` must be completed on all DB nodes.

## Architecture

```
iklim-app-01/02/03  (Swarm managers, 10.20.10.11/12/13)
        |
        |-- iklimco-net (overlay)
        |
iklim-db-01  (Swarm worker, 10.20.20.11)
  mongodb-01   [rs0 member 0, preferred primary]
  etcd-01      [etcd cluster member]
  patroni-01   [Patroni + PostgreSQL, first primary candidate]

iklim-db-02  (Swarm worker, 10.20.20.12)
  mongodb-02   [rs0 member 1]
  etcd-02      [etcd cluster member]
  patroni-02   [Patroni + PostgreSQL, standby]

iklim-db-03  (Swarm worker, 10.20.20.13)
  mongodb-03   [rs0 member 2]
  etcd-03      [etcd cluster member]
  patroni-03   [Patroni + PostgreSQL, standby]
```

DB containers discover each other through **Hetzner private IPs**, not overlay DNS names. Each service therefore publishes its ports in `host` mode, so replication and etcd traffic go directly over the private network. The Hetzner Cloud firewall and the prod `db` firewall already allow these ports.

## 1. Firewall Update

Verify that the following rules exist in `terraform/hetzner/prod/firewall.tf`; if any are missing, add them and run `terraform apply`.

Inside `hcloud_firewall.swarm`, from the DB subnet to Swarm ports:

```hcl
rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2377"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm control plane from DB subnet"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "7946"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm node discovery (TCP) from DB subnet"
}

rule {
  direction   = "in"
  protocol    = "udp"
  port        = "7946"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm node discovery (UDP) from DB subnet"
}

rule {
  direction   = "in"
  protocol    = "udp"
  port        = "4789"
  source_ips  = [local.db_subnet_cidr]
  description = "Docker Swarm VXLAN overlay from DB subnet"
}
```

Inside `hcloud_firewall.db`, from the app subnet to Swarm ports + overlay, and etcd/Patroni traffic inside the DB subnet:

```hcl
rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2377"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm control plane from app subnet"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "7946"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm node discovery (TCP) from app subnet"
}

rule {
  direction   = "in"
  protocol    = "udp"
  port        = "7946"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm node discovery (UDP) from app subnet"
}

rule {
  direction   = "in"
  protocol    = "udp"
  port        = "4789"
  source_ips  = [local.app_subnet_cidr]
  description = "Docker Swarm VXLAN overlay from app subnet"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2379"
  source_ips  = [local.db_subnet_cidr]
  description = "etcd client port within DB subnet"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2379"
  source_ips  = [local.app_subnet_cidr]
  description = "etcd client port from app subnet (APISIX connects to Patroni etcd)"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "2380"
  source_ips  = [local.db_subnet_cidr]
  description = "etcd peer port within DB subnet"
}

rule {
  direction   = "in"
  protocol    = "tcp"
  port        = "8008"
  source_ips  = [local.db_subnet_cidr]
  description = "Patroni REST API within DB subnet"
}
```

```bash
cd terraform/hetzner/prod
terraform plan
terraform apply
```

## 2. Add DB Nodes to Swarm

**On one of the Swarm managers** (iklim-app-01), get the worker join token:

```bash
docker swarm join-token worker
```

**On each DB node** (iklim-db-01, iklim-db-02, iklim-db-03):

```bash
docker swarm join --token <TOKEN> 10.20.10.11:2377
```

Label the nodes **on iklim-app-01**:

```bash
docker node update --label-add role=db --label-add db-index=01 iklim-db-01
docker node update --label-add role=db --label-add db-index=02 iklim-db-02
docker node update --label-add role=db --label-add db-index=03 iklim-db-03

docker node ls
```

## 3. StorageBox Directory Structure

On each DB node, where `/mnt/storagebox` must already be mounted:

```bash
# On iklim-db-01:
mkdir -p /mnt/storagebox/prod/db/mongodb-01/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-01/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-01/data

# On iklim-db-02:
mkdir -p /mnt/storagebox/prod/db/mongodb-02/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-02/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-02/data

# On iklim-db-03:
mkdir -p /mnt/storagebox/prod/db/mongodb-03/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-03/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-03/data
```

## 4. MongoDB Replica Set

### mongod.conf

On each DB node, `/mnt/storagebox/prod/db/mongodb-0X/config/mongod.conf`:

```yaml
net:
  port: 27017
storage:
  engine: "wiredTiger"
  dbPath: "/data/db"
  directoryPerDB: true
systemLog:
  verbosity: 0
  timeStampFormat: "iso8601-local"
  destination: file
  path: "/data/log/mongo.log"
  logAppend: true
  logRotate: rename
replication:
  replSetName: "rs0"
security:
  authorization: enabled
  keyFile: "/data/configdb/rs-auth.key"
```

### Replica Set Auth Key

The **same** key file must exist on all DB nodes:

```bash
# Create on iklim-db-01:
openssl rand -base64 756 > /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key
chmod 400 /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key

# Copy the same content to the other nodes:
cat /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key \
  > /mnt/storagebox/prod/db/mongodb-02/config/rs-auth.key
cat /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key \
  > /mnt/storagebox/prod/db/mongodb-03/config/rs-auth.key

chmod 400 /mnt/storagebox/prod/db/mongodb-0{2,3}/config/rs-auth.key
```
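
Replica set members authenticate each other with this key, so a byte-for-byte mismatch between nodes keeps them from joining. The following is a minimal sketch for verifying the copies; the helper name `keys_identical` is hypothetical, and it assumes all three StorageBox paths are visible from the host where it runs.

```bash
# keys_identical FILE...: succeeds only if every file has exactly the same
# content as the first one. Hypothetical helper, not part of MongoDB tooling.
keys_identical() {
  first="$1"
  shift
  for f in "$@"; do
    # cmp -s is silent; it returns non-zero on any difference or missing file.
    cmp -s "$first" "$f" || { echo "MISMATCH: $f"; return 1; }
  done
  echo "OK: all key files match"
}

# Usage:
# keys_identical /mnt/storagebox/prod/db/mongodb-0{1,2,3}/config/rs-auth.key
```

Comparing content rather than timestamps or sizes catches truncated copies as well as stale ones.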

### Stack File - MongoDB

`/opt/iklimco/stacks/prod-db-mongo.yml`:

```yaml
version: "3.8"

networks:
  iklimco-net:
    external: true

services:
  mongodb-01:
    image: mongo:8
    environment:
      MONGO_INITDB_ROOT_USERNAME: mongo-root
      MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
    volumes:
      - /mnt/storagebox/prod/db/mongodb-01/data:/data/db
      - /mnt/storagebox/prod/db/mongodb-01/log:/data/log
      - /mnt/storagebox/prod/db/mongodb-01/config:/data/configdb
    networks:
      - iklimco-net
    ports:
      - target: 27017
        published: 27017
        protocol: tcp
        mode: host
    command: ["--config", "/data/configdb/mongod.conf"]
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-01
      restart_policy:
        condition: on-failure

  mongodb-02:
    image: mongo:8
    environment:
      MONGO_INITDB_ROOT_USERNAME: mongo-root
      MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
    volumes:
      - /mnt/storagebox/prod/db/mongodb-02/data:/data/db
      - /mnt/storagebox/prod/db/mongodb-02/log:/data/log
      - /mnt/storagebox/prod/db/mongodb-02/config:/data/configdb
    networks:
      - iklimco-net
    ports:
      - target: 27017
        published: 27017
        protocol: tcp
        mode: host
    command: ["--config", "/data/configdb/mongod.conf"]
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-02
      restart_policy:
        condition: on-failure

  mongodb-03:
    image: mongo:8
    environment:
      MONGO_INITDB_ROOT_USERNAME: mongo-root
      MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
    volumes:
      - /mnt/storagebox/prod/db/mongodb-03/data:/data/db
      - /mnt/storagebox/prod/db/mongodb-03/log:/data/log
      - /mnt/storagebox/prod/db/mongodb-03/config:/data/configdb
    networks:
      - iklimco-net
    ports:
      - target: 27017
        published: 27017
        protocol: tcp
        mode: host
    command: ["--config", "/data/configdb/mongod.conf"]
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-03
      restart_policy:
        condition: on-failure
```

### Replica Set Initialization

Run **once** after the stack is deployed:

```bash
# On iklim-db-01:
docker exec -it $(docker ps -q -f name=iklim-db_mongodb-01) mongosh \
  -u mongo-root -p "${MONGO_ROOT_PASSWORD}" --authenticationDatabase admin

# Inside mongosh:
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "10.20.20.11:27017", priority: 2 },
    { _id: 1, host: "10.20.20.12:27017", priority: 1 },
    { _id: 2, host: "10.20.20.13:27017", priority: 1 }
  ]
})

# Status check:
rs.status()
```

The replica set is ready when `rs.status()` shows one member with `"stateStr": "PRIMARY"` and two members with `"stateStr": "SECONDARY"`.
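
The readiness condition above can also be checked without reading the full `rs.status()` output. This is a minimal sketch; the helper name `rs_ready` is hypothetical, and the expected counts (1 primary, 2 secondaries) are simply those of the three-node layout in this guide.

```bash
# rs_ready: reads one member state per line on stdin (PRIMARY, SECONDARY, ...)
# and succeeds only for exactly 1 PRIMARY and 2 SECONDARY members.
# Hypothetical helper, not part of mongosh.
rs_ready() {
  states=$(cat)
  # grep -c prints 0 but exits non-zero when nothing matches, hence `|| true`.
  primaries=$(printf '%s\n' "$states" | grep -c '^PRIMARY$' || true)
  secondaries=$(printf '%s\n' "$states" | grep -c '^SECONDARY$' || true)
  echo "primary=${primaries} secondary=${secondaries}"
  [ "$primaries" -eq 1 ] && [ "$secondaries" -eq 2 ]
}

# Usage on iklim-db-01 (no -t flag, so the piped output has no carriage returns):
# docker exec $(docker ps -q -f name=iklim-db_mongodb-01) mongosh --quiet \
#   -u mongo-root -p "${MONGO_ROOT_PASSWORD}" --authenticationDatabase admin \
#   --eval 'rs.status().members.forEach(m => print(m.stateStr))' | rs_ready
```

Members that are still syncing report states such as `STARTUP2`, so the check fails until the initial sync has finished.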

## 5. PostgreSQL - Patroni + etcd

Patroni coordinates PostgreSQL primary/standby roles through etcd. If the primary goes down, one of the other nodes automatically wins the election and becomes primary. The Swarm service restarts the failed container, and Patroni continues from where it left off.

### 5.1 Custom Image (Patroni + PostGIS)

Patroni is installed on top of the `postgis/postgis:17-3.5` image. This image is pushed to Harbor and used in the stack.

`Environment_Infrastructure/docker/patroni-postgis/Dockerfile`:

```dockerfile
FROM postgis/postgis:17-3.5

USER root

# Build tools are needed only while pip compiles psycopg2; they are purged in
# the same layer. Patroni does not pull in a PostgreSQL driver by itself, so
# psycopg2 is installed explicitly. --break-system-packages is required because
# the Debian-based image marks the system Python as externally managed.
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3-pip \
        python3-dev \
        gcc \
        libpq-dev \
    && pip3 install --no-cache-dir --break-system-packages 'patroni[etcd3]' psycopg2 \
    && apt-get purge -y gcc python3-dev \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/*

USER postgres

ENTRYPOINT ["patroni", "/etc/patroni/patroni.yml"]
```

Build and push the image with `ops/push-harbor-custom-images.sh`, or run the commands below:

```bash
cd Environment_Infrastructure/docker/patroni-postgis
docker build -t registry.tarla.io/iklimco/patroni-postgis:17-3.5 .
echo "$HARBOR_CI_TOKEN" | docker login registry.tarla.io -u robot-ci-push-iklimco --password-stdin
docker push registry.tarla.io/iklimco/patroni-postgis:17-3.5
```

### 5.2 etcd Cluster

#### Stack File - etcd

`/opt/iklimco/stacks/prod-db-etcd.yml`:

```yaml
version: "3.8"

networks:
  iklimco-net:
    external: true

services:
  etcd-01:
    image: bitnami/etcd:3
    environment:
      ALLOW_NONE_AUTHENTICATION: "yes"
      ETCD_NAME: etcd-01
      ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.11:2380
      ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
      ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.11:2379
      ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
      ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
      ETCD_INITIAL_CLUSTER_STATE: new
      ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
    volumes:
      - /mnt/storagebox/prod/db/etcd-01/data:/bitnami/etcd/data
    networks:
      - iklimco-net
    ports:
      - target: 2379
        published: 2379
        protocol: tcp
        mode: host
      - target: 2380
        published: 2380
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-01
      restart_policy:
        condition: on-failure

  etcd-02:
    image: bitnami/etcd:3
    environment:
      ALLOW_NONE_AUTHENTICATION: "yes"
      ETCD_NAME: etcd-02
      ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.12:2380
      ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
      ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.12:2379
      ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
      ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
      ETCD_INITIAL_CLUSTER_STATE: new
      ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
    volumes:
      - /mnt/storagebox/prod/db/etcd-02/data:/bitnami/etcd/data
    networks:
      - iklimco-net
    ports:
      - target: 2379
        published: 2379
        protocol: tcp
        mode: host
      - target: 2380
        published: 2380
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-02
      restart_policy:
        condition: on-failure

  etcd-03:
    image: bitnami/etcd:3
    environment:
      ALLOW_NONE_AUTHENTICATION: "yes"
      ETCD_NAME: etcd-03
      ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.13:2380
      ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
      ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.13:2379
      ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
      ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
      ETCD_INITIAL_CLUSTER_STATE: new
      ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
    volumes:
      - /mnt/storagebox/prod/db/etcd-03/data:/bitnami/etcd/data
    networks:
      - iklimco-net
    ports:
      - target: 2379
        published: 2379
        protocol: tcp
        mode: host
      - target: 2380
        published: 2380
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-03
      restart_policy:
        condition: on-failure
```

**APISIX etcd usage:** In prod, APISIX shares this etcd cluster under the `/apisix` prefix. Patroni writes under the `/db/` namespace configured in `patroni.yml` (Section 5.3) and APISIX under `/apisix/`, so the keys do not collide. APISIX configuration is managed by the `config.yaml` file in the `docker-stack-infra.prod.yml` overlay; the connection is made to `http://iklim-db-01:2379,http://iklim-db-02:2379,http://iklim-db-03:2379`. The firewall rule that opens port 2379 on the DB nodes to the app subnet is therefore mandatory; it was added in Section 1.

**Important:** `ETCD_INITIAL_CLUSTER_STATE` must be `new` on the first deploy and `existing` on all later deploys. If the wrong value is left in place, the data directory is reset. The deploy steps in Section 6 detect the correct value automatically; no manual update is required.

### 5.3 Patroni Configuration

A separate `patroni.yml` file is created for each node. The only differences are the `name` and `connect_address` fields.

**Node 01** - `/mnt/storagebox/prod/db/postgresql-01/config/patroni.yml`:

```yaml
scope: iklim-postgres
namespace: /db/
name: postgresql-01

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.20.20.11:8008

etcd3:
  hosts:
    - 10.20.20.11:2379
    - 10.20.20.12:2379
    - 10.20.20.13:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        wal_keep_size: 512
        max_wal_senders: 5
        max_replication_slots: 5
        shared_preload_libraries: 'pg_stat_statements'
        pg_stat_statements.track: 'all'

  initdb:
    - encoding: UTF8
    - data-checksums

  pg_hba:
    - host replication replicator 10.20.20.0/24 scram-sha-256
    - host all all 10.20.10.0/24 scram-sha-256
    - host all all 10.20.20.0/24 scram-sha-256

  users:
    postgres:
      password: "${POSTGRES_PASSWORD}"
      options:
        - superuser

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.20.20.11:5432
  data_dir: /var/lib/postgresql/data/pgdata
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: "${REPLICATOR_PASSWORD}"
    superuser:
      username: postgres
      password: "${POSTGRES_PASSWORD}"
  parameters:
    unix_socket_directories: "/var/run/postgresql"

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
```

**Node 02** - `/mnt/storagebox/prod/db/postgresql-02/config/patroni.yml`:

Same content as Node 01; only the following fields differ:

```yaml
name: postgresql-02

restapi:
  connect_address: 10.20.20.12:8008

postgresql:
  connect_address: 10.20.20.12:5432
  data_dir: /var/lib/postgresql/data/pgdata
```

**Node 03** - `/mnt/storagebox/prod/db/postgresql-03/config/patroni.yml`:

```yaml
name: postgresql-03

restapi:
  connect_address: 10.20.20.13:8008

postgresql:
  connect_address: 10.20.20.13:5432
  data_dir: /var/lib/postgresql/data/pgdata
```
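
Since the three files differ only in `name` and the two `connect_address` values, the Node 02 and Node 03 files can be derived from the Node 01 file instead of being edited by hand. The following is a sketch under that assumption; `render_patroni_node` is a hypothetical helper, and it relies on the exact `name: postgresql-01` and `10.20.20.11` values shown above.

```bash
# render_patroni_node N: reads the Node 01 patroni.yml on stdin and prints the
# variant for node N (2 or 3). Hypothetical helper.
# Only the top-level `name` line and the `connect_address` lines are rewritten;
# the etcd3 host list, which also contains 10.20.20.11, is left untouched.
render_patroni_node() {
  n="$1"
  sed \
    -e "s/^name: postgresql-01\$/name: postgresql-0${n}/" \
    -e "/^[[:space:]]*connect_address:/ s/10\.20\.20\.11:/10.20.20.1${n}:/"
}

# Usage (assumes both StorageBox paths are visible from this host):
# render_patroni_node 2 \
#   < /mnt/storagebox/prod/db/postgresql-01/config/patroni.yml \
#   > /mnt/storagebox/prod/db/postgresql-02/config/patroni.yml
```

Restricting the address substitution to `connect_address` lines matters: a blanket replace of `10.20.20.11` would also rewrite the first etcd endpoint and leave each node with a duplicated etcd host.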

### 5.4 Stack File - Patroni

`/opt/iklimco/stacks/prod-db-patroni.yml`:

```yaml
version: "3.8"

networks:
  iklimco-net:
    external: true

services:
  patroni-01:
    image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
    environment:
      DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
      TZ: "Europe/Istanbul"
    volumes:
      - /mnt/storagebox/prod/db/postgresql-01/data:/var/lib/postgresql/data
      - /mnt/storagebox/prod/db/postgresql-01/config/patroni.yml:/etc/patroni/patroni.yml:ro
    networks:
      - iklimco-net
    ports:
      - target: 5432
        published: 5432
        protocol: tcp
        mode: host
      - target: 8008
        published: 8008
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-01
      restart_policy:
        condition: on-failure

  patroni-02:
    image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
    environment:
      DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
      TZ: "Europe/Istanbul"
    volumes:
      - /mnt/storagebox/prod/db/postgresql-02/data:/var/lib/postgresql/data
      - /mnt/storagebox/prod/db/postgresql-02/config/patroni.yml:/etc/patroni/patroni.yml:ro
    networks:
      - iklimco-net
    ports:
      - target: 5432
        published: 5432
        protocol: tcp
        mode: host
      - target: 8008
        published: 8008
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-02
      restart_policy:
        condition: on-failure

  patroni-03:
    image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
    environment:
      DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
      TZ: "Europe/Istanbul"
    volumes:
      - /mnt/storagebox/prod/db/postgresql-03/data:/var/lib/postgresql/data
      - /mnt/storagebox/prod/db/postgresql-03/config/patroni.yml:/etc/patroni/patroni.yml:ro
    networks:
      - iklimco-net
    ports:
      - target: 5432
        published: 5432
        protocol: tcp
        mode: host
      - target: 8008
        published: 8008
        protocol: tcp
        mode: host
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.hostname == iklim-db-03
      restart_policy:
        condition: on-failure
```

### 5.5 Status Check

```bash
# On iklim-db-01 (the patroni-01 container runs only on that node):
docker exec -it $(docker ps -q -f name=iklim-patroni_patroni-01) \
  patronictl -c /etc/patroni/patroni.yml list
```

Expected output: one `Leader` row and two `Replica` rows, all with the `State` column set to `running` (recent Patroni releases may report healthy replicas as `streaming` instead).

```bash
# etcd cluster health (on iklim-db-01):
docker exec -it $(docker ps -q -f name=iklim-etcd_etcd-01) \
  etcdctl endpoint health \
  --endpoints=http://10.20.20.11:2379,http://10.20.20.12:2379,http://10.20.20.13:2379
```

```bash
# Find the current primary (on iklim-db-01):
docker exec -it $(docker ps -q -f name=iklim-patroni_patroni-01) \
  patronictl -c /etc/patroni/patroni.yml topology
```

## 6. Deploy

Order matters: etcd first, then the MongoDB and Patroni stacks.

### .env File

The `/opt/iklimco/stacks/.env` file is stored on StorageBox as `prod/secrets/iklim.co/.env.stacks`. The first time it is created, it is filled with strong passwords and uploaded to StorageBox; later deploys fetch it from there:

```bash
# On iklim-app-01, once:
scp -P 23 STORAGEBOX_USER@STORAGEBOX_USER.your-storagebox.de:prod/secrets/iklim.co/.env.stacks \
  /opt/iklimco/stacks/.env
chmod 600 /opt/iklimco/stacks/.env
```

File content (`/opt/iklimco/stacks/.env`, not committed to the repo):

```env
DATABASE_POSTGRES_ROOT_USER=postgres
POSTGRES_PASSWORD=<strong-password>
REPLICATOR_PASSWORD=<strong-password>
MONGO_ROOT_PASSWORD=<strong-password>
```
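
For the first-time creation, the placeholders can be filled from the system random source. This is only a sketch: `gen_password` and `gen_env_stacks` are hypothetical helpers, and the 32-character alphanumeric format is an assumption rather than a project requirement.

```bash
# gen_password [LENGTH]: prints a random alphanumeric string (default 32 chars).
# Alphanumeric-only values need no quoting in the .env file and no
# percent-encoding in connection URIs.
gen_password() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "${1:-32}"
}

# gen_env_stacks: prints a candidate .env body with the four variables above.
gen_env_stacks() {
  echo "DATABASE_POSTGRES_ROOT_USER=postgres"
  echo "POSTGRES_PASSWORD=$(gen_password)"
  echo "REPLICATOR_PASSWORD=$(gen_password)"
  echo "MONGO_ROOT_PASSWORD=$(gen_password)"
}

# Usage (first-time creation only; upload the result to StorageBox afterwards):
# gen_env_stacks > /opt/iklimco/stacks/.env && chmod 600 /opt/iklimco/stacks/.env
```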

### Deploy Steps

```bash
# On iklim-app-01 (Swarm manager):
export $(cat /opt/iklimco/stacks/.env | xargs)

# Automatic ETCD_INITIAL_CLUSTER_STATE detection: 'new' on first deploy, 'existing' afterwards
ETCD_STATE="new"
if docker service ls --filter name=iklim-etcd -q 2>/dev/null | grep -q .; then
  echo "ℹ️  etcd services exist, using 'existing' state..."
  ETCD_STATE="existing"
else
  echo "ℹ️  First deploy, using 'new' state..."
fi
# Rewrite whichever value is currently in the stack file:
sed -i -E \
  "s/(ETCD_INITIAL_CLUSTER_STATE: )(new|existing)/\1${ETCD_STATE}/g" \
  /opt/iklimco/stacks/prod-db-etcd.yml
echo "✅ ETCD_INITIAL_CLUSTER_STATE=${ETCD_STATE}"

# 1. etcd cluster:
docker stack deploy \
  --compose-file /opt/iklimco/stacks/prod-db-etcd.yml \
  --with-registry-auth \
  iklim-etcd

# Wait for the etcd cluster to be ready. The etcd containers run on the DB
# nodes, so `docker exec` cannot reach them from this manager; poll the etcd
# HTTP /health endpoint over the private network instead (port 2379 is open
# to the app subnet, see Section 1):
echo "⏳ waiting for etcd..."
for i in $(seq 1 18); do
  healthy=0
  for ip in 10.20.20.11 10.20.20.12 10.20.20.13; do
    if curl -fs --max-time 3 "http://${ip}:2379/health" | grep -q '"health":"true"'; then
      healthy=$((healthy + 1))
    fi
  done
  if [ "$healthy" -eq 3 ]; then
    echo "✅ etcd ready"
    break
  fi
  [ "$i" -eq 18 ] && echo "❌ etcd timeout" && exit 1
  echo "   attempt $i/18, waiting 10s..."
  sleep 10
done

# 2. MongoDB:
docker stack deploy \
  --compose-file /opt/iklimco/stacks/prod-db-mongo.yml \
  --with-registry-auth \
  iklim-db

# 3. Patroni (PostgreSQL):
docker stack deploy \
  --compose-file /opt/iklimco/stacks/prod-db-patroni.yml \
  --with-registry-auth \
  iklim-patroni

docker stack services iklim-etcd
docker stack services iklim-db
docker stack services iklim-patroni
```

### MongoDB Replica Set Initialization

Run once after the MongoDB stack is deployed (same procedure as in Section 4):

```bash
# On iklim-db-01:
docker exec -it $(docker ps -q -f name=iklim-db_mongodb-01) mongosh \
  -u mongo-root -p "${MONGO_ROOT_PASSWORD}" --authenticationDatabase admin

# Inside mongosh:
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "10.20.20.11:27017", priority: 2 },
    { _id: 1, host: "10.20.20.12:27017", priority: 1 },
    { _id: 2, host: "10.20.20.13:27017", priority: 1 }
  ]
})
```

## 7. Access from App Services

App containers connect to DB services through the `iklimco-net` overlay network **by Swarm DNS name**. Because the MongoDB stack (`iklim-db`) and Patroni stack (`iklim-patroni`) share the `iklimco-net` external network, service names are resolved through overlay DNS.

### MongoDB Replica Set Connection String

Variables in `env-prod/.env`:

```bash
DATABASE_MONGODB_HOST=mongodb-01:27017,mongodb-02:27017,mongodb-03:27017
DATABASE_MONGODB_PARAMS=replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin
```

Microservice URI through overlay DNS:

```
mongodb://<user>:<password>@mongodb-01:27017,mongodb-02:27017,mongodb-03:27017/<db>?replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin
```
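
When the URI is assembled from the two variables, the user name and password must be percent-encoded if they contain characters such as `@`, `:` or `/`. The following is a minimal sketch of that assembly; `urlencode` and `mongo_uri` are hypothetical helpers, and the encoder handles ASCII input only.

```bash
# urlencode STRING: percent-encodes every character outside the RFC 3986
# unreserved set (letters, digits, '.', '~', '_', '-'). ASCII input only.
urlencode() {
  s="$1"
  out=""
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character itself
    case "$c" in
      [a-zA-Z0-9.~_-]) out="${out}${c}" ;;
      *) out="${out}$(printf '%%%02X' "'$c")" ;;
    esac
    s="$rest"
  done
  printf '%s\n' "$out"
}

# mongo_uri USER PASSWORD DB: builds the replica set URI from
# DATABASE_MONGODB_HOST and DATABASE_MONGODB_PARAMS.
mongo_uri() {
  printf 'mongodb://%s:%s@%s/%s?%s\n' \
    "$(urlencode "$1")" "$(urlencode "$2")" \
    "$DATABASE_MONGODB_HOST" "$3" "$DATABASE_MONGODB_PARAMS"
}

# Usage (values are placeholders):
# mongo_uri app-user 'p@ss:word' appdb
```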

> For direct testing, from outside the overlay with private IP:
> `mongodb://mongo-root:<PASSWORD>@10.20.20.11:27017,10.20.20.12:27017,10.20.20.13:27017/admin?replicaSet=rs0&authSource=admin`

### PostgreSQL - Patroni

Variables in `env-prod/.env`:

```bash
DATABASE_POSTGRES_HOST=patroni-01:5432,patroni-02:5432,patroni-03:5432
DATABASE_POSTGRES_PARAMS=targetServerType=preferSecondary&loadBalanceHosts=true
```

Patroni decides which node is primary at any moment. Given a multi-host list, the client driver selects the primary or a secondary itself. `targetServerType` and `loadBalanceHosts` are PostgreSQL JDBC driver parameters; libpq-based clients such as `psql` do not accept them and use `target_session_attrs` instead:

```
# Write: goes to the primary (JDBC URL):
jdbc:postgresql://patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetServerType=primary

# Read with load balancing (JDBC URL):
jdbc:postgresql://patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetServerType=preferSecondary&loadBalanceHosts=true

# Write: goes to the primary (libpq URI):
postgresql://<user>@patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?target_session_attrs=read-write
```

> For direct testing, from outside the overlay with private IP:
> `postgresql://postgres@10.20.20.11:5432,10.20.20.12:5432,10.20.20.13:5432/postgres?target_session_attrs=read-write`

With `targetServerType=primary` (JDBC) or `target_session_attrs=read-write` (libpq), the driver tries the listed nodes in turn and keeps the first one that accepts writes, so it finds the primary automatically.

### Patroni REST API

Patroni exposes an HTTP endpoint on port 8008. This endpoint can be used with HAProxy or a similar load balancer to route to the primary automatically:

```bash
# Primary check (HTTP 200 = primary, HTTP 503 = replica):
curl -s http://10.20.20.11:8008/primary
```
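
The same status-code convention can be scripted to locate the current primary without `patronictl`. This is a sketch; `patroni_role_from_status` and `find_primary` are hypothetical helpers, and `find_primary` must run from a host that can reach port 8008 on the DB subnet.

```bash
# patroni_role_from_status CODE: maps the HTTP status of GET /primary to a
# role name, following the 200/503 convention described above.
patroni_role_from_status() {
  case "$1" in
    200) echo "primary" ;;
    503) echo "replica" ;;
    *)   echo "unknown" ;;   # e.g. 000 when the node is unreachable
  esac
}

# find_primary: probes each DB node and prints the first IP that answers 200.
find_primary() {
  for ip in 10.20.20.11 10.20.20.12 10.20.20.13; do
    # -w prints only the status code; curl reports 000 on connection failure.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 \
      "http://${ip}:8008/primary" || true)
    if [ "$(patroni_role_from_status "$code")" = "primary" ]; then
      echo "$ip"
      return 0
    fi
  done
  return 1
}

# Usage:
# find_primary || echo "no primary found"
```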

## 8. Developer and Office Access (Production)

The prod cluster does **not** use `pg-proxy` or `mongo-proxy`. Access from an office computer targets the DB subnet directly.

### WireGuard Configuration

Update `AllowedIPs` in the `.conf` file on the office computer:

`AllowedIPs = 10.8.0.1/32, 10.20.20.0/24`

### Connection Parameters (Multi-Host)

Modern database tools (DBeaver, Compass, etc.) should connect in a cluster-aware way:

| Database | Host List | Port | Critical Parameter |
| :--- | :--- | :--- | :--- |
| **PostgreSQL** | `10.20.20.11, 10.20.20.12, 10.20.20.13` | `5432` | `targetServerType=primary` |
| **MongoDB** | `10.20.20.11, 10.20.20.12, 10.20.20.13` | `27017` | `replicaSet=rs0` |

## Acceptance Criteria

- `docker stack services iklim-etcd`: three services at `1/1`
- `docker stack services iklim-db`: three MongoDB services at `1/1`
- `docker stack services iklim-patroni`: three Patroni services at `1/1`
- In the output of `docker service ps iklim-patroni_patroni-01`, `patroni-02`, and `patroni-03`, every task runs on its `iklim-db-*` node, as pinned by the `node.hostname` placement constraint.
- In the output of `docker service ps iklim-db_mongodb-01`, `mongodb-02`, and `mongodb-03`, every task runs on an `iklim-db-*` node.
- In the output of `docker service ps iklim-etcd_etcd-01`, `etcd-02`, and `etcd-03`, every task runs on an `iklim-db-*` node.
- `patronictl list`: 1 `Leader`, 2 `Replica`, all `running` (recent Patroni releases may show replicas as `streaming`)
- `etcdctl endpoint health`: three endpoints `healthy`
- `rs.status()`: 1 PRIMARY, 2 SECONDARY
- MongoDB and PostgreSQL are reachable from app nodes.
- Ports `5432`, `27017`, `2379`, `2380`, and `8008` are closed from the public internet.
- When a DB node is restarted, Patroni performs automatic election and a new primary is selected.
- During Patroni primary transition, the old primary rejoins as standby; there is no split-brain.
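
The three `1/1` criteria can be checked in one pass. A minimal sketch, assuming the `--format` placeholders `.Name` and `.Replicas` of `docker stack services`; the helper name `replicas_ready` is hypothetical.

```bash
# replicas_ready EXPECTED: reads "<service-name> <replicas>" lines on stdin and
# succeeds only if there are exactly EXPECTED lines and every one reports 1/1.
replicas_ready() {
  awk -v want="$1" '
    { total++ }
    $2 == "1/1" { ok++ }
    END { exit !(total == want && ok == want) }
  '
}

# Usage on a Swarm manager:
# for stack in iklim-etcd iklim-db iklim-patroni; do
#   if docker stack services --format '{{.Name}} {{.Replicas}}' "$stack" | replicas_ready 3; then
#     echo "OK        $stack"
#   else
#     echo "NOT READY $stack"
#   fi
# done
```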