- Anglicized setup and facts markdown file names for better consistency. - Updated 01-swarm-init-multinode.md to highlight Ansible automation of Swarm initialization and labeling. - Overhauled 03-infra-stack-changes.md to describe the single monolithic file strategy and reflect current Redis, RabbitMQ, and etcd cluster configurations. - Fixed minor overrides and typos in Patroni templates and Ansible bootstrap documents. - Restructured README and roadmap mapping to align with the renamed setup documents.
685 lines
22 KiB
Markdown
685 lines
22 KiB
Markdown
# 08 - Prod DB Cluster Setup (Swarm)
|
||
|
||
The purpose of this phase is to add the three DB nodes to Docker Swarm as workers and configure the MongoDB replica set and the PostgreSQL high-availability setup managed with Patroni + etcd.
|
||
|
||
`07-prod-ansible-bootstrap.md` must be completed on all DB nodes.
|
||
|
||
## Architecture
|
||
|
||
```
|
||
iklim-app-01/02/03 (Swarm manager'lar, 10.20.10.11/12/13)
|
||
|
|
||
|-- iklimco-net (overlay)
|
||
|
|
||
iklim-db-01 (Swarm worker, 10.20.20.11)
|
||
mongodb-01 [rs0 member 0 — preferred primary]
|
||
etcd-01 [etcd cluster member]
|
||
patroni-01 [Patroni + PostgreSQL — first primary candidate]
|
||
|
||
iklim-db-02 (Swarm worker, 10.20.20.12)
|
||
mongodb-02 [rs0 member 1]
|
||
etcd-02 [etcd cluster member]
|
||
patroni-02 [Patroni + PostgreSQL — standby]
|
||
|
||
iklim-db-03 (Swarm worker, 10.20.20.13)
|
||
mongodb-03 [rs0 member 2]
|
||
etcd-03 [etcd cluster member]
|
||
patroni-03 [Patroni + PostgreSQL — standby]
|
||
```
|
||
|
||
DB containers discover each other through **overlay DNS aliases** (`mongodb-01`, `etcd-01`, `patroni-01`, etc.) on the shared `iklimco-net` overlay network. Patroni/PostgreSQL, MongoDB, and etcd are the DB/cluster services covered by this document; they publish their cluster ports in `host` mode so replication traffic goes directly through the Hetzner private network while overlay DNS resolves service names correctly.
|
||
|
||
The current prod DB services are defined in the root `docker-stack-infra_db-prod.yml` stack file. That stack also contains non-DB infrastructure services such as Redis, Redis Sentinel, and RabbitMQ. Those services are intentionally different: they run on `node.labels.type == service` app/service nodes, do not publish host-mode ports in this stack, and communicate through the `iklimco-net` overlay network only. Do not generalize the DB host-mode rule to Redis or RabbitMQ.
|
||
|
||
## 1. Firewall Update
|
||
|
||
Verify that the following rules exist in `terraform/hetzner/prod/firewall.tf`; if any are missing, add them and run `terraform apply`.
|
||
|
||
Inside `hcloud_firewall.app`, from the DB subnet to Swarm ports:
|
||
|
||
```hcl
|
||
rule {
|
||
direction = "in"
|
||
protocol = "tcp"
|
||
port = "2377"
|
||
source_ips = [local.db_subnet_cidr]
|
||
description = "Docker Swarm control plane from DB subnet"
|
||
}
|
||
|
||
rule {
|
||
direction = "in"
|
||
protocol = "tcp"
|
||
port = "7946"
|
||
source_ips = [local.db_subnet_cidr]
|
||
description = "Docker Swarm node discovery (TCP) from DB subnet"
|
||
}
|
||
|
||
rule {
|
||
direction = "in"
|
||
protocol = "udp"
|
||
port = "7946"
|
||
source_ips = [local.db_subnet_cidr]
|
||
description = "Docker Swarm node discovery (UDP) from DB subnet"
|
||
}
|
||
|
||
rule {
|
||
direction = "in"
|
||
protocol = "udp"
|
||
port = "4789"
|
||
source_ips = [local.db_subnet_cidr]
|
||
description = "Docker Swarm VXLAN overlay from DB subnet"
|
||
}
|
||
```
|
||
|
||
Inside `hcloud_firewall.db`, from the app subnet to Swarm ports + overlay, and etcd/Patroni traffic inside the DB subnet:
|
||
|
||
```hcl
|
||
rule {
|
||
direction = "in"
|
||
protocol = "tcp"
|
||
port = "2377"
|
||
source_ips = [local.app_subnet_cidr]
|
||
description = "Docker Swarm control plane from app subnet"
|
||
}
|
||
|
||
rule {
|
||
direction = "in"
|
||
protocol = "tcp"
|
||
port = "7946"
|
||
source_ips = [local.app_subnet_cidr]
|
||
description = "Docker Swarm node discovery (TCP) from app subnet"
|
||
}
|
||
|
||
rule {
|
||
direction = "in"
|
||
protocol = "udp"
|
||
port = "7946"
|
||
source_ips = [local.app_subnet_cidr]
|
||
description = "Docker Swarm node discovery (UDP) from app subnet"
|
||
}
|
||
|
||
rule {
|
||
direction = "in"
|
||
protocol = "udp"
|
||
port = "4789"
|
||
source_ips = [local.app_subnet_cidr]
|
||
description = "Docker Swarm VXLAN overlay from app subnet"
|
||
}
|
||
|
||
rule {
|
||
direction = "in"
|
||
protocol = "tcp"
|
||
port = "2379"
|
||
source_ips = [local.db_subnet_cidr]
|
||
description = "etcd client port within DB subnet"
|
||
}
|
||
|
||
rule {
|
||
direction = "in"
|
||
protocol = "tcp"
|
||
port = "2379"
|
||
source_ips = [local.app_subnet_cidr]
|
||
description = "etcd client port from app subnet (APISIX connects to Patroni etcd)"
|
||
}
|
||
|
||
rule {
|
||
direction = "in"
|
||
protocol = "tcp"
|
||
port = "2380"
|
||
source_ips = [local.db_subnet_cidr]
|
||
description = "etcd peer port within DB subnet"
|
||
}
|
||
|
||
rule {
|
||
direction = "in"
|
||
protocol = "tcp"
|
||
port = "8008"
|
||
source_ips = [local.db_subnet_cidr]
|
||
description = "Patroni REST API within DB subnet"
|
||
}
|
||
```
|
||
|
||
```bash
|
||
cd terraform/hetzner/prod
|
||
terraform plan
|
||
terraform apply
|
||
```
|
||
|
||
## 2. Add DB Nodes to Swarm
|
||
|
||
This is handled by `Environment_Infrastructure/ansible/prod/prod-bootstrap.yml` through the `swarm` role. The role initializes Swarm on `iklim-app-01`, joins `iklim-app-02/03` as managers, joins `iklim-db-01/02/03` as workers, and labels DB nodes.
|
||
|
||
Manual equivalent, kept for troubleshooting only:
|
||
|
||
**Swarm manager'lardan birinde** (iklim-app-01) join token al:
|
||
|
||
```bash
|
||
docker swarm join-token worker
|
||
```
|
||
|
||
**Her DB node'unda** (iklim-db-01, iklim-db-02, iklim-db-03):
|
||
|
||
```bash
|
||
docker swarm join --token <TOKEN> 10.20.10.11:2377
|
||
```
|
||
|
||
Label the nodes **on iklim-app-01**. In automation this is split into two phases:
|
||
|
||
- the shared `swarm` role adds `role=db` to DB nodes;
|
||
- the prod-specific `prod-bootstrap.yml` play adds `db-index=01/02/03`.
|
||
|
||
Manual equivalent:
|
||
|
||
```bash
|
||
docker node update --label-add role=db iklim-db-01
|
||
docker node update --label-add role=db iklim-db-02
|
||
docker node update --label-add role=db iklim-db-03
|
||
|
||
docker node update --label-add db-index=01 iklim-db-01
|
||
docker node update --label-add db-index=02 iklim-db-02
|
||
docker node update --label-add db-index=03 iklim-db-03
|
||
|
||
docker node ls
|
||
```
|
||
|
||
## 3. StorageBox Directory Structure
|
||
|
||
DB data is stored on local DB-node paths prepared by Ansible:
|
||
|
||
```text
|
||
/opt/iklimco/db/mongodb
|
||
/opt/iklimco/db/postgresql
|
||
```
|
||
|
||
Configuration files are placed on StorageBox. On each DB node, where `/mnt/storagebox` must already be mounted:
|
||
|
||
```bash
|
||
# On iklim-db-01:
|
||
mkdir -p /mnt/storagebox/db/mongodb-01/config
|
||
mkdir -p /mnt/storagebox/db/postgresql-01/config
|
||
|
||
# On iklim-db-02:
|
||
mkdir -p /mnt/storagebox/db/mongodb-02/config
|
||
mkdir -p /mnt/storagebox/db/postgresql-02/config
|
||
|
||
# On iklim-db-03:
|
||
mkdir -p /mnt/storagebox/db/mongodb-03/config
|
||
mkdir -p /mnt/storagebox/db/postgresql-03/config
|
||
```
|
||
|
||
Config files (`mongod.conf`, `patroni.yml`) and the MongoDB replica set key are deployed by the Ansible `db_stack` role into these directories. etcd uses Docker named volumes (`etcd-01-data`, `etcd-02-data`, `etcd-03-data`) from `docker-stack-infra_db-prod.yml`.
|
||
|
||
## 4. MongoDB Replica Set
|
||
|
||
### mongod.conf
|
||
|
||
Her DB node'unda `/mnt/storagebox/db/mongodb-0X/config/mongod.conf` (Ansible `db_stack` rolü tarafından deploy edilir):
|
||
|
||
```yaml
|
||
net:
|
||
port: 27017
|
||
storage:
|
||
engine: "wiredTiger"
|
||
dbPath: "/data/db"
|
||
directoryPerDB: true
|
||
systemLog:
|
||
verbosity: 0
|
||
timeStampFormat: "iso8601-local"
|
||
destination: file
|
||
path: "/data/log/mongo.log"
|
||
logAppend: true
|
||
logRotate: rename
|
||
replication:
|
||
replSetName: "rs0"
|
||
security:
|
||
authorization: enabled
|
||
keyFile: "/data/configdb/rs-auth.key"
|
||
```
|
||
|
||
### Replica Set Auth Key
|
||
|
||
The **same** key file must exist on all DB nodes. In the current production setup, this is automated by `ansible/prod/roles/db_stack/tasks/db_node.yml`:
|
||
|
||
- `iklim-db-01` generates `/mnt/storagebox/db/mongodb-01/config/rs-auth.key` if it is missing.
|
||
- the same key content is copied to `/mnt/storagebox/db/mongodb-02/config/rs-auth.key` and `/mnt/storagebox/db/mongodb-03/config/rs-auth.key`;
|
||
- permissions are set to `0400`.
|
||
|
||
Manual recovery equivalent, kept only for troubleshooting:
|
||
|
||
```bash
|
||
openssl rand -base64 756 > /mnt/storagebox/db/mongodb-01/config/rs-auth.key
|
||
chmod 400 /mnt/storagebox/db/mongodb-01/config/rs-auth.key
|
||
|
||
cat /mnt/storagebox/db/mongodb-01/config/rs-auth.key \
|
||
> /mnt/storagebox/db/mongodb-02/config/rs-auth.key
|
||
cat /mnt/storagebox/db/mongodb-01/config/rs-auth.key \
|
||
> /mnt/storagebox/db/mongodb-03/config/rs-auth.key
|
||
|
||
chmod 400 /mnt/storagebox/db/mongodb-0{2,3}/config/rs-auth.key
|
||
```
|
||
|
||
### Stack File — MongoDB
|
||
|
||
MongoDB services are defined in `docker-stack-infra_db-prod.yml` (repo root). Each service uses a local DB-node bind mount for data and a StorageBox bind mount for config:
|
||
|
||
```yaml
|
||
mongodb-01:
|
||
image: ${IMAGE_MONGODB}
|
||
environment:
|
||
MONGO_INITDB_ROOT_USERNAME: "${DATABASE_MONGODB_ROOT_USER}"
|
||
MONGO_INITDB_ROOT_PASSWORD: "${DATABASE_MONGODB_ROOT_PASSWD}"
|
||
volumes:
|
||
- /opt/iklimco/db/mongodb:/data/db
|
||
- /mnt/storagebox/db/mongodb-01/config:/data/configdb
|
||
networks:
|
||
iklimco-net:
|
||
aliases:
|
||
- mongodb-01
|
||
ports:
|
||
- target: 27017
|
||
published: 27017
|
||
protocol: tcp
|
||
mode: host
|
||
deploy:
|
||
replicas: 1
|
||
placement:
|
||
max_replicas_per_node: 1
|
||
constraints:
|
||
- node.hostname == iklim-db-01
|
||
```
|
||
|
||
The same pattern is repeated for `mongodb-02` and `mongodb-03`, with node-specific StorageBox config paths and placement constraints.
|
||
|
||
### Replica Set Initialization
|
||
|
||
Replica set initialization is handled by the root prod workflow step `Initialize MongoDB Replica Set`. The workflow:
|
||
|
||
1. Connects to the first host from `DATABASE_MONGODB_HOST`.
|
||
2. Runs `rs.initiate()` if the replica set is uninitialized.
|
||
3. Checks current members if the replica set already exists.
|
||
4. Runs `rs.add()` through the primary if hosts from `DATABASE_MONGODB_HOST` are missing.
|
||
|
||
Manual equivalent, kept for troubleshooting only:
|
||
|
||
```bash
|
||
# On iklim-app-01 (overlay network erişimi için):
|
||
docker run --rm -it --network iklimco-net mongo:8.3.2 \
|
||
mongosh "mongodb://mongo-root:${DATABASE_MONGODB_ROOT_PASSWD}@mongodb-01/admin"
|
||
|
||
# Inside mongosh:
|
||
rs.initiate({
|
||
_id: "rs0",
|
||
members: [
|
||
{ _id: 0, host: "mongodb-01:27017", priority: 2 },
|
||
{ _id: 1, host: "mongodb-02:27017", priority: 1 },
|
||
{ _id: 2, host: "mongodb-03:27017", priority: 1 }
|
||
]
|
||
})
|
||
|
||
# Status check:
|
||
rs.status()
|
||
```
|
||
|
||
The replica set is ready when `"stateStr": "PRIMARY"` and two `"SECONDARY"` entries are visible.
|
||
|
||
## 5. PostgreSQL — Patroni + etcd
|
||
|
||
Patroni coordinates PostgreSQL primary/standby roles through etcd. If the primary goes down, one of the other nodes automatically wins the election and becomes primary. The Swarm service restarts the container; Patroni continues from where it left off.
|
||
|
||
### 5.1 Custom Image (Patroni + PostGIS)
|
||
|
||
Patroni is installed on top of the `postgis/postgis:18-3.6` image. This image is pushed to Harbor and used in `docker-stack-infra_db-prod.yml` via `${CUSTOM_IMAGE_REGISTRY}${IMAGE_PATRONI}`.
|
||
|
||
`build/patroni-postgis/Dockerfile`:
|
||
|
||
```dockerfile
|
||
FROM postgis/postgis:18-3.6
|
||
|
||
USER root
|
||
|
||
RUN apt-get update && apt-get install -y --no-install-recommends \
|
||
python3-pip \
|
||
python3-dev \
|
||
gcc \
|
||
libpq-dev \
|
||
&& pip3 install --no-cache-dir 'patroni[etcd3]' \
|
||
&& apt-get purge -y gcc python3-dev \
|
||
&& apt-get autoremove -y \
|
||
&& rm -rf /var/lib/apt/lists/*
|
||
|
||
USER postgres
|
||
|
||
ENTRYPOINT ["patroni", "/etc/patroni/patroni.yml"]
|
||
```
|
||
|
||
Build and push is done with `ops/push-harbor-custom-images.sh`:
|
||
|
||
```bash
|
||
cd /path/to/repo
|
||
bash ops/push-harbor-custom-images.sh
|
||
```
|
||
|
||
Or manually:
|
||
|
||
```bash
|
||
cd build/patroni-postgis
|
||
docker build -t registry.tarla.io/iklimco/custom-patroni-postgis:18-3.6 .
|
||
echo "$HARBOR_CI_TOKEN" | docker login registry.tarla.io -u robot-ci-push-iklimco --password-stdin
|
||
docker push registry.tarla.io/iklimco/custom-patroni-postgis:18-3.6
|
||
```
|
||
|
||
### 5.2 etcd Cluster
|
||
|
||
etcd services are defined in `docker-stack-infra_db-prod.yml`. Each service uses a named Docker volume for data and has an overlay DNS alias. Environment variables reference peer URLs by alias, not by hardcoded IP:
|
||
|
||
```yaml
|
||
etcd-01:
|
||
image: ${IMAGE_ETCD}
|
||
environment:
|
||
ALLOW_NONE_AUTHENTICATION: "no"
|
||
ETCD_NAME: etcd-01
|
||
ETCD_INITIAL_ADVERTISE_PEER_URLS: http://etcd-01:2380
|
||
ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
|
||
ETCD_ADVERTISE_CLIENT_URLS: http://etcd-01:2379
|
||
ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
|
||
ETCD_INITIAL_CLUSTER: "etcd-01=http://etcd-01:2380,etcd-02=http://etcd-02:2380,etcd-03=http://etcd-03:2380"
|
||
ETCD_INITIAL_CLUSTER_STATE: new
|
||
ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
|
||
ETCD_ROOT_PASSWORD: "${ETCD_ROOT_PASSWORD}"
|
||
volumes:
|
||
- etcd-01-data:/bitnami/etcd/data
|
||
networks:
|
||
iklimco-net:
|
||
aliases:
|
||
- etcd-01
|
||
deploy:
|
||
replicas: 1
|
||
placement:
|
||
max_replicas_per_node: 1
|
||
constraints:
|
||
- node.hostname == iklim-db-01
|
||
```
|
||
|
||
**APISIX etcd usage:** In prod, APISIX shares this etcd cluster with the `/apisix` prefix. Patroni uses the `/service/` prefix and APISIX uses the `/apisix/` prefix — no collision. The overlay DNS names (`etcd-01:2379`, `etcd-02:2379`, `etcd-03:2379`) are reachable from app nodes via the `iklimco-net` overlay. Therefore, the app subnet → DB nodes port 2379 firewall rule is mandatory; it was added in Section 1.
|
||
|
||
**Important:** `ETCD_INITIAL_CLUSTER_STATE` is currently defined in `docker-stack-infra_db-prod.yml`. When changing etcd cluster membership, do not blindly expand `ETCD_INITIAL_CLUSTER` on a running cluster; add members through etcd membership operations first.
|
||
|
||
### 5.3 Patroni Configuration
|
||
|
||
`patroni.yml` is generated per-node by the Ansible `db_stack` role from `templates/patroni.yml.j2` using `inventory_hostname` (e.g., `iklim-db-01`). The generated file uses overlay DNS aliases for all addresses.
|
||
|
||
**Generated output — Node 01** (`/mnt/storagebox/db/postgresql-01/config/patroni.yml`):
|
||
|
||
```yaml
|
||
scope: iklim-postgres
|
||
namespace: /db/
|
||
name: postgresql-01
|
||
|
||
restapi:
|
||
listen: 0.0.0.0:8008
|
||
connect_address: patroni-01:8008
|
||
|
||
etcd3:
|
||
hosts:
|
||
- etcd-01:2379
|
||
- etcd-02:2379
|
||
- etcd-03:2379
|
||
|
||
bootstrap:
|
||
dcs:
|
||
ttl: 30
|
||
loop_wait: 10
|
||
retry_timeout: 10
|
||
maximum_lag_on_failover: 1048576
|
||
postgresql:
|
||
use_pg_rewind: true
|
||
parameters:
|
||
wal_level: replica
|
||
hot_standby: "on"
|
||
wal_keep_size: 512
|
||
max_wal_senders: 5
|
||
max_replication_slots: 5
|
||
shared_preload_libraries: 'pg_stat_statements'
|
||
pg_stat_statements.track: 'all'
|
||
|
||
initdb:
|
||
- encoding: UTF8
|
||
- data-checksums
|
||
|
||
pg_hba:
|
||
- host replication replicator 10.20.20.0/24 scram-sha-256
|
||
- host all all 10.20.10.0/24 scram-sha-256
|
||
- host all all 10.20.20.0/24 scram-sha-256
|
||
|
||
users:
|
||
postgres:
|
||
password: "${DATABASE_POSTGRES_ROOT_PASSWD}"
|
||
options:
|
||
- superuser
|
||
|
||
postgresql:
|
||
listen: 0.0.0.0:5432
|
||
connect_address: patroni-01:5432
|
||
data_dir: /var/lib/postgresql/data/pgdata
|
||
pgpass: /tmp/pgpass0
|
||
authentication:
|
||
replication:
|
||
username: replicator
|
||
password: "${DATABASE_POSTGRES_REPLICATOR_PASSWORD}"
|
||
superuser:
|
||
username: postgres
|
||
password: "${DATABASE_POSTGRES_ROOT_PASSWD}"
|
||
parameters:
|
||
unix_socket_directories: "/var/run/postgresql"
|
||
|
||
tags:
|
||
nofailover: false
|
||
noloadbalance: false
|
||
clonefrom: false
|
||
nosync: false
|
||
```
|
||
|
||
For Node 02 and 03, only `name`, `restapi.connect_address`, and `postgresql.connect_address` differ (`postgresql-02`/`patroni-02:8008`/`patroni-02:5432`, etc.).
|
||
|
||
### 5.4 Stack File — Patroni
|
||
|
||
Patroni services are defined in `docker-stack-infra_db-prod.yml`. Each service uses the custom image, a local DB-node bind mount for data, a StorageBox bind mount for the config file, and overlay DNS aliases:
|
||
|
||
```yaml
|
||
patroni-01:
|
||
image: ${CUSTOM_IMAGE_REGISTRY}${IMAGE_PATRONI}
|
||
environment:
|
||
POSTGRES_USER: "${DATABASE_POSTGRES_ROOT_USER}"
|
||
POSTGRES_PASSWORD: "${DATABASE_POSTGRES_ROOT_PASSWD}"
|
||
REPLICATOR_PASSWORD: "${DATABASE_POSTGRES_REPLICATOR_PASSWORD}"
|
||
ETCD_ROOT_PASSWORD: "${ETCD_ROOT_PASSWORD}"
|
||
TZ: "Europe/Istanbul"
|
||
volumes:
|
||
- /opt/iklimco/db/postgresql:/var/lib/postgresql/data
|
||
- /mnt/storagebox/db/postgresql-01/config/patroni.yml:/etc/patroni/patroni.yml:ro
|
||
networks:
|
||
iklimco-net:
|
||
aliases:
|
||
- patroni-01
|
||
ports:
|
||
- target: 5432
|
||
published: 5432
|
||
protocol: tcp
|
||
mode: host
|
||
- target: 8008
|
||
published: 8008
|
||
protocol: tcp
|
||
mode: host
|
||
deploy:
|
||
replicas: 1
|
||
placement:
|
||
max_replicas_per_node: 1
|
||
constraints:
|
||
- node.hostname == iklim-db-01
|
||
```
|
||
|
||
The same pattern is repeated for `patroni-02` and `patroni-03`, with node-specific StorageBox config paths and placement constraints.
|
||
|
||
### 5.5 Status Check
|
||
|
||
```bash
|
||
# On iklim-app-01 — Patroni cluster status:
|
||
docker exec -it $(docker ps -q -f name=iklimco_patroni-01 | head -1) \
|
||
patronictl -c /etc/patroni/patroni.yml list
|
||
```
|
||
|
||
Expected output: one `Leader` row and two `Replica` rows, all with the `State` column set to `running`.
|
||
|
||
```bash
|
||
# etcd cluster health (from app node via overlay):
|
||
docker run --rm --network iklimco-net alpine \
|
||
sh -c "wget -qO- http://etcd-01:2379/health && \
|
||
wget -qO- http://etcd-02:2379/health && \
|
||
wget -qO- http://etcd-03:2379/health"
|
||
```
|
||
|
||
```bash
|
||
# Find the current primary:
|
||
docker exec -it $(docker ps -q -f name=iklimco_patroni-01 | head -1) \
|
||
patronictl -c /etc/patroni/patroni.yml topology
|
||
```
|
||
|
||
## 6. Deploy
|
||
|
||
All DB services (etcd, MongoDB, Patroni) are in the current root prod stack `docker-stack-infra_db-prod.yml`. Normal deployment is done by `.gitea/workflows/deploy-prod.yml`, not by running a separate DB stack manually.
|
||
|
||
### .env File
|
||
|
||
DB stack password variables (`DATABASE_POSTGRES_ROOT_PASSWD`, `DATABASE_POSTGRES_REPLICATOR_PASSWORD`, `DATABASE_MONGODB_ROOT_PASSWD`, `ETCD_ROOT_PASSWORD`) are stored in `prod/secrets/iklim.co/.env.secrets.shared` on StorageBox. The workflow fetches this file automatically.
|
||
|
||
```bash
|
||
scp -P 23 STORAGEBOX_USER@STORAGEBOX_USER.your-storagebox.de:prod/secrets/iklim.co/.env.secrets.shared \
|
||
/tmp/.env.secrets.shared
|
||
chmod 600 /tmp/.env.secrets.shared
|
||
```
|
||
|
||
### Deploy Steps
|
||
|
||
The root prod workflow deploys the stack with:
|
||
|
||
```bash
|
||
docker stack deploy \
|
||
--with-registry-auth \
|
||
--resolve-image changed \
|
||
-c docker-stack-infra_db-prod.yml \
|
||
iklimco
|
||
```
|
||
|
||
After the stack deploy, the workflow waits for etcd, initializes APISIX, initializes the MongoDB replica set, and runs PostgreSQL/MongoDB init scripts.
|
||
|
||
### DB Node Placement Check
|
||
|
||
```bash
|
||
docker service ps iklimco_etcd-01
|
||
docker service ps iklimco_mongodb-01
|
||
docker service ps iklimco_patroni-01
|
||
```
|
||
|
||
All tasks must run on the expected `iklim-db-*` nodes.
|
||
|
||
### MongoDB Replica Set Initialization
|
||
|
||
Handled by the workflow. Manual form for troubleshooting:
|
||
|
||
```bash
|
||
# From iklim-app-01 via overlay network:
|
||
docker run --rm -it --network iklimco-net mongo:8.3.2 \
|
||
mongosh "mongodb://mongo-root:${DATABASE_MONGODB_ROOT_PASSWD}@mongodb-01/admin"
|
||
|
||
# Inside mongosh:
|
||
rs.initiate({
|
||
_id: "rs0",
|
||
members: [
|
||
{ _id: 0, host: "mongodb-01:27017", priority: 2 },
|
||
{ _id: 1, host: "mongodb-02:27017", priority: 1 },
|
||
{ _id: 2, host: "mongodb-03:27017", priority: 1 }
|
||
]
|
||
})
|
||
```
|
||
|
||
## 7. Access from App Services
|
||
|
||
App containers connect to DB services through the `iklimco-net` overlay network by **overlay DNS name**. Because the `iklimco` stack shares the `iklimco-net` external network, service names and aliases are resolved through overlay DNS.
|
||
|
||
### MongoDB Replica Set Connection String
|
||
|
||
Variables in StorageBox `prod/secrets/iklim.co/.env`:
|
||
|
||
```bash
|
||
DATABASE_MONGODB_HOST=mongodb-01:27017,mongodb-02:27017,mongodb-03:27017
|
||
DATABASE_MONGODB_PARAMS=replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin
|
||
```
|
||
|
||
Microservice URI through overlay DNS:
|
||
```
|
||
mongodb://<user>:<password>@mongodb-01:27017,mongodb-02:27017,mongodb-03:27017/<db>?replicaSet=rs0&readPreference=secondaryPreferred&authSource=admin
|
||
```
|
||
|
||
> For direct testing, from outside the overlay with private IP:
|
||
> `mongodb://mongo-root:<PASSWORD>@10.20.20.11:27017,10.20.20.12:27017,10.20.20.13:27017/admin?replicaSet=rs0&authSource=admin`
|
||
|
||
### PostgreSQL — Patroni
|
||
|
||
Variables in StorageBox `prod/secrets/iklim.co/.env`:
|
||
|
||
```bash
|
||
DATABASE_POSTGRES_HOST=patroni-01:5432,patroni-02:5432,patroni-03:5432
|
||
DATABASE_POSTGRES_PARAMS=targetServerType=preferSecondary&loadBalanceHosts=true
|
||
```
|
||
|
||
Patroni manages whichever node is primary at any moment. The JDBC/libpq driver automatically selects primary/secondary through the `targetServerType` parameter in the multi-host list:
|
||
|
||
```
|
||
# Write — goes to primary (libpq URI):
|
||
postgresql://<user>@patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetServerType=primary
|
||
|
||
# Read (load balancing):
|
||
postgresql://<user>@patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetServerType=preferSecondary&loadBalanceHosts=true
|
||
```
|
||
|
||
> For direct testing, from outside the overlay with private IP:
|
||
> `postgresql://postgres@10.20.20.11:5432,10.20.20.12:5432,10.20.20.13:5432/postgres?targetServerType=primary`
|
||
|
||
### Patroni REST API
|
||
|
||
Patroni exposes an HTTP endpoint on port 8008. This endpoint can be used with HAProxy or a similar load balancer to route to the primary automatically:
|
||
|
||
```bash
|
||
# Primary check (HTTP 200 = primary, HTTP 503 = replica):
|
||
curl -s http://patroni-01:8008/primary
|
||
```
|
||
|
||
## 8. Geliştirici ve Ofis Erişimi (Production)
|
||
|
||
Prod cluster yapısında `pg-proxy` veya `mongo-proxy` **kullanılmaz**. Ofis bilgisayarından erişim için doğrudan DB subnet'i hedef alınır.
|
||
|
||
### WireGuard Ayarı
|
||
Ofis bilgisayarındaki `.conf` dosyasında `AllowedIPs` güncellenmelidir: `AllowedIPs = 10.8.0.1/32, 10.20.20.0/24`
|
||
|
||
### Bağlantı Parametreleri (Multi-Host)
|
||
Modern veritabanı araçları (DBeaver, Compass vb.) küme farkındalıklı bağlantı kurmalıdır:
|
||
|
||
| Veritabanı | Host Listesi | Port | Kritik Parametre |
|
||
| :--- | :--- | :--- | :--- |
|
||
| **PostgreSQL** | `10.20.20.11, 10.20.20.12, 10.20.20.13` | `5432` | `targetServerType=primary` |
|
||
| **MongoDB** | `10.20.20.11, 10.20.20.12, 10.20.20.13` | `27017` | `replicaSet=rs0` |
|
||
|
||
## Acceptance Criteria
|
||
|
||
- `docker stack services iklimco` — etcd-01/02/03, mongodb-01/02/03, patroni-01/02/03 are visible and all target replicas are healthy
|
||
- `docker service ps iklimco_patroni-01/02/03` — each task runs on its expected `iklim-db-*` node
|
||
- `docker service ps iklimco_mongodb-01/02/03` — each task runs on its expected `iklim-db-*` node
|
||
- `docker service ps iklimco_etcd-01/02/03` — each task runs on its expected `iklim-db-*` node
|
||
- `patronictl list` — 1 `Leader`, 2 `Replica`, all `running`
|
||
- etcd health endpoint returns `"health":"true"` on all three nodes via overlay
|
||
- `rs.status()` — 1 PRIMARY, 2 SECONDARY
|
||
- MongoDB and PostgreSQL are reachable from app nodes.
|
||
- Ports `5432`, `27017`, `2379`, `2380`, and `8008` are closed from the public internet.
|
||
- When a DB node is restarted, Patroni performs automatic election and a new primary is selected.
|
||
- During Patroni primary transition, the old primary rejoins as standby; there is no split-brain.
|