docs(prod): resolve cross-layer inconsistencies and complete prod env implementation

Ansible roles:
- act_runner/defaults: set act_runner_name to inventory_hostname (was
  hardcoded to iklim-test-app); added vault_gitea_runner_token to vault.yml
- prod/group_vars/all: restructured from flat files to all/ directory;
  added act_runner_labels override (prod-runner,ubuntu-24.04,hostname);
  added storagebox_managed_directories; added swarm_manager_ip and other
  prod-specific vars
- prod/roles/db_stack: prod-specific db_node tasks using StorageBox paths
  (/mnt/storagebox/db/...) instead of local paths
- docker/tasks: split firewalld loop into all-nodes (Swarm ports) and
  app-only (80/443) tasks
- swarm/tasks: added --advertise-addr private_ip to join commands for
  correct multi-homed node advertisement
- hardening/tasks: corrected firewalld drop zone configuration
- node_dirs/tasks: added /opt/iklimco/vault/data for Vault Raft volume
- db_stack/tasks/app_node: updated stale comment (removed pg-proxy reference)
- db_stack/templates: removed pg-proxy and mongo-proxy service blocks
- test/host_vars/iklim-app-01: added act_runner_name override to preserve
  existing test runner registration

Roadmap and setup docs:
- roadmap/03-infra-stack-changes: added replicas:0 for etcd/postgresql/
  mongodb/pg-proxy/mongo-proxy in prod overlay; updated placement table;
  fixed grafana/data mkdir (auto-created by Ansible); translated Turkish
  note to English
- roadmap/08-deploy-pipeline-update: updated stale "remains idle" note
  for standalone etcd (now disabled with replicas:0)
- roadmap/01-swarm-init-multinode: consistency fixes
- setup/06: added Outputs section and etcd firewall port documentation
- setup/07: removed prometheus/data from StorageBox acceptance criteria;
  replaced manual StorageBox mkdir section with Ansible auto-creation note;
  updated prod README section with full bootstrap instructions and vault docs;
  added act_runner_labels prod policy
- setup/08: extensive rewrite — aligned with Patroni etcd overlay DNS,
  corrected hcloud_firewall.app reference, updated all StorageBox paths
  from /prod/db/ to /db/
- setup/09: removed prometheus/data from acceptance criteria; updated
  runner label policy (removed docker/swarm-manager labels); added
  acceptance criterion for disabled services absent from docker service ls

Terraform:
- prod/firewall.tf: added missing DB subnet mutual rules (etcd, Patroni)
- prod/outputs.tf: added prod_floating_ip and prod_private_ips outputs
- prod/servers.tf: aligned placement group and naming
- prod/variables.tf: corrected variable descriptions
- prod/terraform.tfvars.example: updated defaults
- terraform/hetzner/README.md: new comprehensive README covering both
  test and prod environments with firewall tables and inventory instructions

ansible/README.md: expanded prod section with inventory groups, bootstrap
  run order, runner label policy, and vault variable documentation
Murat ÖZDEMİR 2026-05-18 19:17:56 +03:00
parent 8780c7c05e
commit 27f4f83f73
31 changed files with 722 additions and 560 deletions

View File

@ -285,33 +285,105 @@ cd Environment_Infrastructure/terraform/hetzner/prod
terraform output -raw ansible_inventory_yaml > ../../../ansible/prod/inventory/generated/prod.yml
```
### Prod Playbook Plan
### Prod Inventory Groups
Currently, the main playbook on the prod side is `prod-bootstrap.yml`.
Current role/tag scope:
| Tag | Role | Purpose |
| Group | Hosts | Role |
| --- | --- | --- |
| `base` | `base` | Basic system preparation |
| `hardening` | `hardening` | SSH and security hardening |
| `docker` | `docker` | Docker installation |
| `node_dirs` | `node_dirs` | Node directory preparation |
| `storagebox` | `storagebox` | StorageBox preparation |
| `storagebox_ssh_key` | `storagebox_ssh_key` | StorageBox SSH key preparation |
| `swarm` | `swarm` | Prod Swarm setup |
| `db_labels` | inline task | DB node labels |
| `db_stack` | `db_stack` | DB node configuration |
| `act_runner` | `act_runner` | App node runner installation |
| `app` | `iklim-app-01`, `iklim-app-02`, `iklim-app-03` | Swarm manager + application worker |
| `db` | `iklim-db-01`, `iklim-db-02`, `iklim-db-03` | Swarm worker + DB cluster node |
Once the prod environment is complete, this section will be expanded with the following headings:
`--limit` examples:
- Prod inventory groups
- Prod bootstrap run order
- Prod app node preparation
- Prod DB node preparation
- Prod runner installation
- Prod rollback and re-run notes
```bash
--limit iklim-app-01
--limit app
--limit db
```
### Prod Playbook and Tags
Main playbook: `prod-bootstrap.yml`
| Tag | Role | Host scope | Purpose |
| --- | --- | --- | --- |
| `base` | `base` | `all` | Base packages, timezone, hostname, NTP |
| `hardening` | `hardening` | `all` | SSH, fail2ban, firewalld drop zone, dnf-automatic, `iklim` user |
| `docker` | `docker` | `all` | Docker Engine installation and Swarm ports |
| `node_dirs` | `node_dirs` | `all` | Node directories (`/opt/iklimco/...`) |
| `storagebox` | `storagebox` | `all` | WebDAV mount and creation of the managed directories |
| `storagebox_ssh_key` | `storagebox_ssh_key` | `all` | StorageBox SSH key generation |
| `swarm` | `swarm` | `app`, `db` | Swarm init/join, overlay network, node labels |
| `db_labels` | inline task | `iklim-app-01` | Adds the `role=db` and `db-index=01/02/03` labels to the DB nodes |
| `db_stack` | `db_stack` | `db` | MongoDB and PostgreSQL config directories on StorageBox |
| `act_runner` | `act_runner` | `app` | Gitea Actions runner installation and registration |
### Prod Bootstrap Run Order
```bash
# 1. Base server preparation (all nodes)
ansible-playbook prod-bootstrap.yml \
  --tags base,hardening,docker,node_dirs,storagebox,storagebox_ssh_key \
  --ask-vault-pass

# 2. Swarm setup (app nodes first, then db nodes)
ansible-playbook prod-bootstrap.yml \
  --tags swarm \
  --ask-vault-pass

# 3. DB node labels (for Patroni coordination)
ansible-playbook prod-bootstrap.yml \
  --tags db_labels \
  --ask-vault-pass

# 4. DB node configuration (StorageBox directories and config files)
ansible-playbook prod-bootstrap.yml \
  --tags db_stack \
  --ask-vault-pass

# 5. App node runner installation
ansible-playbook prod-bootstrap.yml \
  --tags act_runner \
  --ask-vault-pass
```
### Prod Act Runner
Runners are installed as a systemd service on all app nodes (`iklim-app-01/02/03`).
Labels are defined in `prod/group_vars/all/vars.yml`:
```
prod-runner
ubuntu-24.04
iklim-app-01 (or iklim-app-02, iklim-app-03, depending on the node)
```
The registration token is kept in `prod/group_vars/all/vault.yml` as `vault_gitea_runner_token`. If the token is not defined, the registration step is skipped; if a `.runner` file already exists, registration is not repeated.
To obtain a token from Gitea: **Organization → Settings → Actions → Runners → Add Runner**
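The skip conditions described above can be sketched as a small shell function. This is only an illustration of the decision logic, not the role's actual tasks; the function name `registration_action` and the way the token and `.runner` path are passed in are assumptions made for the example.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; the act_runner role's real tasks may differ.
# registration_action TOKEN RUNNER_FILE
# Prints "register" when a token is set and no .runner file exists, else "skip".
registration_action() {
  local token="$1" runner_file="$2"
  if [ -z "$token" ]; then
    echo "skip"        # token not defined: the registration step is skipped
  elif [ -e "$runner_file" ]; then
    echo "skip"        # .runner present: the node is already registered
  else
    echo "register"
  fi
}
```

Because an existing `.runner` file short-circuits the decision, re-running the `act_runner` tag on an already registered node should leave its registration untouched.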
### Prod Vault File
`prod/group_vars/all/vault.yml` is kept encrypted and contains the following variables:
| Variable | Description |
| --- | --- |
| `vault_storagebox_password` | StorageBox WebDAV password |
| `vault_iklim_password` | Password of the `iklim` system user |
| `vault_gitea_runner_token` | Gitea runner registration token |
Encrypt:
```bash
cd Environment_Infrastructure/ansible/prod
ansible-vault encrypt group_vars/all/vault.yml
```
Decrypt (for editing):
```bash
ansible-vault edit group_vars/all/vault.yml
```
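Before committing, it can help to confirm that the vault file really is encrypted. The sketch below relies on the fact that files written by `ansible-vault encrypt` begin with a `$ANSIBLE_VAULT;` header line; the function name `vault_state` is an assumption made for this example and is not part of the repository.

```bash
#!/usr/bin/env bash
# Illustrative sketch: classify a file by its first line.
# vault_state FILE -> prints "encrypted", "plaintext", or "missing".
vault_state() {
  local first_line=""
  [ -r "$1" ] || { echo "missing"; return 0; }
  # Read only the first line; tolerate files without a trailing newline.
  IFS= read -r first_line < "$1" || true
  case "$first_line" in
    '$ANSIBLE_VAULT;'*) echo "encrypted" ;;
    *)                  echo "plaintext" ;;
  esac
}
```

A pre-commit hook could call `vault_state group_vars/all/vault.yml` and refuse the commit unless the result is `encrypted`.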
## Vault Usage

View File

@ -4,7 +4,7 @@ remote_user = root
host_key_checking = False
retry_files_enabled = False
interpreter_python = auto_silent
roles_path = ../roles
roles_path = roles:../roles
[privilege_escalation]
become = True

View File

@ -1,4 +0,0 @@
# Global variables for prod
storagebox_account: "u469968"
admin_allowed_cidrs: "127.0.0.1/8"
timezone: "Europe/Istanbul"

View File

@ -0,0 +1,24 @@
storagebox_account: "u469968"
storagebox_user: "{{ storagebox_account }}-sub5"
storagebox_url: "https://{{ storagebox_user }}.your-storagebox.de/"
storagebox_mount_point: "/mnt/storagebox"
storagebox_password: "{{ vault_storagebox_password }}"
storagebox_managed_directories:
  - path: "{{ storagebox_mount_point }}/ssl"
    mode: "0755"
  - path: "{{ storagebox_mount_point }}/swag/config"
    mode: "0755"
  - path: "{{ storagebox_mount_point }}/swag/site-confs"
    mode: "0755"
  - path: "{{ storagebox_mount_point }}/grafana/data"
    mode: "0755"
  - path: "{{ storagebox_mount_point }}/precipitation/images"
    mode: "0755"
iklim_password: "{{ vault_iklim_password }}"
act_runner_labels: "prod-runner,ubuntu-24.04,{{ inventory_hostname }}"
swarm_manager_ip: "10.20.10.11"
mongodb_replset_name: "rs0"
admin_allowed_cidrs: "127.0.0.1/8"
admin_ssh_public_key_path: "~/.ssh/id_ed25519.pub"
timezone: "Europe/Istanbul"

View File

@ -0,0 +1,7 @@
# This file must be encrypted with ansible-vault:
# ansible-vault encrypt group_vars/all/vault.yml
#
# Encrypt it after entering the real values.
vault_storagebox_password: "CHANGE_ME"
vault_iklim_password: "CHANGE_ME"
vault_gitea_runner_token: "CHANGE_ME"

View File

@ -1,7 +0,0 @@
# Prod environment specific variables
storagebox_user: "{{ storagebox_account }}-sub5" # Prod sub-account
storagebox_url: "https://{{ storagebox_user }}.your-storagebox.de/"
storagebox_mount_point: "/mnt/storagebox"
swarm_manager_ip: "10.20.10.11"
mongodb_replset_name: "rs0"
# storagebox_password: "{{ vault_storagebox_password }}"

View File

@ -0,0 +1,24 @@
---
- name: Create StorageBox MongoDB config directory
  ansible.builtin.file:
    path: "{{ storagebox_mount_point }}/db/mongodb-{{ inventory_hostname.split('-')[-1] }}/config"
    state: directory
    mode: '0755'

- name: Create StorageBox PostgreSQL config directory
  ansible.builtin.file:
    path: "{{ storagebox_mount_point }}/db/postgresql-{{ inventory_hostname.split('-')[-1] }}/config"
    state: directory
    mode: '0755'

- name: Deploy mongod.conf to StorageBox
  ansible.builtin.template:
    src: mongod.conf.j2
    dest: "{{ storagebox_mount_point }}/db/mongodb-{{ inventory_hostname.split('-')[-1] }}/config/mongod.conf"
    mode: '0644'

- name: Deploy patroni.yml to StorageBox
  ansible.builtin.template:
    src: patroni.yml.j2
    dest: "{{ storagebox_mount_point }}/db/postgresql-{{ inventory_hostname.split('-')[-1] }}/config/patroni.yml"
    mode: '0644'

View File

@ -0,0 +1,2 @@
---
- include_tasks: db_node.yml

View File

@ -0,0 +1,18 @@
net:
  port: 27017
storage:
  engine: "wiredTiger"
  dbPath: "/data/db"
  directoryPerDB: true
systemLog:
  verbosity: 0
  timeStampFormat: "iso8601-local"
  destination: file
  path: "/data/log/mongo.log"
  logAppend: true
  logRotate: rename
replication:
  replSetName: "{{ mongodb_replset_name }}"
security:
  authorization: enabled
  keyFile: "/data/configdb/rs-auth.key"

View File

@ -0,0 +1,66 @@
scope: iklim-postgres
namespace: /db/
name: postgresql-{{ inventory_hostname.split('-')[-1] }}

restapi:
  listen: 0.0.0.0:8008
  connect_address: patroni-{{ inventory_hostname.split('-')[-1] }}:8008

etcd3:
  hosts:
    - etcd-01:2379
    - etcd-02:2379
    - etcd-03:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        wal_keep_size: 512
        max_wal_senders: 5
        max_replication_slots: 5
        shared_preload_libraries: 'pg_stat_statements'
        pg_stat_statements.track: 'all'

  initdb:
    - encoding: UTF8
    - data-checksums

  pg_hba:
    - host replication replicator 10.20.20.0/24 scram-sha-256
    - host all all 10.20.10.0/24 scram-sha-256
    - host all all 10.20.20.0/24 scram-sha-256

  users:
    postgres:
      password: "${POSTGRES_PASSWORD}"
      options:
        - superuser

postgresql:
  listen: 0.0.0.0:5432
  connect_address: patroni-{{ inventory_hostname.split('-')[-1] }}:5432
  data_dir: /var/lib/postgresql/data/pgdata
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: "${REPLICATOR_PASSWORD}"
    superuser:
      username: postgres
      password: "${POSTGRES_PASSWORD}"
  parameters:
    unix_socket_directories: "/var/run/postgresql"

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false

View File

@ -2,8 +2,7 @@
act_runner_version: "0.2.12"
act_runner_arch: "linux-amd64"
act_runner_gitea_url: "https://git.tarla.io"
# -> assign this to a variable and create different names for test and prod!
act_runner_name: "iklim-test-app"
act_runner_name: "{{ inventory_hostname }}"
act_runner_labels: "ubuntu-latest,ubuntu-22.04,ubuntu-20.04,test-runner:docker://catthehacker/ubuntu:act-22.04"
# One-time registration token obtained from Gitea; left empty if the runner is not registered yet.
act_runner_registration_token: "{{ vault_gitea_runner_token | default('') }}"

View File

@ -1,3 +1,3 @@
---
# The DB stack is now part of the main iklimco stack; the deploy step was removed from this role.
# See: docker-stack-infra.yml (postgresql, mongodb, pg-proxy, mongo-proxy services)
# DB services are deployed as a separate stack with docker-stack-db.prod.yml.
# See: setup/08-prod-db-cluster-kurulum.md

View File

@ -43,38 +43,3 @@ services:
  # Bridge services for DB manager access over WireGuard.
  # Host ports are opened by firewalld only to the WireGuard subnet (10.8.0.0/24).
  pg-proxy:
    image: alpine/socat:latest
    command: TCP-LISTEN:5432,fork,reuseaddr TCP:postgresql:5432
    ports:
      - target: 5432
        published: 15432
        protocol: tcp
        mode: host
    networks:
      - iklimco-net
    deploy:
      placement:
        constraints:
          - node.labels.role == db
      restart_policy:
        condition: on-failure
        delay: 5s
  mongo-proxy:
    image: alpine/socat:latest
    command: TCP-LISTEN:27017,fork,reuseaddr TCP:mongodb:27017
    ports:
      - target: 27017
        published: 17017
        protocol: tcp
        mode: host
    networks:
      - iklimco-net
    deploy:
      placement:
        constraints:
          - node.labels.role == db
      restart_policy:
        condition: on-failure
        delay: 5s

View File

@ -34,7 +34,19 @@
state: started
enabled: yes
- name: Allow Docker traffic in firewalld
- name: Allow Docker Swarm ports in firewalld (all nodes)
  ansible.posix.firewalld:
    port: "{{ item }}"
    permanent: yes
    immediate: yes
    state: enabled
  loop:
    - 2377/tcp
    - 7946/tcp
    - 7946/udp
    - 4789/udp

- name: Allow web ports in firewalld (app nodes only)
  ansible.posix.firewalld:
    port: "{{ item }}"
    permanent: yes
@ -43,7 +55,4 @@
  loop:
    - 80/tcp
    - 443/tcp
    - 2377/tcp
    - 7946/tcp
    - 7946/udp
    - 4789/udp
  when: inventory_hostname in groups['app']

View File

@ -16,7 +16,7 @@
state: present
loop:
- { regexp: "^PasswordAuthentication", line: "PasswordAuthentication no" }
- { regexp: "^PermitRootLogin", line: "PermitRootLogin prohibit-password" }
- { regexp: "^PermitRootLogin", line: "PermitRootLogin no" }
- { regexp: "^PermitEmptyPasswords", line: "PermitEmptyPasswords no" }
- { regexp: "^MaxAuthTries", line: "MaxAuthTries 3" }
notify: Restart sshd

View File

@ -16,6 +16,7 @@
- /opt/iklimco/init/postgresql
- /opt/iklimco/init/mongodb
- /opt/iklimco/stacks
- /opt/iklimco/vault/data
when: inventory_hostname in groups['app']
- name: Create db specific directories

View File

@ -34,6 +34,7 @@
ansible.builtin.shell: >
docker swarm join
--token {{ hostvars[groups['app'][0]]['manager_token']['stdout'] }}
--advertise-addr {{ private_ip }}
{{ swarm_manager_ip }}:2377
when:
- inventory_hostname in groups['app']
@ -45,6 +46,7 @@
ansible.builtin.shell: >
docker swarm join
--token {{ hostvars[groups['app'][0]]['worker_token']['stdout'] }}
--advertise-addr {{ private_ip }}
{{ swarm_manager_ip }}:2377
when:
- inventory_hostname in groups['db']

View File

@ -1,2 +1,3 @@
hetzner_floating_ip: "49.12.116.113"
hetzner_primary_interface: "eth0"
act_runner_name: "iklim-test-app"

View File

@ -136,6 +136,6 @@ not via the Gitea pipeline.
|------------|-------------|----------|
| `node.hostname == iklim-app-01` | iklim-app-01 only | SWAG, cert-reloader |
| `node.labels.type == service` | iklim-app-01, iklim-app-02, iklim-app-03 | Vault, Redis, RabbitMQ, APISIX, Prometheus, Grafana, etcd (idle in prod — APISIX uses Patroni etcd) |
| `node.labels.role == db` | iklim-db-01, iklim-db-02, iklim-db-03 | PostgreSQL, MongoDB, pg-proxy, mongo-proxy |
| `node.labels.role == db` | iklim-db-01, iklim-db-02, iklim-db-03 | PostgreSQL (Patroni), MongoDB, etcd (via `docker-stack-db.prod.yml`) |
SWAG and cert-reloader are pinned to `iklim-app-01` (the Floating IP node) because SWAG does not support clustering and must match the public entry point. Vault floats across all service nodes; its TLS cert is read from StorageBox (`/mnt/storagebox/ssl`) so it is available on whichever node Vault is scheduled on. Microservices carry no placement constraint and are distributed by the Swarm scheduler across all app nodes. DB services are pinned to DB nodes via separate DB stacks.

View File

@ -171,7 +171,7 @@ fi
## Step 4 — etcd: Separate APISIX etcd removed — Patroni etcd shared
The standalone `etcd` service in `docker-stack-infra.yml` is **not used in prod and must be removed**.
The standalone `etcd` service in `docker-stack-infra.yml` is **not used in prod and must be disabled** by setting `replicas: 0` in the prod overlay.
APISIX uses the 3-node Patroni etcd cluster running on DB nodes, via the `/apisix` prefix.
### Why consolidated?
@ -190,37 +190,45 @@ apisix:
# via apisix/conf/config.yaml or environment:
# etcd:
# host:
# - "http://iklim-db-01:2379"
# - "http://iklim-db-02:2379"
# - "http://iklim-db-03:2379"
# - "http://etcd-01:2379"
# - "http://etcd-02:2379"
# - "http://etcd-03:2379"
# prefix: "/apisix"
```
The preferred method is mounting `config.yaml` via a Docker config or volume:
The preferred method is mounting `config.yaml` via a Docker config or volume. etcd endpoints use **overlay DNS aliases** defined in `docker-stack-db.prod.yml` (`etcd-01`, `etcd-02`, `etcd-03`), which are reachable from app nodes via the `iklimco-net` overlay:
```yaml
# config/apisix/config.yaml
etcd:
  host:
    - "http://iklim-db-01:2379"
    - "http://iklim-db-02:2379"
    - "http://iklim-db-03:2379"
    - "http://etcd-01:2379"
    - "http://etcd-02:2379"
    - "http://etcd-03:2379"
  prefix: "/apisix"
  timeout: 30
```
### Firewall requirement
### Disable standalone etcd in prod overlay
etcd access from app nodes to DB nodes must be open:
Docker Swarm overlay files cannot delete services from the base stack, but `replicas: 0` stops the container entirely:
```bash
# Each app node → each db node, port 2379
# If inside Hetzner private network it may be open by default;
# verify there are no ufw/firewalld rules blocking it:
nc -zv iklim-db-01 2379
```yaml
# docker-stack-infra.prod.yml
services:
  etcd:
    deploy:
      replicas: 0
```
> **Note:** Docker Compose overlay files can only add/override services, not remove them. The standalone `etcd` service remains in the base stack and runs as an idle container in prod — APISIX connects to Patroni etcd instead (via config.yaml in the prod overlay). This is harmless; etcd uses negligible resources with no active clients.
### Firewall requirement
etcd access from app nodes to DB nodes must be open (port 2379, app subnet → DB subnet). Verify from an app node:
```bash
docker run --rm --network iklimco-net alpine \
sh -c "wget -qO- http://etcd-01:2379/health"
```
## Step 5 — Redis: Sentinel cluster (prod overlay)
@ -473,7 +481,7 @@ nginx_config:
## Step 8 — Create `docker-stack-infra.prod.yml`
Create this file in the repo root alongside `docker-stack-infra.yml`. It combines all prod-specific overrides from Steps 2-6:
Create this file in the repo root alongside `docker-stack-infra.yml`. It combines all prod-specific overrides from Steps 2-6 (including disabling the standalone `etcd` from Step 4):
```yaml
# docker-stack-infra.prod.yml
@ -614,6 +622,27 @@ services:
labels:
project: co.iklim
  # ── Disabled in prod ─────────────────────────────────────────────────────────
  etcd:
    deploy:
      replicas: 0

  postgresql:
    deploy:
      replicas: 0

  mongodb:
    deploy:
      replicas: 0

  pg-proxy:
    deploy:
      replicas: 0

  mongo-proxy:
    deploy:
      replicas: 0
secrets:
rabbitmq_erlang_cookie:
external: true
@ -647,11 +676,7 @@ Test uses the named Docker volume fallback (`grafana-vl`) for Grafana, and Prome
GRAFANA_DATA_DIR=/mnt/storagebox/grafana/data
```
**Create directories on StorageBox before first prod deploy:**
```bash
mkdir -p /mnt/storagebox/grafana/data
```
> `/mnt/storagebox/grafana/data` is created automatically by the Ansible `storagebox` role during bootstrap via the `storagebox_managed_directories` variable. No manual step required.
> Grafana writes its SQLite database and dashboard JSON to `/var/lib/grafana`.
> Prometheus writes its TSDB to `/prometheus` on the local `prometheus-vl` Docker volume; it is not shared between nodes.
@ -666,7 +691,7 @@ docker stack config -c docker-stack-infra.yml > /dev/null && echo "base OK"
docker stack config -c docker-stack-infra.yml -c docker-stack-infra.prod.yml > /dev/null && echo "prod merge OK"
```
## Step 9 — Database Proxies and Developer Access
## Step 11 — Database Proxies and Developer Access
In the production environment, the `pg-proxy` and `mongo-proxy` services (socat-based) defined in the base `docker-stack-infra.yml` are **deprecated and will not be used**.
@ -691,9 +716,14 @@ In the production environment, the `pg-proxy` and `mongo-proxy` services (socat-
| redis-replica | prod overlay | 2 | `node.labels.type == service`; max 1/node | Sentinel replica; spread:hostname |
| redis-sentinel | prod overlay | 3 | `node.labels.type == service`; max 1/node | Quorum=2; failover automatic |
| rabbitmq | prod overlay | 3 | `node.labels.type == service`; max 1/node | Erlang cluster; quorum queues |
| etcd | base | 1 | `node.labels.type == service` | Idle in prod — APISIX uses Patroni etcd; standalone service remains in base stack |
| etcd | prod overlay | 0 | — | Disabled (`replicas: 0`); APISIX uses Patroni etcd on DB nodes |
| postgresql | prod overlay | 0 | — | Disabled (`replicas: 0`); Patroni HA runs as `iklim-db` stack on DB nodes; port 5432 conflict |
| mongodb | prod overlay | 0 | — | Disabled (`replicas: 0`); MongoDB replica set runs as `iklim-db` stack on DB nodes; port 27017 conflict |
| pg-proxy | prod overlay | 0 | — | Deprecated; microservices use multi-host JDBC with native Patroni failover |
| mongo-proxy | prod overlay | 0 | — | Deprecated; microservices use multi-host MongoClient with native replica set failover |
| prometheus | base | 1 | `node.labels.type == service` | No native HA; Thanos is overkill at this scale |
| grafana | base | 1 | `node.labels.type == service` | Not critical |
> PostgreSQL and MongoDB run in separate DB stacks on `iklim-db-*` nodes. See `08-prod-db-cluster-kurulum.md`.
> etcd: 3-node cluster on DB nodes — APISIX shares it via `/apisix` prefix.
> Disabled services (`replicas: 0`) are removed from `docker service ls` by a post-deploy step in `deploy-prod.yml`.
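As an illustration of how a post-deploy step might find the disabled services, the sketch below filters `docker service ls` output for services whose desired replica count is zero. The contents of `deploy-prod.yml` are not shown in this change, so the function name `zero_replica_services` and the `iklimco_` service-name prefix in the usage comment are assumptions, not the pipeline's actual step.

```bash
#!/usr/bin/env bash
# Illustrative sketch, not the actual deploy-prod.yml step.
# Reads "NAME RUNNING/DESIRED ..." lines on stdin and prints the names
# of services whose DESIRED replica count is 0.
zero_replica_services() {
  awk '{ split($2, r, "/"); if (r[2] == "0") print $1 }'
}

# Usage on a Swarm manager (service names are hypothetical):
#   docker service ls --format '{{.Name}} {{.Replicas}}' | zero_replica_services
#   -> one candidate per line, e.g. iklimco_etcd, for a later `docker service rm`
```

Keeping the filter separate from the removal makes it possible to review the candidate list before anything is deleted.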

View File

@ -115,7 +115,7 @@ APISIX reads its entire configuration from etcd; init script will fail silently
echo "⏳ Waiting for Patroni etcd..."
for i in $(seq 1 30); do
if docker run --rm --network iklimco-net alpine \
sh -c "wget -qO- http://iklim-db-01:2379/health 2>/dev/null | grep -q '\"health\":\"true\"'"; then
sh -c "wget -qO- http://etcd-01:2379/health 2>/dev/null | grep -q '\"health\":\"true\"'"; then
echo "✅ Patroni etcd ready"
break
fi
@ -125,7 +125,7 @@ APISIX reads its entire configuration from etcd; init script will fail silently
done
```
> **Note:** In prod, APISIX uses the 3-node Patroni etcd cluster on DB nodes (`iklim-db-01/02/03:2379`) via the `/apisix` prefix — configured in `config.yaml` mounted by the prod overlay. The standalone `etcd` service from the base stack remains idle. This step waits for Patroni etcd (`iklim-db-01:2379`) to be healthy before running the APISIX init script.
> **Note:** In prod, APISIX uses the 3-node Patroni etcd cluster on DB nodes (`etcd-01/02/03:2379`) via the `/apisix` prefix — resolved through `iklimco-net` overlay DNS aliases defined in `docker-stack-db.prod.yml`. The standalone `etcd` service from the base stack is disabled (`replicas: 0` in the prod overlay) and removed from the service list by a post-deploy step. This step waits for Patroni etcd (`etcd-01:2379`) to be healthy before running the APISIX init script.
## Step 5 — Add `Run APISIX Init` step
@ -308,7 +308,7 @@ With `cancel-in-progress: false`, a new run waits in the queue until the previou
14. **Prepare SWAG Directories** ← NEW (`$SWAG_CONFIG_DIR/dns-conf`; renders nginx conf templates)
15. Bootstrap Vault TLS Placeholder
16. Deploy Swarm Stack
17. **Wait for etcd** ← NEW (Patroni etcd `iklim-db-01:2379`)
17. **Wait for etcd** ← NEW (Patroni etcd `etcd-01:2379` overlay DNS)
18. **Run APISIX Init** ← NEW (`SPRING_PROFILES_ACTIVE=prod`)
19. **Bootstrap SWAG Certificate** ← NEW
20. **Run Database Init Scripts** ← NEW (`postgresql`, `mongodb`)

View File

@ -67,7 +67,7 @@ Minimum variables:
hcloud_token = "secret"
location = "fsn1"
image = "rocky-10"
server_type_swarm = "cpx42"
server_type_app = "cpx42"
server_type_db = "cpx32"
admin_ssh_public_key_path = "~/.ssh/id_ed25519.pub"
admin_allowed_cidrs = ["X.X.X.X/32"]
@ -86,7 +86,7 @@ The server type decision was made by considering the current test environment me
| `iklim-db-02` | `10.20.20.12` | Manual DB cluster node |
| `iklim-db-03` | `10.20.20.13` | Manual DB cluster node |
Private IPs are statically defined inside `locals.tf` as the `swarm_private_ips` and `db_private_ips` maps. The server list is derived from these maps with `for_each`.
Private IPs are statically defined inside `locals.tf` as the `app_private_ips` and `db_private_ips` maps. The server list is derived from these maps with `for_each`.
## Recommended Resources and Cost
@ -211,8 +211,8 @@ The following values can be obtained after `terraform apply` or `terraform outpu
| Output | Description |
| --- | --- |
| `ansible_inventory_yaml` | Ansible inventory YAML — written to `ansible/inventory/generated/prod.yml` |
| `prod_private_ips` | Private IP map of all nodes, with `swarm` and `db` subkeys |
| `ansible_inventory_yaml` | Ansible inventory YAML — written to `ansible/prod/inventory/generated/prod.yml` |
| `prod_private_ips` | Private IP map of all nodes, with `app` and `db` subkeys |
| `prod_public_ips` | Public IPv4 map of all nodes |
| `prod_floating_ip` | Floating IP address for the Swarm entry point; DNS A record points to this IP |
@ -220,7 +220,7 @@ To extract the Ansible inventory:
```bash
terraform output -raw ansible_inventory_yaml > \
../../ansible/inventory/generated/prod.yml
../../../ansible/prod/inventory/generated/prod.yml
```
## Lifecycle and Resize Policy
@ -290,7 +290,7 @@ After `apply`, 6 servers, 2 firewalls, 1 floating IP, and network resources are
```bash
terraform output -raw ansible_inventory_yaml > \
../../ansible/inventory/generated/prod.yml
../../../ansible/prod/inventory/generated/prod.yml
```
### Gitea Variable: `PROD_FLOATING_IP`
@ -305,7 +305,7 @@ Add the resulting IP address in Gitea -> project settings -> **Variables** with
### Resize (Change Server Type)
Change the `server_type_swarm` or `server_type_db` value inside `terraform.tfvars`:
Change the `server_type_app` or `server_type_db` value inside `terraform.tfvars`:
```bash
terraform apply
@ -326,7 +326,7 @@ Because `prevent_destroy = true` exists, normal `terraform destroy` fails. First
Then:
```bash
terraform destroy -target=hcloud_server.swarm["iklim-app-01"]
terraform destroy -target=hcloud_server.app["iklim-app-01"]
```
After completing the operation, add the lifecycle block back.

View File

@ -256,7 +256,7 @@ Applied to `iklim-app-*` nodes. Gitea Act Runner is installed on each app node a
## DB Stack Role
Applied to `iklim-db-*` nodes. On each DB node, it creates `/opt/iklimco/db` and `/opt/iklimco/backup` directories, as well as a local reference directory for MongoDB. The actual production configuration, including node-specific `mongod.conf`, replica set auth key, Patroni, and etcd configurations, is set up on StorageBox at `/mnt/storagebox/prod/db/mongodb-0X/config/`, `/mnt/storagebox/prod/db/postgresql-0X/config/`, and `/mnt/storagebox/prod/db/etcd-0X/data/` in the `08-prod-db-cluster-kurulum.md` step.
Applied to `iklim-db-*` nodes. On each DB node, it creates `/opt/iklimco/db` and `/opt/iklimco/backup` directories, as well as a local reference directory for MongoDB. The actual production configuration, including node-specific `mongod.conf`, replica set auth key, and Patroni configurations, is set up on StorageBox at `/mnt/storagebox/db/mongodb-0X/config/` and `/mnt/storagebox/db/postgresql-0X/config/` in the `08-prod-db-cluster-kurulum.md` step. etcd data is stored on local Docker named volumes (not StorageBox).
## /opt/iklimco/stacks/.env
@ -270,23 +270,15 @@ chmod 600 /opt/iklimco/stacks/.env
## StorageBox Directory Structure
After Ansible bootstrap is completed and before the infra stack is deployed, create the following directories on `iklim-app-01`; StorageBox must be mounted:
The `storagebox` Ansible role creates the following directories **automatically** during bootstrap, driven by `storagebox_managed_directories` (`group_vars/all/vars.yml`). No manual step is required:
```bash
# SWAG certificate and configuration directories
mkdir -p /mnt/storagebox/ssl
mkdir -p /mnt/storagebox/swag/config
mkdir -p /mnt/storagebox/swag/site-confs
- `/mnt/storagebox/ssl` → `SWAG_CERT_DIR`
- `/mnt/storagebox/swag/config` → `SWAG_CONFIG_DIR`
- `/mnt/storagebox/swag/site-confs` → `SWAG_SITE_CONFS_DIR`
- `/mnt/storagebox/grafana/data` → `GRAFANA_DATA_DIR`
- `/mnt/storagebox/precipitation/images`
# Monitoring data directories; Grafana on StorageBox, Prometheus on local volume
mkdir -p /mnt/storagebox/grafana/data
mkdir -p /mnt/storagebox/prometheus/data
# Image directory for the precipitation service
mkdir -p /mnt/storagebox/precipitation/images
```
These directories match the `SWAG_CERT_DIR`, `SWAG_CONFIG_DIR`, `SWAG_SITE_CONFS_DIR`, `GRAFANA_DATA_DIR`, and `PROMETHEUS_DATA_DIR` variables in `env-prod/.env`. Because StorageBox is mounted at the same `/mnt/storagebox` path on all app nodes, these directories are created only once and all nodes access them commonly.
Because StorageBox is mounted at `/mnt/storagebox` on all app nodes, the directories are created only once and all nodes share them. Prometheus uses a local Docker named volume, not StorageBox.
## Swarm Setup Verification
@ -319,7 +311,7 @@ grep -n "swarm init\|swarm join" init/swarm-init.sh
- `swarm-init.sh` does not attempt init again in an active Swarm; it is idempotent.
- `/mnt/storagebox` is mounted on every node.
- The `/opt/iklimco/vault/data` directory exists on every app node.
- The `ssl`, `swag/config`, `swag/site-confs`, `grafana/data`, `prometheus/data`, and `precipitation/images` directories exist on StorageBox.
- The `ssl`, `swag/config`, `swag/site-confs`, `grafana/data`, and `precipitation/images` directories exist on StorageBox.
- The Gitea Act Runner service is running on every app node.
- `/opt/iklimco/db` and `/opt/iklimco/backup` directories exist on DB nodes. Node-specific `mongod.conf` and other DB configurations are created on StorageBox (`/mnt/storagebox/prod/db/...`) in the `08-prod-db-cluster-kurulum.md` step.
- `/opt/iklimco/db` and `/opt/iklimco/backup` directories exist on DB nodes. Node-specific `mongod.conf` and other DB configurations are created on StorageBox (`/mnt/storagebox/db/...`) in the `08-prod-db-cluster-kurulum.md` step.
- Public firewall allows only `22`, `80`, and `443` ingress.

View File

@ -27,13 +27,13 @@ iklim-db-03 (Swarm worker, 10.20.20.13)
patroni-03 [Patroni + PostgreSQL — standby]
```
DB containers discover each other through **Hetzner private IPs**, not overlay DNS names. Therefore, each service publishes its port in `host` mode; replication and etcd traffic goes directly through the private network. The Hetzner Cloud firewall and the prod `db` firewall already allow these ports.
DB containers discover each other through **overlay DNS aliases** (`mongodb-01`, `etcd-01`, `patroni-01`, etc.) on the shared `iklimco-net` overlay network. Each service publishes its port in `host` mode so replication traffic goes directly through the Hetzner private network while the overlay DNS resolves service names correctly. All containers are defined in the single `docker-stack-db.prod.yml` stack file at the repo root.
## 1. Firewall Update
Verify that the following rules exist in `terraform/hetzner/prod/firewall.tf`; if any are missing, add them and run `terraform apply`.
Inside `hcloud_firewall.swarm`, from the DB subnet to Swarm ports:
Inside `hcloud_firewall.app`, from the DB subnet to Swarm ports:
```hcl
rule {
@ -169,30 +169,29 @@ docker node ls
## 3. StorageBox Directory Structure
On each DB node, where `/mnt/storagebox` must already be mounted:
DB data and logs are stored on **local Docker named volumes** (performance, WAL/compaction requirements). Only config files are placed on StorageBox. On each DB node, where `/mnt/storagebox` must already be mounted:
```bash
# On iklim-db-01:
mkdir -p /mnt/storagebox/prod/db/mongodb-01/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-01/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-01/data
mkdir -p /mnt/storagebox/db/mongodb-01/config
mkdir -p /mnt/storagebox/db/postgresql-01/config
# On iklim-db-02:
mkdir -p /mnt/storagebox/prod/db/mongodb-02/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-02/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-02/data
mkdir -p /mnt/storagebox/db/mongodb-02/config
mkdir -p /mnt/storagebox/db/postgresql-02/config
# On iklim-db-03:
mkdir -p /mnt/storagebox/prod/db/mongodb-03/{data,log,config}
mkdir -p /mnt/storagebox/prod/db/postgresql-03/{data,config}
mkdir -p /mnt/storagebox/prod/db/etcd-03/data
mkdir -p /mnt/storagebox/db/mongodb-03/config
mkdir -p /mnt/storagebox/db/postgresql-03/config
```
Config files (`mongod.conf`, `patroni.yml`) are deployed by the Ansible `db_stack` role into these directories. Named Docker volumes (`mongodb-01-data`, `etcd-01-data`, `postgresql-01-data`, etc.) are created automatically by the stack deploy.
## 4. MongoDB Replica Set
### mongod.conf
On each DB node, `/mnt/storagebox/prod/db/mongodb-0X/config/mongod.conf`:
On each DB node, `/mnt/storagebox/db/mongodb-0X/config/mongod.conf` (deployed by the Ansible `db_stack` role):
```yaml
net:
@ -221,122 +220,64 @@ The **same** key file must exist on all DB nodes:
```bash
# Create on iklim-db-01:
openssl rand -base64 756 > /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key
chmod 400 /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key
openssl rand -base64 756 > /mnt/storagebox/db/mongodb-01/config/rs-auth.key
chmod 400 /mnt/storagebox/db/mongodb-01/config/rs-auth.key
# Copy the same content to the other nodes:
cat /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key \
> /mnt/storagebox/prod/db/mongodb-02/config/rs-auth.key
cat /mnt/storagebox/prod/db/mongodb-01/config/rs-auth.key \
> /mnt/storagebox/prod/db/mongodb-03/config/rs-auth.key
cat /mnt/storagebox/db/mongodb-01/config/rs-auth.key \
> /mnt/storagebox/db/mongodb-02/config/rs-auth.key
cat /mnt/storagebox/db/mongodb-01/config/rs-auth.key \
> /mnt/storagebox/db/mongodb-03/config/rs-auth.key
chmod 400 /mnt/storagebox/prod/db/mongodb-0{2,3}/config/rs-auth.key
chmod 400 /mnt/storagebox/db/mongodb-0{2,3}/config/rs-auth.key
```
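Because all three members must share byte-identical key material, a quick comparison after copying can catch a truncated or stale copy. The following is a minimal sketch, assuming the StorageBox layout used above; the `check_keyfiles` function and its optional base-directory argument are illustrative names, not part of the repo.

```bash
# Hypothetical helper: verify that rs-auth.key is byte-identical in all
# three MongoDB config directories under the given base directory.
check_keyfiles() {
  local base="${1:-/mnt/storagebox/db}"   # assumed StorageBox DB root
  local ref="$base/mongodb-01/config/rs-auth.key"
  local n f
  [ -f "$ref" ] || { echo "missing: $ref" >&2; return 1; }
  for n in 02 03; do
    f="$base/mongodb-$n/config/rs-auth.key"
    # cmp -s exits non-zero when the files differ or one is missing
    cmp -s "$ref" "$f" || { echo "mismatch: $f" >&2; return 1; }
  done
  echo "rs-auth.key identical on all nodes"
}
```

Running `check_keyfiles` with no argument checks the default path; a non-zero exit status means at least one copy differs from the `mongodb-01` reference.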
### Stack File — MongoDB
`/opt/iklimco/stacks/prod-db-mongo.yml`:
MongoDB services are defined in `docker-stack-db.prod.yml` (repo root). Each service uses a named Docker volume for data and log, and a StorageBox bind mount for config:
```yaml
version: "3.8"
networks:
iklimco-net:
external: true
services:
mongodb-01:
image: mongo:8
environment:
MONGO_INITDB_ROOT_USERNAME: mongo-root
MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
volumes:
- /mnt/storagebox/prod/db/mongodb-01/data:/data/db
- /mnt/storagebox/prod/db/mongodb-01/log:/data/log
- /mnt/storagebox/prod/db/mongodb-01/config:/data/configdb
- mongodb-01-data:/data/db
- mongodb-01-log:/data/log
- /mnt/storagebox/db/mongodb-01/config:/data/configdb
networks:
- iklimco-net
iklimco-net:
aliases:
- mongodb-01
ports:
- target: 27017
published: 27017
protocol: tcp
mode: host
command: ["--config", "/data/configdb/mongod.conf"]
deploy:
replicas: 1
placement:
max_replicas_per_node: 1
constraints:
- node.hostname == iklim-db-01
restart_policy:
condition: on-failure
mongodb-02:
image: mongo:8
environment:
MONGO_INITDB_ROOT_USERNAME: mongo-root
MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
volumes:
- /mnt/storagebox/prod/db/mongodb-02/data:/data/db
- /mnt/storagebox/prod/db/mongodb-02/log:/data/log
- /mnt/storagebox/prod/db/mongodb-02/config:/data/configdb
networks:
- iklimco-net
ports:
- target: 27017
published: 27017
protocol: tcp
mode: host
command: ["--config", "/data/configdb/mongod.conf"]
deploy:
replicas: 1
placement:
constraints:
- node.hostname == iklim-db-02
restart_policy:
condition: on-failure
mongodb-03:
image: mongo:8
environment:
MONGO_INITDB_ROOT_USERNAME: mongo-root
MONGO_INITDB_ROOT_PASSWORD: "${MONGO_ROOT_PASSWORD}"
volumes:
- /mnt/storagebox/prod/db/mongodb-03/data:/data/db
- /mnt/storagebox/prod/db/mongodb-03/log:/data/log
- /mnt/storagebox/prod/db/mongodb-03/config:/data/configdb
networks:
- iklimco-net
ports:
- target: 27017
published: 27017
protocol: tcp
mode: host
command: ["--config", "/data/configdb/mongod.conf"]
deploy:
replicas: 1
placement:
constraints:
- node.hostname == iklim-db-03
restart_policy:
condition: on-failure
```
Volumes `mongodb-01-data`, `mongodb-01-log`, etc. are declared at the bottom of `docker-stack-db.prod.yml` and are created automatically on first deploy.
### Replica Set Initialization
Run **once** after the stack is deployed:
```bash
# On iklim-db-01:
docker exec -it $(docker ps -q -f name=iklim-db_mongodb-01) mongosh \
-u mongo-root -p "${MONGO_ROOT_PASSWORD}" --authenticationDatabase admin
# On iklim-app-01 (for overlay network access):
docker run --rm -it --network iklimco-net mongo:8 \
mongosh "mongodb://mongo-root:${MONGO_ROOT_PASSWORD}@mongodb-01/admin"
# Inside mongosh:
rs.initiate({
_id: "rs0",
members: [
{ _id: 0, host: "10.20.20.11:27017", priority: 2 },
{ _id: 1, host: "10.20.20.12:27017", priority: 1 },
{ _id: 2, host: "10.20.20.13:27017", priority: 1 }
{ _id: 0, host: "mongodb-01:27017", priority: 2 },
{ _id: 1, host: "mongodb-02:27017", priority: 1 },
{ _id: 2, host: "mongodb-03:27017", priority: 1 }
]
})
@ -354,7 +295,7 @@ Patroni coordinates PostgreSQL primary/standby roles through etcd. If the primar
Patroni is installed on top of the `postgis/postgis:17-3.5` image. This image is pushed to Harbor and used in the stack.
`Environment_Infrastructure/docker/patroni-postgis/Dockerfile`:
`build/patroni-postgis/Dockerfile`:
```dockerfile
FROM postgis/postgis:17-3.5
@ -376,138 +317,62 @@ USER postgres
ENTRYPOINT ["patroni", "/etc/patroni/patroni.yml"]
```
Build and push; this is done with `ops/push-harbor-custom-images.sh`, or run the commands below:
Build and push is done with `ops/push-harbor-custom-images.sh`:
```bash
cd Environment_Infrastructure/docker/patroni-postgis
docker build -t registry.tarla.io/iklimco/patroni-postgis:17-3.5 .
cd /path/to/repo
bash ops/push-harbor-custom-images.sh
```
Or manually:
```bash
cd build/patroni-postgis
docker build -t registry.tarla.io/iklimco/custom-patroni-postgis:17-3.5 .
echo "$HARBOR_CI_TOKEN" | docker login registry.tarla.io -u robot-ci-push-iklimco --password-stdin
docker push registry.tarla.io/iklimco/patroni-postgis:17-3.5
docker push registry.tarla.io/iklimco/custom-patroni-postgis:17-3.5
```
### 5.2 etcd Cluster
#### Stack File — etcd
`/opt/iklimco/stacks/prod-db-etcd.yml`:
etcd services are defined in `docker-stack-db.prod.yml`. Each service uses a named Docker volume for data and has an overlay DNS alias. Environment variables reference peer URLs by alias, not by hardcoded IP:
```yaml
version: "3.8"
networks:
iklimco-net:
external: true
services:
etcd-01:
image: bitnami/etcd:3
environment:
ALLOW_NONE_AUTHENTICATION: "yes"
ETCD_NAME: etcd-01
ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.11:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS: http://etcd-01:2380
ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.11:2379
ETCD_ADVERTISE_CLIENT_URLS: http://etcd-01:2379
ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
ETCD_INITIAL_CLUSTER: "etcd-01=http://etcd-01:2380,etcd-02=http://etcd-02:2380,etcd-03=http://etcd-03:2380"
ETCD_INITIAL_CLUSTER_STATE: new
ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
volumes:
- /mnt/storagebox/prod/db/etcd-01/data:/bitnami/etcd/data
- etcd-01-data:/bitnami/etcd/data
networks:
- iklimco-net
ports:
- target: 2379
published: 2379
protocol: tcp
mode: host
- target: 2380
published: 2380
protocol: tcp
mode: host
iklimco-net:
aliases:
- etcd-01
deploy:
replicas: 1
placement:
max_replicas_per_node: 1
constraints:
- node.hostname == iklim-db-01
restart_policy:
condition: on-failure
etcd-02:
image: bitnami/etcd:3
environment:
ALLOW_NONE_AUTHENTICATION: "yes"
ETCD_NAME: etcd-02
ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.12:2380
ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.12:2379
ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
ETCD_INITIAL_CLUSTER_STATE: new
ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
volumes:
- /mnt/storagebox/prod/db/etcd-02/data:/bitnami/etcd/data
networks:
- iklimco-net
ports:
- target: 2379
published: 2379
protocol: tcp
mode: host
- target: 2380
published: 2380
protocol: tcp
mode: host
deploy:
replicas: 1
placement:
constraints:
- node.hostname == iklim-db-02
restart_policy:
condition: on-failure
etcd-03:
image: bitnami/etcd:3
environment:
ALLOW_NONE_AUTHENTICATION: "yes"
ETCD_NAME: etcd-03
ETCD_INITIAL_ADVERTISE_PEER_URLS: http://10.20.20.13:2380
ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
ETCD_ADVERTISE_CLIENT_URLS: http://10.20.20.13:2379
ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
ETCD_INITIAL_CLUSTER: "etcd-01=http://10.20.20.11:2380,etcd-02=http://10.20.20.12:2380,etcd-03=http://10.20.20.13:2380"
ETCD_INITIAL_CLUSTER_STATE: new
ETCD_INITIAL_CLUSTER_TOKEN: iklimco-etcd-prod
volumes:
- /mnt/storagebox/prod/db/etcd-03/data:/bitnami/etcd/data
networks:
- iklimco-net
ports:
- target: 2379
published: 2379
protocol: tcp
mode: host
- target: 2380
published: 2380
protocol: tcp
mode: host
deploy:
replicas: 1
placement:
constraints:
- node.hostname == iklim-db-03
restart_policy:
condition: on-failure
```
**APISIX etcd usage:** In prod, APISIX shares this etcd cluster with the `/apisix` prefix. Patroni uses the `/service/` prefix and APISIX uses the `/apisix/` prefix, so there is no collision. APISIX configuration is managed by the `config.yaml` file in the `docker-stack-infra.prod.yml` overlay; the connection is made to `http://iklim-db-01:2379,http://iklim-db-02:2379,http://iklim-db-03:2379`. Therefore, the app subnet -> DB nodes port 2379 firewall rule is mandatory; it was added in Section 1.
**APISIX etcd usage:** In prod, APISIX shares this etcd cluster with the `/apisix` prefix. Patroni uses the `/service/` prefix and APISIX uses the `/apisix/` prefix — no collision. The overlay DNS names (`etcd-01:2379`, `etcd-02:2379`, `etcd-03:2379`) are reachable from app nodes via the `iklimco-net` overlay. Therefore, the app subnet → DB nodes port 2379 firewall rule is mandatory; it was added in Section 1.
**Important:** `ETCD_INITIAL_CLUSTER_STATE` must be `new` on the first deploy and `existing` on all later deploys. If the wrong value is left in place, the data directory is reset. The deploy steps in Section 6 below detect this automatically; no manual update is required.
**Important:** `ETCD_INITIAL_CLUSTER_STATE` must be `new` on the first deploy and `existing` on all later deploys. The deploy steps in Section 6 detect this automatically; no manual update is required.
### 5.3 Patroni Configuration
A separate `patroni.yml` file is created for each node. The only differences are the `name` and `connect_address` fields.
`patroni.yml` is generated per-node by the Ansible `db_stack` role from `templates/patroni.yml.j2` using `inventory_hostname` (e.g., `iklim-db-01`). The generated file uses overlay DNS aliases for all addresses.
**Node 01** — `/mnt/storagebox/prod/db/postgresql-01/config/patroni.yml`:
**Generated output — Node 01** (`/mnt/storagebox/db/postgresql-01/config/patroni.yml`):
```yaml
scope: iklim-postgres
@ -516,13 +381,13 @@ name: postgresql-01
restapi:
listen: 0.0.0.0:8008
connect_address: 10.20.20.11:8008
connect_address: patroni-01:8008
etcd3:
hosts:
- 10.20.20.11:2379
- 10.20.20.12:2379
- 10.20.20.13:2379
- etcd-01:2379
- etcd-02:2379
- etcd-03:2379
bootstrap:
dcs:
@ -558,7 +423,7 @@ bootstrap:
postgresql:
listen: 0.0.0.0:5432
connect_address: 10.20.20.11:5432
connect_address: patroni-01:5432
data_dir: /var/lib/postgresql/data/pgdata
pgpass: /tmp/pgpass0
authentication:
@ -578,58 +443,26 @@ tags:
nosync: false
```
**Node 02** — `/mnt/storagebox/prod/db/postgresql-02/config/patroni.yml`:
Same content as Node 01; only the following fields differ:
```yaml
name: postgresql-02
restapi:
connect_address: 10.20.20.12:8008
postgresql:
connect_address: 10.20.20.12:5432
data_dir: /var/lib/postgresql/data/pgdata
```
**Node 03** — `/mnt/storagebox/prod/db/postgresql-03/config/patroni.yml`:
```yaml
name: postgresql-03
restapi:
connect_address: 10.20.20.13:8008
postgresql:
connect_address: 10.20.20.13:5432
data_dir: /var/lib/postgresql/data/pgdata
```
For Node 02 and 03, only `name`, `restapi.connect_address`, and `postgresql.connect_address` differ (`postgresql-02`/`patroni-02:8008`/`patroni-02:5432`, etc.).
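To make the per-node differences concrete, the sketch below prints the three fields for each node index. It only illustrates the naming pattern described above; the `patroni_node_fields` function is hypothetical, and the real files are rendered by the Ansible template, not by this script.

```bash
# Hypothetical helper: print the three node-specific Patroni fields for a
# node index ("01", "02" or "03"), following the naming pattern above.
patroni_node_fields() {
  local idx="$1"
  printf 'name: postgresql-%s\n' "$idx"
  printf 'restapi.connect_address: patroni-%s:8008\n' "$idx"
  printf 'postgresql.connect_address: patroni-%s:5432\n' "$idx"
}

# Print the fields for all three nodes, separated by blank lines.
for idx in 01 02 03; do
  patroni_node_fields "$idx"
  echo
done
```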
### 5.4 Stack File — Patroni
`/opt/iklimco/stacks/prod-db-patroni.yml`:
Patroni services are defined in `docker-stack-db.prod.yml`. Each service uses the custom image, a named Docker volume for data, a StorageBox bind mount for the config file, and overlay DNS aliases:
```yaml
version: "3.8"
patroni-01:
image: registry.tarla.io/iklimco/custom-patroni-postgis:17-3.5
environment:
POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
TZ: "Europe/Istanbul"
volumes:
- postgresql-01-data:/var/lib/postgresql/data
- /mnt/storagebox/db/postgresql-01/config/patroni.yml:/etc/patroni/patroni.yml:ro
networks:
iklimco-net:
external: true
services:
patroni-01:
image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
environment:
DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
TZ: "Europe/Istanbul"
volumes:
- /mnt/storagebox/prod/db/postgresql-01/data:/var/lib/postgresql/data
- /mnt/storagebox/prod/db/postgresql-01/config/patroni.yml:/etc/patroni/patroni.yml:ro
networks:
- iklimco-net
aliases:
- patroni-01
ports:
- target: 5432
published: 5432
@ -642,103 +475,46 @@ services:
deploy:
replicas: 1
placement:
max_replicas_per_node: 1
constraints:
- node.hostname == iklim-db-01
restart_policy:
condition: on-failure
patroni-02:
image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
environment:
DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
TZ: "Europe/Istanbul"
volumes:
- /mnt/storagebox/prod/db/postgresql-02/data:/var/lib/postgresql/data
- /mnt/storagebox/prod/db/postgresql-02/config/patroni.yml:/etc/patroni/patroni.yml:ro
networks:
- iklimco-net
ports:
- target: 5432
published: 5432
protocol: tcp
mode: host
- target: 8008
published: 8008
protocol: tcp
mode: host
deploy:
replicas: 1
placement:
constraints:
- node.hostname == iklim-db-02
restart_policy:
condition: on-failure
patroni-03:
image: registry.tarla.io/iklimco/patroni-postgis:17-3.5
environment:
DATABASE_POSTGRES_ROOT_USER: "${DATABASE_POSTGRES_ROOT_USER}"
POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
REPLICATOR_PASSWORD: "${REPLICATOR_PASSWORD}"
TZ: "Europe/Istanbul"
volumes:
- /mnt/storagebox/prod/db/postgresql-03/data:/var/lib/postgresql/data
- /mnt/storagebox/prod/db/postgresql-03/config/patroni.yml:/etc/patroni/patroni.yml:ro
networks:
- iklimco-net
ports:
- target: 5432
published: 5432
protocol: tcp
mode: host
- target: 8008
published: 8008
protocol: tcp
mode: host
deploy:
replicas: 1
placement:
constraints:
- node.hostname == iklim-db-03
restart_policy:
condition: on-failure
```
Volumes `postgresql-01-data`, `postgresql-02-data`, `postgresql-03-data` are declared at the bottom of `docker-stack-db.prod.yml` and created automatically on first deploy.
### 5.5 Status Check
```bash
# On any DB node:
docker exec -it $(docker ps -q -f name=iklim-patroni_patroni-01) \
# On iklim-db-01 (docker ps only lists containers on the local node), Patroni cluster status:
docker exec -it $(docker ps -q -f name=iklim-db_patroni-01 | head -1) \
patronictl -c /etc/patroni/patroni.yml list
```
Expected output: one `Leader` row and two `Replica` rows, all with the `State` column set to `running`.
```bash
# etcd cluster health:
docker exec -it $(docker ps -q -f name=iklim-etcd_etcd-01) \
etcdctl endpoint health \
--endpoints=http://10.20.20.11:2379,http://10.20.20.12:2379,http://10.20.20.13:2379
# etcd cluster health (from app node via overlay):
docker run --rm --network iklimco-net alpine \
sh -c "wget -qO- http://etcd-01:2379/health && \
wget -qO- http://etcd-02:2379/health && \
wget -qO- http://etcd-03:2379/health"
```
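A chained check stops at the first failing endpoint, so a per-endpoint report can be easier to read. The sketch below assumes it runs in a shell inside a container attached to `iklimco-net` with `wget` available; `etcd_health_ok` and `check_etcd_cluster` are illustrative names, and the match string is the same `"health":"true"` marker used elsewhere in this document.

```bash
# Hypothetical helper: succeed if an etcd /health response body reports healthy.
etcd_health_ok() {
  printf '%s' "$1" | grep -q '"health":"true"'
}

# Query each etcd alias and report its state; returns non-zero if any
# endpoint is unreachable or unhealthy.
check_etcd_cluster() {
  local ep body failed=0
  for ep in etcd-01 etcd-02 etcd-03; do
    body=$(wget -qO- "http://$ep:2379/health" 2>/dev/null || true)
    if etcd_health_ok "$body"; then
      echo "$ep: healthy"
    else
      echo "$ep: NOT healthy"
      failed=1
    fi
  done
  return "$failed"
}
```

Calling `check_etcd_cluster` prints one line per endpoint, which makes it clear which member is failing rather than only that the chain broke.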
```bash
# Find the current primary:
docker exec -it $(docker ps -q -f name=iklim-patroni_patroni-01) \
docker exec -it $(docker ps -q -f name=iklim-db_patroni-01 | head -1) \
patronictl -c /etc/patroni/patroni.yml topology
```
## 6. Deploy
Order matters: etcd first, then the MongoDB and Patroni stacks.
All DB services (etcd, MongoDB, Patroni) are in the single `docker-stack-db.prod.yml` stack. Deploy from `iklim-app-01` in the repo working directory.
### .env File
The `/opt/iklimco/stacks/.env` file is stored on StorageBox as `prod/secrets/iklim.co/.env.stacks`. When it is created the first time, it is filled with strong passwords and uploaded to StorageBox; later deploys fetch it from there:
The `/opt/iklimco/stacks/.env` file is stored on StorageBox as `prod/secrets/iklim.co/.env.stacks`. Fetch it once before first deploy:
```bash
# On iklim-app-01, once:
scp -P 23 STORAGEBOX_USER@STORAGEBOX_USER.your-storagebox.de:prod/secrets/iklim.co/.env.stacks \
/opt/iklimco/stacks/.env
chmod 600 /opt/iklimco/stacks/.env
@ -756,36 +532,32 @@ MONGO_ROOT_PASSWORD=<strong-password>
### Deploy Steps
```bash
# On iklim-app-01 (Swarm manager):
# On iklim-app-01, in the repo working directory:
export $(cat /opt/iklimco/stacks/.env | xargs)
# Automatic ETCD_INITIAL_CLUSTER_STATE detection — 'new' on first deploy, 'existing' afterwards
ETCD_STATE="new"
if docker service ls --filter name=iklim-etcd -q 2>/dev/null | grep -q .; then
echo " etcd services exist, using 'existing' state..."
ETCD_STATE="existing"
# Automatic ETCD_INITIAL_CLUSTER_STATE detection:
DEPLOY_FILE="docker-stack-db.prod.yml"
if docker service ls --filter name=iklim-db_etcd-01 -q 2>/dev/null | grep -q .; then
echo " etcd services exist, deploying with 'existing'..."
DEPLOY_FILE=$(mktemp /tmp/docker-stack-db.XXXXXX.yml)
sed "s/ETCD_INITIAL_CLUSTER_STATE: new/ETCD_INITIAL_CLUSTER_STATE: existing/g" \
docker-stack-db.prod.yml > "$DEPLOY_FILE"
else
echo " First deploy, using 'new' state..."
echo " First deploy, using 'new' state..."
fi
sed -i \
"s/ETCD_INITIAL_CLUSTER_STATE: new/ETCD_INITIAL_CLUSTER_STATE: ${ETCD_STATE}/g; \
s/ETCD_INITIAL_CLUSTER_STATE: existing/ETCD_INITIAL_CLUSTER_STATE: ${ETCD_STATE}/g" \
/opt/iklimco/stacks/prod-db-etcd.yml
echo "✅ ETCD_INITIAL_CLUSTER_STATE=${ETCD_STATE}"
# 1. etcd cluster:
docker stack deploy \
--compose-file /opt/iklimco/stacks/prod-db-etcd.yml \
--with-registry-auth \
iklim-etcd
-c "$DEPLOY_FILE" \
iklim-db
# Wait for the etcd cluster to be ready:
[ "$DEPLOY_FILE" != "docker-stack-db.prod.yml" ] && rm -f "$DEPLOY_FILE"
# Wait for etcd cluster to be ready:
echo "⏳ Waiting for etcd..."
for i in $(seq 1 18); do
if docker exec $(docker ps -q -f name=iklim-etcd_etcd-01 | head -1) \
etcdctl endpoint health \
--endpoints=http://10.20.20.11:2379,http://10.20.20.12:2379,http://10.20.20.13:2379 \
2>/dev/null | grep -q "is healthy"; then
if docker run --rm --network iklimco-net alpine \
sh -c "wget -qO- http://etcd-01:2379/health 2>/dev/null | grep -q '\"health\":\"true\"'"; then
echo "✅ etcd ready"
break
fi
@ -794,45 +566,42 @@ for i in $(seq 1 18); do
sleep 10
done
# 2. MongoDB:
docker stack deploy \
--compose-file /opt/iklimco/stacks/prod-db-mongo.yml \
--with-registry-auth \
iklim-db
# 3. Patroni (PostgreSQL):
docker stack deploy \
--compose-file /opt/iklimco/stacks/prod-db-patroni.yml \
--with-registry-auth \
iklim-patroni
docker stack services iklim-etcd
docker stack services iklim-db
docker stack services iklim-patroni
```
### DB Node Placement Check
```bash
docker service ps iklim-db_etcd-01
docker service ps iklim-db_mongodb-01
docker service ps iklim-db_patroni-01
```
All tasks must run on the expected `iklim-db-*` nodes.
### MongoDB Replica Set Initialization
Run once after the MongoDB stack is deployed:
Run once after the stack is deployed:
```bash
docker exec -it $(docker ps -q -f name=iklim-db_mongodb-01) mongosh \
-u mongo-root -p "${MONGO_ROOT_PASSWORD}" --authenticationDatabase admin
# From iklim-app-01 via overlay network:
docker run --rm -it --network iklimco-net mongo:8 \
mongosh "mongodb://mongo-root:${MONGO_ROOT_PASSWORD}@mongodb-01/admin"
# Inside mongosh:
rs.initiate({
_id: "rs0",
members: [
{ _id: 0, host: "10.20.20.11:27017", priority: 2 },
{ _id: 1, host: "10.20.20.12:27017", priority: 1 },
{ _id: 2, host: "10.20.20.13:27017", priority: 1 }
{ _id: 0, host: "mongodb-01:27017", priority: 2 },
{ _id: 1, host: "mongodb-02:27017", priority: 1 },
{ _id: 2, host: "mongodb-03:27017", priority: 1 }
]
})
```
## 7. Access from App Services
App containers connect to DB services through the `iklimco-net` overlay network **by Swarm DNS name**. Because the MongoDB stack (`iklim-db`) and Patroni stack (`iklim-patroni`) share the `iklimco-net` external network, service names are resolved through overlay DNS.
App containers connect to DB services through the `iklimco-net` overlay network by **overlay DNS name**. Because the `iklim-db` stack shares the `iklimco-net` external network, service names and aliases are resolved through overlay DNS.
### MongoDB Replica Set Connection String
@ -873,15 +642,13 @@ postgresql://<user>@patroni-01:5432,patroni-02:5432,patroni-03:5432/<db>?targetS
> For direct testing, from outside the overlay with private IP:
> `postgresql://postgres@10.20.20.11:5432,10.20.20.12:5432,10.20.20.13:5432/postgres?targetServerType=primary`
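With `targetServerType=primary`, the PostgreSQL JDBC driver tries the listed hosts in order and uses the one that is currently the primary. libpq-based clients such as `psql` do not accept this parameter; they use `target_session_attrs=read-write` for the same purpose.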
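For application configuration, the multi-host URL can be assembled from the list of Patroni aliases instead of being typed by hand. This is a minimal sketch: `pg_jdbc_url` is a hypothetical helper, port 5432 is taken from the stack definition above, and the database name is passed in by the caller.

```bash
# Hypothetical helper: build a multi-host JDBC URL for the Patroni nodes.
# Usage: pg_jdbc_url DB HOST [HOST...]
pg_jdbc_url() {
  local db="$1" hosts="" h
  shift
  for h in "$@"; do
    # Append "host:5432", comma-separated after the first entry
    hosts="${hosts:+$hosts,}$h:5432"
  done
  printf 'jdbc:postgresql://%s/%s?targetServerType=primary\n' "$hosts" "$db"
}

pg_jdbc_url postgres patroni-01 patroni-02 patroni-03
```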
### Patroni REST API
Patroni exposes an HTTP endpoint on port 8008. This endpoint can be used with HAProxy or a similar load balancer to route to the primary automatically:
```bash
# Primary check (HTTP 200 = primary, HTTP 503 = replica):
curl -s http://10.20.20.11:8008/primary
curl -s http://patroni-01:8008/primary
```
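The status-code convention can be wrapped in a small loop that reports the role of every node. The sketch below assumes `curl` is available and that the `patroni-0X` aliases resolve, which means running it from a container attached to `iklimco-net`; `patroni_role` and `report_roles` are illustrative names.

```bash
# Map the HTTP status of GET /primary to a role, following the convention
# above (200 = primary, 503 = replica); anything else is reported as unknown.
patroni_role() {
  case "$1" in
    200) echo primary ;;
    503) echo replica ;;
    *)   echo unknown ;;
  esac
}

# Query each Patroni node and print its role. An unreachable node yields
# status 000 from curl and is therefore reported as unknown.
report_roles() {
  local h code
  for h in patroni-01 patroni-02 patroni-03; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 \
      "http://$h:8008/primary" || true)
    echo "$h: $(patroni_role "$code")"
  done
}
```

Calling `report_roles` should print exactly one `primary` line in a healthy cluster; more than one, or none, is worth investigating with `patronictl list`.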
## 8. Developer and Office Access (Production)
@ -902,14 +669,12 @@ Modern veritabanı araçları (DBeaver, Compass vb.) küme farkındalıklı bağ
## Acceptance Criteria
- `docker stack services iklim-etcd` — three services `1/1`
- `docker stack services iklim-db` — three MongoDB services `1/1`
- `docker stack services iklim-patroni` — three Patroni services `1/1`
- In the output of `docker service ps iklim-patroni_patroni-01`, `patroni-02`, and `patroni-03`, every task runs on an `iklim-db-*` node through the `role=db` placement constraint.
- In the output of `docker service ps iklim-db_mongodb-01`, `mongodb-02`, and `mongodb-03`, every task runs on an `iklim-db-*` node.
- In the output of `docker service ps iklim-etcd_etcd-01`, `etcd-02`, and `etcd-03`, every task runs on an `iklim-db-*` node.
- `docker stack services iklim-db` — 9 services visible (etcd-01/02/03, mongodb-01/02/03, patroni-01/02/03), all `1/1`
- `docker service ps iklim-db_patroni-01/02/03` — each task runs on its expected `iklim-db-*` node
- `docker service ps iklim-db_mongodb-01/02/03` — each task runs on its expected `iklim-db-*` node
- `docker service ps iklim-db_etcd-01/02/03` — each task runs on its expected `iklim-db-*` node
- `patronictl list` — 1 `Leader`, 2 `Replica`, all `running`
- `etcdctl endpoint health` — three endpoints `healthy`
- etcd health endpoint returns `"health":"true"` on all three nodes via overlay
- `rs.status()` — 1 PRIMARY, 2 SECONDARY
- MongoDB and PostgreSQL are reachable from app nodes.
- Ports `5432`, `27017`, `2379`, `2380`, and `8008` are closed from the public internet.
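
The first criterion can be checked mechanically by parsing the service list. This is a sketch under stated assumptions: `all_db_services_up` is a hypothetical helper that reads `NAME REPLICAS` pairs on stdin, as produced by `docker stack services iklim-db --format '{{.Name}} {{.Replicas}}'`, and it treats exactly nine services at `1/1` as success.

```bash
# Hypothetical helper: read "NAME REPLICAS ..." lines on stdin and succeed
# only if there are exactly 9 services and each one reports 1/1.
# Only the second field is compared, so trailing text after the replica
# count (if any) is ignored.
all_db_services_up() {
  awk '
    { total++ }
    $2 != "1/1" { bad++; print "not ready: " $1 > "/dev/stderr" }
    END { exit !(total == 9 && bad == 0) }
  '
}
```

Piping the `docker stack services` output into `all_db_services_up` returns a non-zero status and names the affected services when the criterion is not met.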

@ -34,12 +34,10 @@ Shared labels on all prod runners:
```text
prod-runner
docker
swarm-manager
ubuntu-24.04
```
Node-specific labels:
Node-specific labels (hostname of each app node):
```text
iklim-app-01
@ -208,7 +206,7 @@ All prod deploy workflows, including infra and microservices, must use the same
| 14 | **Prepare SWAG Directories** * | `$SWAG_CONFIG_DIR/dns-conf`; renders nginx conf templates; reloads running SWAG |
| 15 | Bootstrap Vault TLS Placeholder | |
| 16 | Deploy Swarm Stack | base + prod overlay together |
| 17 | **Wait for etcd** * | Waits until Patroni etcd (`iklim-db-01:2379`) is healthy |
| 17 | **Wait for etcd** * | Waits until Patroni etcd (`etcd-01:2379`) is healthy |
| 18 | **Run APISIX Init** * | `SPRING_PROFILES_ACTIVE=prod`; idempotent; writes to etcd |
| 19 | **Bootstrap SWAG Certificate** * | Waits for SWAG to obtain the cert; copies it to `SWAG_CERT_DIR` |
| 20 | **Run Database Init Scripts** * | `postgresql`/`mongodb` Swarm VIP; SQL+JS init; idempotent |
@ -646,7 +644,7 @@ Expected: valid JSON weather response.
- `prod/secrets/iklim.co/.env.secrets.swag` exists on StorageBox and contains valid GoDaddy credentials.
- `PROD_FLOATING_IP` project variable is defined in Gitea.
- `redis_password` and `rabbitmq_erlang_cookie` appear in `docker secret ls`.
- The `ssl`, `swag/config`, `swag/site-confs`, `grafana/data`, `prometheus/data`, and `precipitation/images` directories exist on StorageBox; see `07-prod-ansible-bootstrap.md` — StorageBox Directory Structure.
- The `ssl`, `swag/config`, `swag/site-confs`, `grafana/data`, and `precipitation/images` directories exist on StorageBox; see `07-prod-ansible-bootstrap.md` — StorageBox Directory Structure.
- The `swag/site-confs/default.conf`, `api.conf.tpl`, `apigw.conf.tpl`, `rabbitmq.conf.tpl`, and `grafana.conf.tpl` template files exist in the repo.
- StorageBox `prod/secrets/iklim.co/.env.prod` has correct values for `API_SUBDOMAIN`, `APIGW_SUBDOMAIN`, `RABBITMQ_SUBDOMAIN`, `GRAFANA_SUBDOMAIN`, `RESTRICTED_IPS`, `SWAG_CERT_DIR`, `SWAG_CONFIG_DIR`, and `SWAG_SITE_CONFS_DIR`.
- After the first deploy, `docker exec $(docker ps -q -f name=iklimco_swag) nginx -t` succeeds and returns `syntax is ok`.
@ -655,6 +653,7 @@ Expected: valid JSON weather response.
- The `registry.tarla.io/iklimco/custom-apisix:3.12.0` image exists in Harbor and its `config.yaml` contains `set_real_ip_from: 10.0.0.0/8` configuration.
- After the first deploy, real client IP appears in APISIX access logs, not the SWAG overlay IP: `docker exec $(docker ps -q -f name=iklimco_apisix | head -1) tail -5 /usr/local/apisix/logs/access.log`
- `docker service ps iklimco_cert-reloader` shows that the service is running.
- `docker service ls` does not contain `iklimco_etcd`, `iklimco_postgresql`, `iklimco_mongodb`, `iklimco_pg-proxy`, or `iklimco_mongo-proxy`; they are removed by the post-deploy step in `deploy-prod.yml` (base stack services superseded by the `iklim-db` stack or deprecated in prod).
- The output of `docker service logs iklimco_cert-reloader --tail 20` contains `[cert-reloader] started` and has no error lines.
- The `notAfter` date of the Vault TLS endpoint certificate matches `/mnt/storagebox/ssl/STAR.iklim.co.full.crt`: `docker exec $(docker ps -q -f name=iklimco_vault | head -1) sh -c 'echo | openssl s_client -connect vault.iklim.co:8200 2>/dev/null | openssl x509 -noout -dates'`
- `vault operator raft list-peers` returns 3 peers: 1 leader, 2 followers.

terraform/hetzner/README.md Normal file
@ -0,0 +1,186 @@
# Terraform: iklim.co Hetzner Cloud Infrastructure
This directory manages the Hetzner Cloud infrastructure of the iklim.co test and prod environments with Terraform.
## Directory Structure
```text
terraform/hetzner/
test/ - test environment: 1 app + 1 db node, single-node Swarm
prod/ - prod environment: 3 app + 3 db nodes, 3-manager HA Swarm
```
Each environment has its own independent Terraform state file; the two do not affect each other.
## Environment Comparison
| Feature | Test | Prod |
| --- | --- | --- |
| App node count | 1 (`iklim-app-01`) | 3 (`iklim-app-01/02/03`) |
| DB node count | 1 (`iklim-db-01`) | 3 (`iklim-db-01/02/03`) |
| App server type | `cpx42` | `cpx42` |
| DB server type | `cpx42` | `cpx32` |
| Swarm architecture | Single-node | 3-manager HA |
| App subnet | `10.10.10.0/24` | `10.20.10.0/24` |
| DB subnet | `10.10.20.0/24` | `10.20.20.0/24` |
| Floating IP | Yes | Yes |
| Placement group | None | Spread (different physical hosts) |
## Prerequisites
- Terraform >= 1.5 must be installed.
- An environment-specific Hetzner Cloud project API token must be available (test and prod are separate projects).
- The `terraform.tfvars` file must have been created (see `terraform.tfvars.example`).
## terraform.tfvars Setup
```bash
# For test:
cd terraform/hetzner/test
cp terraform.tfvars.example terraform.tfvars
# For prod:
cd terraform/hetzner/prod
cp terraform.tfvars.example terraform.tfvars
```
Variables in `terraform.tfvars`:
| Variable | Description |
| --- | --- |
| `hcloud_token` | Environment-specific Hetzner Cloud project API token |
| `location` | Server location (e.g. `fsn1`) |
| `image` | Server operating system image (e.g. `rocky-10`) |
| `server_type_app` | App server type |
| `server_type_db` | DB server type |
| `admin_ssh_public_key_path` | Path to the admin SSH public key file |
| `admin_allowed_cidrs` | List of CIDRs allowed SSH access |
> `terraform.tfvars` contains sensitive information; it is not committed to the repository.
## Terraform Commands
All commands must be run from the relevant environment directory:
```bash
# Test:
cd terraform/hetzner/test
# Prod:
cd terraform/hetzner/prod
```
### Initialization
```bash
terraform init
```
### Previewing changes
```bash
terraform plan
```
### Applying
```bash
terraform apply
```
### Removing resources
```bash
terraform destroy
```
## Ansible Inventory Generation
After `terraform apply` completes, the Ansible inventory is generated with the following commands:
```bash
# Test inventory:
cd terraform/hetzner/test
terraform output -raw ansible_inventory_yaml \
> ../../../ansible/test/inventory/generated/test.yml
# Prod inventory:
cd terraform/hetzner/prod
terraform output -raw ansible_inventory_yaml \
> ../../../ansible/prod/inventory/generated/prod.yml
```
The inventory must be regenerated whenever a server is added or removed or an IP address changes.
## Outputs
### Test
| Output | Description |
| --- | --- |
| `ansible_inventory_yaml` | Ansible inventory YAML |
| `test_private_ips` | Map of node private IPs |
| `test_public_ips` | Map of node public IPv4 addresses |
| `test_floating_ip` | Floating IP of the Swarm entry point |
### Prod
| Output | Description |
| --- | --- |
| `ansible_inventory_yaml` | Ansible inventory YAML |
| `prod_private_ips` | Map of node private IPs (with `app` and `db` subkeys) |
| `prod_public_ips` | Map of node public IPv4 addresses |
| `prod_floating_ip` | Floating IP of the Swarm entry point; the DNS A record points to this IP |
```bash
# To view the floating IP:
terraform output prod_floating_ip # or test_floating_ip
```
## Güvenlik Duvarı Özeti
### App Firewall (her iki ortam)
| Port | Protokol | Kaynak | Açıklama |
| --- | --- | --- | --- |
| `22` | TCP | `admin_allowed_cidrs` | SSH |
| `80` | TCP | 0.0.0.0/0 | HTTP (SWAG) |
| `443` | TCP | 0.0.0.0/0 | HTTPS (SWAG) |
| `2377` | TCP | DB subnet | Docker Swarm control plane |
| `7946` | TCP/UDP | DB subnet | Docker Swarm node discovery |
| `4789` | UDP | DB subnet | Docker Swarm VXLAN overlay |
### DB Firewall (both environments)
From the app subnet:
| Port | Description |
| --- | --- |
| `22/tcp` | SSH |
| `5432/tcp` | PostgreSQL |
| `27017/tcp` | MongoDB |
| `2377/tcp` | Docker Swarm control plane |
| `7946/tcp,udp` | Docker Swarm node discovery |
| `4789/udp` | Docker Swarm VXLAN overlay |
Prod only (from the app subnet):
| Port | Description |
| --- | --- |
| `2379/tcp` | etcd client (Patroni + APISIX) |
Prod only (within the DB subnet, in both directions):
| Port | Description |
| --- | --- |
| `5432/tcp` | Patroni replication |
| `27017/tcp` | MongoDB replica set internal |
| `2379/tcp` | etcd client |
| `2380/tcp` | etcd peer |
| `8008/tcp` | Patroni REST API |
> IP restrictions (admin panel, dashboards, etc.) are enforced in the SWAG nginx configuration, not in the Hetzner Firewall.
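To spot-check the TCP rules above from an app node, a small probe like the following can help. It is a rough sketch, not part of the repository: the function names are hypothetical, it relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command, and it covers only the TCP ports that the tables list as reachable from the app subnet.

```bash
# Hypothetical helper: succeeds if a TCP connection to host:port can be
# opened within 3 seconds. Requires bash (for /dev/tcp) and `timeout`.
check_tcp_port() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Probe the DB-side ports that the app subnet is allowed to reach.
# 2379 (etcd client) is expected to be reachable in prod only.
check_db_ports() {
  host="$1"   # private IP of a DB node, e.g. taken from `terraform output`
  status=0
  for port in 5432 27017 2379; do
    if check_tcp_port "$host" "$port"; then
      echo "reachable   ${host}:${port}"
    else
      echo "unreachable ${host}:${port}"
      status=1
    fi
  done
  return "$status"
}
```

Run `check_db_ports <db-private-ip>` on an app node. An "unreachable" result can mean either a firewall drop or that no service is listening on the port yet, so it is only meaningful once the database stack is deployed.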
## Next Step
After Terraform apply and inventory generation, run the Ansible bootstrap. See `ansible/README.md` for details.

@@ -218,6 +218,14 @@ resource "hcloud_firewall" "db" {
     description = "MongoDB replica set internal traffic"
   }
+  rule {
+    direction = "in"
+    protocol = "tcp"
+    port = "2379"
+    source_ips = [local.app_subnet_cidr]
+    description = "etcd client (Patroni + APISIX) from app subnet"
+  }
   rule {
     direction = "in"
     protocol = "tcp"

@@ -6,11 +6,14 @@ output "ansible_inventory_yaml" {
     children = {
       app = {
         hosts = {
-          for name, server in hcloud_server.app : name => {
+          for name, server in hcloud_server.app : name => merge(
+            {
             ansible_host = server.ipv4_address
             ansible_user = "root"
             private_ip = local.app_private_ips[name]
-          }
+            },
+            name == "iklim-app-01" ? { hetzner_floating_ip = hcloud_floating_ip.app.ip_address } : {}
+          )
         }
       }
       db = {

@@ -7,7 +7,7 @@ resource "hcloud_server" "app" {
   for_each = local.app_private_ips
   name = each.key
-  server_type = var.server_type_swarm
+  server_type = var.server_type_app
   image = var.image
   location = var.location
   ssh_keys = [hcloud_ssh_key.admin.id]

@@ -16,10 +16,10 @@ variable "image" {
   description = "Server image"
 }
-variable "server_type_swarm" {
+variable "server_type_app" {
   type = string
   default = "cpx42"
-  description = "Hetzner server type for App/Swarm nodes"
+  description = "Hetzner server type for app nodes"
 }
 variable "server_type_db" {