Environment_Infrastructure/setup/06-prod-terraform-iaac.md
Murat ÖZDEMİR 4c3b7faad6 docs(roadmap): update production environment roadmap and setup guides
- Documented infrastructure changes for Redis Sentinel and RabbitMQ clustering.
- Updated setup guides for Terraform, Ansible, and Swarm node recovery.
- Clarified APISIX rate limit policy and degradation settings.
2026-05-17 18:54:44 +03:00

# 06 - Prod Terraform IaC
This stage creates the HA-oriented IaaS resources inside the prod Hetzner Cloud Project with Terraform. This document can be handed to the prod Terraform agent on its own.
## Scope
In the prod environment, Terraform creates the following:
- Private network: `iklim-prod-net`
- Subnets:
  - App/Swarm subnet: `10.20.10.0/24`
  - DB subnet: `10.20.20.0/24`
- Firewall:
  - Public ingress: only `22/tcp`, `80/tcp`, `443/tcp`
  - Private ingress: the prod rules in `01-private-network-port-matrisi.md`
- SSH key
- Placement groups:
  - `iklim-prod-app-spread`
  - `iklim-prod-db-spread`
- Floating IP: static IPv4 for the app entry point (assigned to `iklim-app-01`)
- Servers:
  - `iklim-app-01`
  - `iklim-app-02`
  - `iklim-app-03`
  - `iklim-db-01`
  - `iklim-db-02`
  - `iklim-db-03`
- Ansible inventory output

The DB cluster software is not installed by Terraform. The DB nodes are prepared only at the machine, network, and firewall level.
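As an illustration of the network part of the scope, `network.tf` could look roughly like the sketch below. The resource labels (`prod`, `app`, `db`), the parent range `10.20.0.0/16`, and the `eu-central` network zone are assumptions that are not stated in this document; only the network name and the two subnet ranges come from the list above.

```hcl
# Sketch only: labels, parent ip_range, and network_zone are assumed.
resource "hcloud_network" "prod" {
  name     = "iklim-prod-net"
  ip_range = "10.20.0.0/16" # assumed parent range covering both subnets
}

resource "hcloud_network_subnet" "app" {
  network_id   = hcloud_network.prod.id
  type         = "cloud"
  network_zone = "eu-central" # assumed zone for location fsn1
  ip_range     = "10.20.10.0/24"
}

resource "hcloud_network_subnet" "db" {
  network_id   = hcloud_network.prod.id
  type         = "cloud"
  network_zone = "eu-central"
  ip_range     = "10.20.20.0/24"
}
```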
## Version Requirements
```text
Terraform >= 1.6
hcloud provider ~> 1.49
```
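These constraints would typically be pinned in `versions.tf` and `providers.tf`. A minimal sketch, assuming the token is passed through a variable named `hcloud_token` as in the variables section:

```hcl
# versions.tf
terraform {
  required_version = ">= 1.6"

  required_providers {
    hcloud = {
      source  = "hetznercloud/hcloud"
      version = "~> 1.49"
    }
  }
}

# providers.tf
provider "hcloud" {
  token = var.hcloud_token
}
```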
## Recommended File Layout
```text
terraform/
  hetzner/
    prod/
      versions.tf
      providers.tf
      variables.tf
      locals.tf
      network.tf
      firewall.tf
      placement.tf
      servers.tf
      floating_ip.tf
      outputs.tf
      terraform.tfvars.example
```
`terraform.tfvars`, the state files, and the token must not be committed to the repo.
## Variables
The `environment` constant lives in `locals.tf`; it is not overridden via `tfvars`.
Minimum variables:
```hcl
hcloud_token = "secret"
location = "fsn1"
image = "rocky-10"
server_type_swarm = "cpx42"
server_type_db = "cpx32"
admin_ssh_public_key_path = "~/.ssh/id_ed25519.pub"
admin_allowed_cidrs = ["X.X.X.X/32"]
```
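The matching declarations in `variables.tf` might look like the sketch below. The descriptions and the CIDR validation are illustrative additions, not part of the original guide; the variable names are taken from the values above.

```hcl
variable "hcloud_token" {
  description = "API token of the prod Hetzner Cloud Project"
  type        = string
  sensitive   = true # redacted in plan/apply output
}

variable "location" {
  type = string
}

variable "image" {
  type = string
}

variable "server_type_swarm" {
  type = string
}

variable "server_type_db" {
  type = string
}

variable "admin_ssh_public_key_path" {
  type = string
}

variable "admin_allowed_cidrs" {
  description = "CIDR blocks allowed to reach 22/tcp"
  type        = list(string)

  # Illustrative guard: reject entries that are not valid CIDR blocks.
  validation {
    condition     = alltrue([for c in var.admin_allowed_cidrs : can(cidrhost(c, 0))])
    error_message = "Every entry in admin_allowed_cidrs must be a valid CIDR block."
  }
}
```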
The server types were chosen based on the current test environment metrics in
`../hetzner-sizing-report.md` and on the prod cluster topology.
`cpx42` is recommended for the prod app nodes because of memory pressure from
the Java microservices; the more economical `cpx32` is recommended for the
prod DB nodes as the starting point for a 3-node cluster. Once metrics confirm
a capacity need, nodes can be added or rescaled in place.
## Server Roles and Private IP Plan
| Server | Private IP | Role |
| --- | --- | --- |
| `iklim-app-01` | `10.20.10.11` | Swarm manager + app worker + runner (primary, holds the FIP) |
| `iklim-app-02` | `10.20.10.12` | Swarm manager + app worker + runner |
| `iklim-app-03` | `10.20.10.13` | Swarm manager + app worker + runner |
| `iklim-db-01` | `10.20.20.11` | Manual DB cluster node |
| `iklim-db-02` | `10.20.20.12` | Manual DB cluster node |
| `iklim-db-03` | `10.20.20.13` | Manual DB cluster node |

The private IPs are defined as fixed values in `locals.tf`, in the `swarm_private_ips` and `db_private_ips` maps. The server list is derived from these maps with `for_each`.
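A sketch of how `locals.tf` could hold this plan. The map names and IPs come from the table above; the value `"prod"` for `environment` is an assumption.

```hcl
locals {
  environment = "prod" # assumed value; fixed here, not overridable via tfvars

  # Server name => fixed private IP in the app/Swarm subnet
  swarm_private_ips = {
    "iklim-app-01" = "10.20.10.11"
    "iklim-app-02" = "10.20.10.12"
    "iklim-app-03" = "10.20.10.13"
  }

  # Server name => fixed private IP in the DB subnet
  db_private_ips = {
    "iklim-db-01" = "10.20.20.11"
    "iklim-db-02" = "10.20.20.12"
    "iklim-db-03" = "10.20.20.13"
  }
}
```

Keying the maps by server name means `for_each` produces stable resource addresses such as `hcloud_server.swarm["iklim-app-01"]`, so adding or removing one node does not shift the others.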
## Recommended Resources and Cost
| Server | Role | Server Type | CPU | RAM | SSD | Monthly |
| --- | --- | --- | ---: | ---: | ---: | ---: |
| `iklim-app-01` | Swarm manager + app worker + runner | `cpx42` | 8 AMD | 16 GB | 320 GB | $29.99 |
| `iklim-app-02` | Swarm manager + app worker + runner | `cpx42` | 8 AMD | 16 GB | 320 GB | $29.99 |
| `iklim-app-03` | Swarm manager + app worker + runner | `cpx42` | 8 AMD | 16 GB | 320 GB | $29.99 |
| `iklim-db-01` | DB cluster node | `cpx32` | 4 AMD | 8 GB | 160 GB | $16.49 |
| `iklim-db-02` | DB cluster node | `cpx32` | 4 AMD | 8 GB | 160 GB | $16.49 |
| `iklim-db-03` | DB cluster node | `cpx32` | 4 AMD | 8 GB | 160 GB | $16.49 |
| **Total** | 6 servers | | **36 vCPU** | **72 GB** | **1,440 GB** | **$139.44** |
## Placement Group Decision
Two separate spread placement groups are used for prod:
```text
iklim-prod-app-spread: iklim-app-01/02/03
iklim-prod-db-spread: iklim-db-01/02/03
```
This way Hetzner tries to place the Swarm quorum nodes on different physical hosts from one another, and likewise the DB nodes on different physical hosts from one another.
Notes:
- Hetzner does not offer direct cabinet (rack) selection.
- A spread placement group targets different physical hosts.
- Disaster recovery across locations/regions is out of scope for this stage.
- If the deployment grows later, multi-location DR must be designed separately.
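
In `placement.tf` the two groups could be declared as in the following sketch. The group names come from this section; the resource labels `app` and `db` are assumptions.

```hcl
resource "hcloud_placement_group" "app" {
  name = "iklim-prod-app-spread"
  type = "spread"
}

resource "hcloud_placement_group" "db" {
  name = "iklim-prod-db-spread"
  type = "spread"
}
```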
## Floating IP
An IPv4 floating IP named `iklim-prod-app-fip` is created and assigned to `iklim-app-01`. The DNS A record points to this IP. If failover is needed, the floating IP can be moved to another app node.
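A possible shape for `floating_ip.tf`, assuming the app servers are declared as `hcloud_server.swarm` with `for_each` (the address used in the destroy example of this guide). The resource label `app` is an assumption.

```hcl
resource "hcloud_floating_ip" "app" {
  name          = "iklim-prod-app-fip"
  type          = "ipv4"
  home_location = var.location
}

# Keeping the assignment in its own resource lets the IP be repointed
# to another app node without touching the floating IP itself.
resource "hcloud_floating_ip_assignment" "app" {
  floating_ip_id = hcloud_floating_ip.app.id
  server_id      = hcloud_server.swarm["iklim-app-01"].id
}
```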
## Public Firewall
Public ingress:
| Port | Source | Target |
| --- | --- | --- |
| `22/tcp` | `admin_allowed_cidrs` | All prod nodes |
| `80/tcp` | `0.0.0.0/0`, `::/0` | `iklim-app-*` (via the Floating IP) |
| `443/tcp` | `0.0.0.0/0`, `::/0` | `iklim-app-*` (via the Floating IP) |

The following ports must not be opened publicly in prod:
- `8200/tcp` Vault
- `5432/tcp` PostgreSQL
- `27017/tcp` MongoDB
- `6379/tcp` Redis
- `5672/tcp`, `15672/tcp`, `61613/tcp`, `15674/tcp` RabbitMQ
- `2377/tcp`, `7946/tcp`, `7946/udp`, `4789/udp` Docker Swarm
- `9180/tcp` APISIX Admin API
- `9090/tcp` Prometheus
- `3000/tcp` Grafana
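
The public ingress table for the app nodes could translate into `firewall.tf` roughly as follows. This is a sketch: the resource label `app` and the firewall name `iklim-prod-app-fw` are hypothetical. Anything without a matching inbound rule is dropped, so the ports listed above stay closed simply by not being declared.

```hcl
resource "hcloud_firewall" "app" {
  name = "iklim-prod-app-fw" # hypothetical name

  # SSH only from the admin CIDRs
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "22"
    source_ips = var.admin_allowed_cidrs
  }

  # HTTP and HTTPS from anywhere (IPv4 and IPv6)
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "80"
    source_ips = ["0.0.0.0/0", "::/0"]
  }

  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "443"
    source_ips = ["0.0.0.0/0", "::/0"]
  }
}
```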
## Private Firewall
### App (Swarm) Firewall: Private Ingress
From the app subnet (`10.20.10.0/24`):
| Port | Service | Access method |
| --- | --- | --- |
| `2377/tcp` | Docker Swarm control plane | From within the app subnet |
| `7946/tcp,udp` | Docker Swarm node discovery | From within the app subnet |
| `4789/udp` | Docker Swarm VXLAN overlay | From within the app subnet |
| `8200/tcp` | Vault | Docker overlay / private network |
| `6379/tcp` | Redis | From within the app subnet |
| `5672/tcp` | RabbitMQ AMQP | From within the app subnet |
| `61613/tcp` | RabbitMQ STOMP | From within the app subnet |
| `15674/tcp` | RabbitMQ Web STOMP | From within the app subnet |
| `15672/tcp` | RabbitMQ Management | Behind SWAG on `443`, IP-restricted |
| `9000/tcp` | APISIX Dashboard | Behind SWAG on `443`, IP-restricted |
| `9180/tcp` | APISIX Admin API | Reached only by the Dashboard, over the Docker overlay |
| `9090/tcp` | Prometheus | Behind SWAG on `443`, IP-restricted |
| `3000/tcp` | Grafana | Behind SWAG on `443`, IP-restricted |

From the DB subnet (because the `iklim-db-*` nodes join the Swarm as workers):
| Port | Service | Source |
| --- | --- | --- |
| `2377/tcp` | Docker Swarm control plane | `10.20.20.0/24` |
| `7946/tcp,udp` | Docker Swarm node discovery | `10.20.20.0/24` |
| `4789/udp` | Docker Swarm VXLAN overlay | `10.20.20.0/24` |
### DB Firewall: Private Ingress
Admin access:
| Port | Service | Source |
| --- | --- | --- |
| `22/tcp` | SSH | `admin_allowed_cidrs` |

From the app subnet (`10.20.10.0/24`):
| Port | Service | Note |
| --- | --- | --- |
| `5432/tcp` | PostgreSQL (Patroni primary) | Access from the app subnet |
| `27017/tcp` | MongoDB replica set endpoint | Access from the app subnet |
| `2377/tcp` | Docker Swarm control plane | From within the app subnet |
| `7946/tcp,udp` | Docker Swarm node discovery | From within the app subnet |
| `4789/udp` | Docker Swarm VXLAN overlay | From within the app subnet |

Mutual access within the DB subnet (`10.20.20.0/24`):
| Port | Service | Note |
| --- | --- | --- |
| `5432/tcp` | PostgreSQL Patroni replication | Between DB nodes |
| `27017/tcp` | MongoDB replica set internal | Between DB nodes |
| `2379/tcp` | etcd client | Patroni → etcd access |
| `2380/tcp` | etcd peer | etcd cluster internal |
| `8008/tcp` | Patroni REST API | Patroni leader election and health checks |

The IP restriction for the services behind SWAG is enforced in the SWAG nginx configuration, not in the Hetzner firewall.
## Outputs
After `terraform apply`, or at any time via `terraform output`, the following values are available:
| Output | Description |
| --- | --- |
| `ansible_inventory_yaml` | Ansible inventory YAML; written to `ansible/inventory/generated/prod.yml` |
| `prod_private_ips` | Private IP map of all nodes (with `swarm` and `db` sub-keys) |
| `prod_public_ips` | Public IPv4 map of all nodes |
| `prod_floating_ip` | Floating IP address of the Swarm entry point (the DNS A record points to this IP) |

To export the Ansible inventory:
```bash
terraform output -raw ansible_inventory_yaml > \
../../ansible/inventory/generated/prod.yml
```
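One way `outputs.tf` could produce these values is sketched below. The output names come from the table above; the resource addresses `hcloud_server.db` and `hcloud_floating_ip.app`, and the inventory group names `swarm` and `db`, are assumptions about the surrounding configuration.

```hcl
output "prod_floating_ip" {
  value = hcloud_floating_ip.app.ip_address
}

output "prod_private_ips" {
  value = {
    swarm = local.swarm_private_ips
    db    = local.db_private_ips
  }
}

output "ansible_inventory_yaml" {
  # yamlencode turns the nested object into an Ansible-style YAML inventory.
  value = yamlencode({
    all = {
      children = {
        swarm = {
          hosts = { for name, s in hcloud_server.swarm : name => { ansible_host = s.ipv4_address } }
        }
        db = {
          hosts = { for name, s in hcloud_server.db : name => { ansible_host = s.ipv4_address } }
        }
      }
    }
  })
}
```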
## Lifecycle and Resize Policy
### Changing server_type (Resizing)
Changing `server_type` does **not** trigger a Terraform destroy+create. The `hcloud` provider
supports this natively: it stops the server, calls the Hetzner resize API, and
starts it again. Update the value in `terraform.tfvars` and run `terraform apply`.
There is downtime (the server stops and starts), but the disk, installed software, and Docker volumes
are preserved. No `ignore_changes` or manual step is needed.
### Which Changes Force the Server to Be Re-created?
| Changed field | Behavior | Note |
| --- | --- | --- |
| `server_type` | In-place resize (provider native) | `terraform apply` is enough |
| `hcloud_server_network` | Only the attachment is updated | Because a separate resource is used |
| `hcloud_firewall_attachment` | Only the attachment is updated | Because a separate resource is used |
| `placement_group_id` | Hetzner API does not allow the change → destroy+create | Do not change |
| `image` | Disk image changes → destroy+create | Do not change |
| `location` | Cannot be moved to another datacenter → destroy+create | Do not change |
### Separating Network and Firewall Attachments
The `network` block and `firewall_ids` are not embedded in `hcloud_server`. Separate
resources are defined instead:
- `hcloud_server_network`: private IP assignment (one per node via `for_each`)
- `hcloud_firewall_attachment`: firewall association (server list derived with `for_each`)
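
A sketch of the two attachment resources for the app nodes. The addresses `hcloud_network_subnet.app` and `hcloud_firewall.app` are assumed names for resources defined elsewhere in the configuration.

```hcl
# One private-network attachment per app node, with the fixed IP from locals.tf
resource "hcloud_server_network" "swarm" {
  for_each  = local.swarm_private_ips
  server_id = hcloud_server.swarm[each.key].id
  subnet_id = hcloud_network_subnet.app.id
  ip        = each.value
}

# One attachment binding the app firewall to every app server
resource "hcloud_firewall_attachment" "app" {
  firewall_id = hcloud_firewall.app.id
  server_ids  = [for s in hcloud_server.swarm : s.id]
}
```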
### prevent_destroy Protection
`lifecycle { prevent_destroy = true }` is added to every server. To delete one
deliberately, first remove the lifecycle block temporarily.
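Put together, the app servers in `servers.tf` might be declared along these lines. The SSH key resource, its name `iklim-prod-admin`, and the placement group label `app` are assumptions; `hcloud_server.swarm` is the address used elsewhere in this guide.

```hcl
resource "hcloud_ssh_key" "admin" {
  name       = "iklim-prod-admin" # hypothetical name
  public_key = file(pathexpand(var.admin_ssh_public_key_path))
}

resource "hcloud_server" "swarm" {
  for_each           = local.swarm_private_ips
  name               = each.key
  server_type        = var.server_type_swarm
  image              = var.image
  location           = var.location
  ssh_keys           = [hcloud_ssh_key.admin.id]
  placement_group_id = hcloud_placement_group.app.id

  # No network block or firewall_ids here: those live in separate resources.
  lifecycle {
    prevent_destroy = true # any plan that would destroy this server fails
  }
}
```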
## How to Run
### Preparation
**1. Create tfvars (once):**
```bash
cd Environment_Infrastructure/terraform/hetzner/prod
cp terraform.tfvars.example terraform.tfvars
# Fill terraform.tfvars with the real values
# (hcloud_token, admin_allowed_cidrs, etc.)
```
`terraform.tfvars` is not committed; it is excluded through `.gitignore`.
**2. Install the provider (once):**
```bash
terraform init
```
### First Apply
```bash
# Show what would be created; makes no changes
terraform plan
# Review, confirm, and create
terraform apply
```
After `apply`, 6 servers, 2 firewalls, 1 floating IP, and the network resources are visible in Hetzner.
### Getting the Ansible Inventory
```bash
terraform output -raw ansible_inventory_yaml > \
../../ansible/inventory/generated/prod.yml
```
### Gitea Variable: PROD_FLOATING_IP
The deploy pipeline needs this variable to manage DNS records automatically. It is set once after `terraform apply`:
```bash
# -raw prints the bare IP without surrounding quotes
terraform output -raw prod_floating_ip
```
Add the resulting IP address under Gitea → project settings → **Variables** with the name `PROD_FLOATING_IP`. The pipeline reads it as `vars.PROD_FLOATING_IP` and updates the GoDaddy A records idempotently.
### Resize (Changing the Server Type)
Change the `server_type_swarm` or `server_type_db` value in `terraform.tfvars`, then run:
```bash
terraform apply
```
The server is stopped, the Hetzner resize API is called, and the server is started again. The disk and Docker volumes are preserved. There is downtime.
### Deleting a Server (Forced)
Because `prevent_destroy = true` is set, a plain `terraform destroy` fails. First temporarily remove the `lifecycle` block in `servers.tf`:
```hcl
# lifecycle {
# prevent_destroy = true
# }
```
Then:
```bash
# Single quotes keep the shell from stripping the inner double quotes
terraform destroy -target='hcloud_server.swarm["iklim-app-01"]'
```
Once the operation is done, restore the lifecycle block.
### State Management
Local state (`terraform.tfstate`) is used for now. The state file is not committed to the repo. If more than one person works on this environment, remote state on Hetzner Object Storage or HCP Terraform should be used.
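If the team moves to Hetzner Object Storage, it is S3-compatible, so the `s3` backend could be used. The sketch below is an assumption-heavy starting point: the bucket name, key, and endpoint URL are hypothetical and must be checked against the actual bucket, and credentials are expected through the usual `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` environment variables.

```hcl
terraform {
  backend "s3" {
    bucket = "iklim-prod-tfstate"             # hypothetical bucket
    key    = "hetzner/prod/terraform.tfstate" # hypothetical object key
    region = "fsn1"                           # not an AWS region; validation is skipped below

    endpoints = {
      s3 = "https://fsn1.your-objectstorage.com" # assumed endpoint for the bucket location
    }

    # The target is S3-compatible storage rather than AWS, so the AWS-specific checks are disabled.
    skip_credentials_validation = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
    skip_metadata_api_check     = true
    skip_s3_checksum            = true
    use_path_style              = true
  }
}
```

After adding a backend block, `terraform init -migrate-state` moves the existing local state to the remote location.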
## Acceptance Criteria
- `terraform plan` runs only with the prod Hetzner Project token.
- 6 servers are created (`iklim-app-01/02/03`, `iklim-db-01/02/03`).
- The Swarm nodes are in the `iklim-prod-app-spread` placement group.
- The DB nodes are in the `iklim-prod-db-spread` placement group.
- The public firewall allows ingress only on `22`, `80`, and `443`.
- The private firewall is consistent with `01-private-network-port-matrisi.md`.
- The DB replication ports are reachable only from the DB subnet.
- The floating IP is created and assigned to `iklim-app-01`.
- Terraform state and the secret tfvars are not committed.