02 - Test Terraform IaC
The purpose of this phase is to create the minimum IaaS resources inside the test Hetzner Cloud Project with Terraform. This document is written so it can be applied on its own.
Scope
Terraform creates the following in the test environment:
- Private network: iklim-test-net
- Subnets:
  - App/Swarm subnet: 10.10.10.0/24
  - DB subnet: 10.10.20.0/24
- Firewall:
  - Public ingress: 22/tcp, 80/tcp, 443/tcp, plus test DB WireGuard 51820/udp
  - Private ingress: test rules in 01-private-network-port-matrix.md
- SSH key
- Placement group: iklim-test-spread
- Floating IP: stable IPv4 for the swarm entry point
- Server:
  - iklim-app-01
  - iklim-db-01
- Ansible inventory output
Terraform does not install DB software. The DB node is prepared at the machine, network, and firewall level; Ansible later prepares Docker, Swarm worker membership, DB config directories, and WireGuard.
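The floating IP item in the scope list could be expressed roughly as follows. This is a minimal sketch, not the project's actual floating_ip.tf: the resource labels, the name iklim-test-entry, and the reference to hcloud_server.app are assumptions made for illustration.

```hcl
# floating_ip.tf (sketch; names and labels are hypothetical)
resource "hcloud_floating_ip" "entry" {
  name          = "iklim-test-entry"
  type          = "ipv4"
  home_location = var.location # same location as the servers
}

# Binds the stable IPv4 to the Swarm entry node.
# Assumes a server resource labelled "app" exists in servers.tf.
resource "hcloud_floating_ip_assignment" "entry" {
  floating_ip_id = hcloud_floating_ip.entry.id
  server_id      = hcloud_server.app.id
}
```

The assignment only routes the address to the server; the address still has to be configured on the host's network interface, which fits the later Ansible phase rather than Terraform.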
Recommended File Structure
    terraform/
      hetzner/
        test/
          versions.tf
          providers.tf
          variables.tf
          locals.tf
          network.tf
          firewall.tf
          placement.tf
          servers.tf
          floating_ip.tf
          outputs.tf
          terraform.tfvars.example
terraform.tfvars must not be committed, because it holds the API token. Add it to .gitignore.
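As an illustration of the first two files in the tree, versions.tf and providers.tf might look like the sketch below. The version constraints are assumptions and should be pinned to whatever the team has actually tested; var.hcloud_token is assumed to be declared in variables.tf.

```hcl
# versions.tf (sketch; both version constraints are assumptions)
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    hcloud = {
      source  = "hetznercloud/hcloud"
      version = "~> 1.45"
    }
  }
}

# providers.tf
# The token comes from terraform.tfvars (or the TF_VAR_hcloud_token
# environment variable) and must belong to the test project only.
provider "hcloud" {
  token = var.hcloud_token
}
```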
Variables
Minimum variables:
    hcloud_token              = "secret"
    location                  = "fsn1"
    image                     = "rocky-10"
    server_type_app           = "cpx42"
    server_type_db            = "cpx42"
    admin_ssh_public_key_path = "~/.ssh/id_rsa.pub"
    admin_allowed_cidrs       = ["X.X.X.X/32"]
The environment name is a constant defined in locals.tf; it cannot be overridden through tfvars.
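A possible shape for the matching declarations is sketched below. The variable names follow the tfvars list above; the default value, the validation rule, and the local name environment are assumptions, not taken from the actual repository.

```hcl
# variables.tf (sketch)
variable "hcloud_token" {
  type      = string
  sensitive = true # redacts the token in plan and apply output
}

variable "location" {
  type    = string
  default = "fsn1" # assumed default
}

variable "image" { type = string }
variable "server_type_app" { type = string }
variable "server_type_db" { type = string }
variable "admin_ssh_public_key_path" { type = string }

variable "admin_allowed_cidrs" {
  type = list(string)

  validation {
    # cidrhost() fails on malformed input, so can() turns that into false
    condition     = alltrue([for c in var.admin_allowed_cidrs : can(cidrhost(c, 0))])
    error_message = "Every entry in admin_allowed_cidrs must be a valid CIDR block."
  }
}

# locals.tf
locals {
  environment = "test" # a local, not a variable, so tfvars cannot override it
}
```

Keeping the environment name in a local rather than a variable means a stray tfvars file or -var flag cannot point this configuration at another environment.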
Start with a single value for location. Disaster recovery across different regions or locations is outside the scope of this stage and will be documented later.
The server type decision is based on the current test environment metrics in ../hetzner-sizing-report.md. Because 10 microservices and infrastructure services run together on the test app node, cpx32 was considered risky in terms of RAM. cpx42 is also recommended for the test DB node because of single-node CPU spike risk.
Server Roles
| Server | Private IP | Role |
|---|---|---|
| iklim-app-01 | 10.10.10.11 | Swarm manager + app worker + Gitea runner |
| iklim-db-01 | 10.10.20.11 | DB node / Swarm worker for DB services |
Private IPs must be statically defined inside Terraform so that the Ansible inventory and firewall rules remain deterministic.
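One way to keep the addresses deterministic is to hold them in a single map that the network, server, and inventory definitions all read from. The sketch below assumes a parent network range of 10.10.0.0/16 and hypothetical local and resource names; only the network name, subnets, and node IPs come from this document.

```hcl
# locals.tf (sketch): single source for node addresses
locals {
  subnets = {
    app = "10.10.10.0/24"
    db  = "10.10.20.0/24"
  }

  nodes = {
    "iklim-app-01" = { private_ip = "10.10.10.11", subnet = "app" }
    "iklim-db-01"  = { private_ip = "10.10.20.11", subnet = "db" }
  }
}

# network.tf
resource "hcloud_network" "test" {
  name     = "iklim-test-net"
  ip_range = "10.10.0.0/16" # assumed parent range covering both subnets
}

resource "hcloud_network_subnet" "test" {
  for_each     = local.subnets
  network_id   = hcloud_network.test.id
  type         = "cloud"
  network_zone = "eu-central" # network zone that contains fsn1
  ip_range     = each.value
}
```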
Recommended Resources and Cost
| Server | Role | Server Type | CPU | RAM | SSD | Monthly cost |
|---|---|---|---|---|---|---|
| iklim-app-01 | Swarm manager + app worker + Gitea runner | cpx42 | 8 vCPU (AMD) | 16 GB | 320 GB | $29.99 |
| iklim-db-01 | PostgreSQL/PostGIS + MongoDB node | cpx42 | 8 vCPU (AMD) | 16 GB | 320 GB | $29.99 |
| Total | 2 servers | | 16 vCPU | 32 GB | 640 GB | $59.98 |
Firewall Rules
Public ingress:
| Port | Source | Target |
|---|---|---|
| 22/tcp | admin_allowed_cidrs | All test nodes |
| 80/tcp | 0.0.0.0/0, ::/0 | iklim-app-01 |
| 443/tcp | 0.0.0.0/0, ::/0 | iklim-app-01 |
The following ports will not be opened for public ingress: 8200/tcp, 5432/tcp, 27017/tcp, 5672/tcp, 15672/tcp, 6379/tcp, 2379/tcp, 9000/tcp, 9180/tcp, 9090/tcp, and 3000/tcp. The only explicit public exception is 51820/udp for WireGuard, and it applies to the test environment only.
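The public ingress table for iklim-app-01 could translate into a firewall resource along these lines. This is a sketch under assumptions: the resource label and firewall name are hypothetical, and var.admin_allowed_cidrs is assumed to be declared as in the Variables section.

```hcl
# firewall.tf (sketch; label and name are hypothetical)
resource "hcloud_firewall" "app_public" {
  name = "iklim-test-app-public"

  rule {
    description = "SSH from admin networks only"
    direction   = "in"
    protocol    = "tcp"
    port        = "22"
    source_ips  = var.admin_allowed_cidrs
  }

  # One rule per public web port; no other inbound rule is declared,
  # so ports such as 8200 or 5432 stay unreachable from the internet.
  dynamic "rule" {
    for_each = toset(["80", "443"])
    content {
      description = "Public web ingress"
      direction   = "in"
      protocol    = "tcp"
      port        = rule.value
      source_ips  = ["0.0.0.0/0", "::/0"]
    }
  }
}
```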
App (swarm) Firewall — Private Ingress
Traffic from the app subnet (iklim-app-01):
| Port | Service | Access method |
|---|---|---|
| 2377/tcp | Docker Swarm control plane | From app subnet |
| 7946/tcp,udp | Docker Swarm node discovery | From app subnet |
| 4789/udp | Docker Swarm VXLAN overlay | From app subnet |
| 8200/tcp | Vault | Docker overlay / private network |
| 6379/tcp | Redis | From app subnet |
| 5672/tcp | RabbitMQ AMQP | From app subnet |
| 61613/tcp | RabbitMQ STOMP | From app subnet |
| 15674/tcp | RabbitMQ Web STOMP | From app subnet |
| 15672/tcp | RabbitMQ Management | From app subnet; external access through SWAG 443, IP restricted |
| 9000/tcp | APISIX Dashboard | From app subnet; external access through SWAG 443, IP restricted |
| 9180/tcp | APISIX Admin API | From app subnet, including Docker overlay |
| 9090/tcp | Prometheus | From app subnet; external access through SWAG 443, IP restricted |
| 3000/tcp | Grafana | From app subnet; external access through SWAG 443, IP restricted |
Traffic from the DB subnet, allowed because iklim-db-01 joins the Swarm as a worker:
| Port | Service | Source |
|---|---|---|
| 2377/tcp | Docker Swarm control plane | 10.10.20.0/24 |
| 7946/tcp,udp | Docker Swarm node discovery | 10.10.20.0/24 |
| 4789/udp | Docker Swarm VXLAN overlay | 10.10.20.0/24 |
DB Firewall — Private Ingress
| Port | Service | Source |
|---|---|---|
| 22/tcp | SSH | admin_allowed_cidrs |
| 51820/udp | WireGuard VPN | 0.0.0.0/0, ::/0 (authenticated by cryptographic key) |
| 5432/tcp | PostgreSQL | 10.10.10.0/24 (app subnet) |
| 27017/tcp | MongoDB | 10.10.10.0/24 (app subnet) |
| 2377/tcp | Docker Swarm control plane | 10.10.10.0/24 (app subnet) |
| 7946/tcp,udp | Docker Swarm node discovery | 10.10.10.0/24 (app subnet) |
| 4789/udp | Docker Swarm VXLAN overlay | 10.10.10.0/24 (app subnet) |
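IP restriction for the management endpoints (RabbitMQ Management, APISIX Dashboard, Prometheus, Grafana) is enforced in the SWAG nginx configuration, not in the Hetzner firewall. None of these management ports is opened publicly, not even for the admin_allowed_cidrs source.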
For other private ingress rules, 01-private-network-port-matrix.md will be used as the source.
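The app-subnet rows of the DB firewall table lend themselves to a data-driven definition, so that the table and the Terraform code stay easy to compare. The sketch below is one possible encoding; the local names, resource label, and firewall name are hypothetical. It is also worth confirming against Hetzner's firewall documentation whether rules with private source ranges are evaluated for traffic on the private network interface, since that determines where these rules actually take effect.

```hcl
# firewall.tf (sketch; names are hypothetical)
locals {
  app_subnet = "10.10.10.0/24"

  # Rows of the DB firewall table that are sourced from the app subnet
  db_private_rules = [
    { protocol = "tcp", port = "5432", description = "PostgreSQL" },
    { protocol = "tcp", port = "27017", description = "MongoDB" },
    { protocol = "tcp", port = "2377", description = "Swarm control plane" },
    { protocol = "tcp", port = "7946", description = "Swarm node discovery" },
    { protocol = "udp", port = "7946", description = "Swarm node discovery" },
    { protocol = "udp", port = "4789", description = "Swarm VXLAN overlay" },
  ]
}

resource "hcloud_firewall" "db_private" {
  name = "iklim-test-db-private"

  # Expands to one inbound rule per entry in the list above
  dynamic "rule" {
    for_each = local.db_private_rules
    content {
      description = rule.value.description
      direction   = "in"
      protocol    = rule.value.protocol
      port        = rule.value.port
      source_ips  = [local.app_subnet]
    }
  }
}
```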
Placement Group
The iklim-test-spread placement group is created with type = "spread". Because there are two servers in test, this group aims to place iklim-app-01 and iklim-db-01 on different physical hosts.
Note: A spread placement group is not a guarantee of a different cabinet or location; it reduces the impact of a single physical host failure.
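For reference, the placement group itself is a small resource. In this sketch only the name and type come from this document; the resource label test_spread is an assumption.

```hcl
# placement.tf (sketch; the label test_spread is hypothetical)
resource "hcloud_placement_group" "test_spread" {
  name = "iklim-test-spread"
  type = "spread"
}

# Each server then refers to the group, for example:
#   placement_group_id = hcloud_placement_group.test_spread.id
```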
Terraform Output Expectations
outputs.tf must produce at least the following information:
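    output "ansible_inventory_yaml" {
      # value = ... (required by Terraform; left out of this outline)
      sensitive = false
    }
    output "test_private_ips" {
      # value = ...
      sensitive = false
    }
    output "test_public_ips" {
      # value = ...
      sensitive = false
    }
    output "test_floating_ip" {
      # value = ...
      sensitive = false
    }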
The inventory output can later be written to ansible/inventory/generated/test.yml. If the inventory file contains no secrets, it can be committed; if it contains secrets or tokens, it will not be committed.
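A sketch of how two of these outputs might be filled in is shown below. The inventory layout (group names swarm_managers and db_nodes) and the server labels hcloud_server.app and hcloud_server.db are assumptions; the real Ansible layout may differ.

```hcl
# outputs.tf (sketch; group names and server labels are hypothetical)
locals {
  inventory = {
    all = {
      children = {
        swarm_managers = {
          hosts = {
            "iklim-app-01" = { ansible_host = hcloud_server.app.ipv4_address }
          }
        }
        db_nodes = {
          hosts = {
            "iklim-db-01" = { ansible_host = hcloud_server.db.ipv4_address }
          }
        }
      }
    }
  }
}

output "ansible_inventory_yaml" {
  value     = yamlencode(local.inventory) # contains addresses only, no secrets
  sensitive = false
}

output "test_public_ips" {
  value = {
    "iklim-app-01" = hcloud_server.app.ipv4_address
    "iklim-db-01"  = hcloud_server.db.ipv4_address
  }
  sensitive = false
}
```

With outputs of this shape, the inventory file could be produced with terraform output -raw ansible_inventory_yaml > ansible/inventory/generated/test.yml.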
Lifecycle and Resize Policy
server_type Change (Resize)
Changing server_type does not trigger Terraform destroy+create. The hcloud provider supports this natively: it stops the server, calls the Hetzner Resize API, and starts it again. Update the value in terraform.tfvars and run terraform apply.
There is downtime, because the server stops and starts, but disk, installed software, and Docker volumes are preserved. No ignore_changes or manual step is required.
Which Changes Force Server Recreation?
| Changed field | Behavior | Note |
|---|---|---|
| server_type | In-place resize (provider native) | terraform apply is enough |
| hcloud_server_network | Only attachment is updated | Because a separate resource is used |
| hcloud_firewall_attachment | Only attachment is updated | Because a separate resource is used |
| placement_group_id | Hetzner API does not allow changing it -> destroy+create | Do not change |
| image | Disk image changes -> destroy+create | Do not change |
| location | Cannot be moved to another datacenter -> destroy+create | Do not change |
Network and Firewall Attachment Separation
The network block and firewall_ids are not embedded inside hcloud_server. Instead, separate resources are defined:
- hcloud_server_network: private IP assignment
- hcloud_firewall_attachment: firewall relationship
In embedded definitions, some provider versions interpret changes in these fields as server recreation. When separate resources are used, only the attachment is updated and the server is left untouched.
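The separation described above might look like the following for the app node. This sketch assumes that hcloud_server.app, a subnet resource hcloud_network_subnet.test keyed by "app", and hcloud_firewall.app_public are defined elsewhere; all three references are hypothetical names.

```hcl
# servers.tf (sketch; referenced resources are assumed to exist)

# Private IP assignment lives outside the hcloud_server block
resource "hcloud_server_network" "app" {
  server_id = hcloud_server.app.id
  subnet_id = hcloud_network_subnet.test["app"].id
  ip        = "10.10.10.11" # static address from the Server Roles table
}

# Firewall relationship also lives outside the hcloud_server block
resource "hcloud_firewall_attachment" "app" {
  firewall_id = hcloud_firewall.app_public.id
  server_ids  = [hcloud_server.app.id]
}
```

When this pattern is used, the server block should not also set firewall_ids or an inline network block, otherwise the two definitions compete for the same attachment.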
prevent_destroy Protection
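Each server gets lifecycle { prevent_destroy = true }. While this block exists, any plan that would destroy the server, including a destroy+create replacement, fails with an error during the plan phase. To intentionally delete a server, temporarily remove the lifecycle block first. The protection applies only while the resource block remains in the configuration; deleting the whole resource block also removes the protection.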
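A minimal sketch of a protected server definition follows. The resource label app and the reference to hcloud_placement_group.test_spread are assumptions; the variable names follow the Variables section.

```hcl
# servers.tf (sketch; label and placement group reference are hypothetical)
resource "hcloud_server" "app" {
  name               = "iklim-app-01"
  server_type        = var.server_type_app # changing this resizes in place
  image              = var.image           # changing this forces recreation
  location           = var.location        # changing this forces recreation
  placement_group_id = hcloud_placement_group.test_spread.id

  lifecycle {
    # Any plan that would destroy this server fails with an error
    prevent_destroy = true
  }
}
```

Because image and location force recreation, an accidental edit to either value is caught by this guard at plan time instead of replacing the server.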
Acceptance Criteria
- terraform plan works only with the test Hetzner Project token.
- 2 servers are created after terraform apply.
- The two servers can reach each other through the private network.
- Only 22, 80, 443, and test WireGuard 51820/udp are open at firewall level from the public internet.
- Vault 8200 remains closed from the public internet.
- Terraform state is not committed to the repo.