43 Commits

Author SHA1 Message Date
f23835a30a docs(config): Update template file paths to use 'template/' subdirectory
Reflects a clearer organization for SWAG configuration templates across all roadmap and setup documentation. This standardizes references to template files by explicitly including the `template/` subdirectory, improving clarity and distinction from generated configuration files.
2026-05-23 14:43:04 +03:00
ff9837ec54 feat(infra): update environment infrastructure configurations
- Synchronized environment-specific settings with the new isolated architecture.
- Updated network and storage definitions to match the latest Swarm stack requirements.
- Harmonized configuration templates for consistent cross-environment deployment.
2026-05-22 21:40:21 +03:00
c568e31515 Finalize production database bootstrap automation
Add DB-specific StorageBox ownership variables and make the davfs mount role honor configurable uid and gid values so database containers can access mounted files.

Extend the prod DB node role to sync StorageBox writes, generate and distribute the MongoDB replica set keyfile, wait for the keyfile on each node, and enforce keyfile permissions.

Tune MongoDB and Patroni templates for quieter logging, correct secret variable names, local bootstrap trust, and production network pg_hba coverage.

Refresh the production setup history with the current bootstrap sequence, DB stack deployment workflow, MongoDB replica set initialization, Patroni validation, and completed DB cluster status.
2026-05-21 21:48:11 +03:00
e3787d80f6 docs(infra): align DB stack and APISIX production guidance
Update Environment_Infrastructure to match the current root stack conventions for database images, shared secret names, and APISIX real IP handling.

- update test Ansible DB image defaults to PostGIS 18/PostGIS 3.6 and MongoDB 8.3.2

- align Patroni configuration with DATABASE_POSTGRES_* secret variable names

- document APISIX real IP template configuration and Harbor rebuild workflow

- replace the separate DB stack env file guidance with the shared .env.secrets.shared flow

- update production setup and roadmap snippets to use current PostGIS, MongoDB, and APISIX rebuild commands
2026-05-20 19:55:49 +03:00
9e20f2fcf8 chore(prod): capture production bootstrap access configuration
Document and commit the production bootstrap state after the initial Hetzner and Ansible rollout.

- switch Ansible prod runbooks to use the shared vault password file

- record production admin CIDRs, SSH key path, encrypted group vault, and encrypted per-host vault files

- add generated production inventory and the prod setup history notes from the first bootstrap

- keep root password login disabled while preserving key-based root access for Ansible bootstrap continuity

- document separate Hetzner projects and tokens for test/prod and commit the prod provider lock file

- remove the private Redis firewall allowance from the prod Terraform firewall and matching setup docs
2026-05-19 17:49:59 +03:00
17be81a66e feat(db): align WireGuard DB access with standard ports
- switch WireGuard DB access defaults from proxy ports to 5432/27017

- remove obsolete db stack template for proxy-based DB access

- clean roadmap wording around deprecated DB proxy services
2026-05-19 17:47:23 +03:00
27f4f83f73 docs(prod): resolve cross-layer inconsistencies and complete prod env implementation
Ansible roles:
- act_runner/defaults: set act_runner_name to inventory_hostname (was
  hardcoded to iklim-test-app); added vault_gitea_runner_token to vault.yml
- prod/group_vars/all: restructured from flat files to all/ directory;
  added act_runner_labels override (prod-runner,ubuntu-24.04,hostname);
  added storagebox_managed_directories; added swarm_manager_ip and other
  prod-specific vars
- prod/roles/db_stack: prod-specific db_node tasks using StorageBox paths
  (/mnt/storagebox/db/...) instead of local paths
- docker/tasks: split firewalld loop into all-nodes (Swarm ports) and
  app-only (80/443) tasks
- swarm/tasks: added --advertise-addr private_ip to join commands for
  correct multi-homed node advertisement
- hardening/tasks: corrected firewalld drop zone configuration
- node_dirs/tasks: added /opt/iklimco/vault/data for Vault Raft volume
- db_stack/tasks/app_node: updated stale comment (removed pg-proxy reference)
- db_stack/templates: removed pg-proxy and mongo-proxy service blocks
- test/host_vars/iklim-app-01: added act_runner_name override to preserve
  existing test runner registration

Roadmap and setup docs:
- roadmap/03-infra-stack-changes: added replicas:0 for etcd/postgresql/
  mongodb/pg-proxy/mongo-proxy in prod overlay; updated placement table;
  fixed grafana/data mkdir (auto-created by Ansible); translated Turkish
  note to English
- roadmap/08-deploy-pipeline-update: updated stale "remains idle" note
  for standalone etcd (now disabled with replicas:0)
- roadmap/01-swarm-init-multinode: consistency fixes
- setup/06: added Outputs section and etcd firewall port documentation
- setup/07: removed prometheus/data from StorageBox acceptance criteria;
  replaced manual StorageBox mkdir section with Ansible auto-creation note;
  updated prod README section with full bootstrap instructions and vault docs;
  added act_runner_labels prod policy
- setup/08: extensive rewrite — aligned with Patroni etcd overlay DNS,
  corrected hcloud_firewall.app reference, updated all StorageBox paths
  from /prod/db/ to /db/
- setup/09: removed prometheus/data from acceptance criteria; updated
  runner label policy (removed docker/swarm-manager labels); added
  acceptance criterion for disabled services absent from docker service ls

Terraform:
- prod/firewall.tf: added missing DB subnet mutual rules (etcd, Patroni)
- prod/outputs.tf: added prod_floating_ip and prod_private_ips outputs
- prod/servers.tf: aligned placement group and naming
- prod/variables.tf: corrected variable descriptions
- prod/terraform.tfvars.example: updated defaults
- terraform/hetzner/README.md: new comprehensive README covering both
  test and prod environments with firewall tables and inventory instructions

ansible/README.md: expanded prod section with inventory groups, bootstrap
  run order, runner label policy, and vault variable documentation
2026-05-18 19:17:56 +03:00
8780c7c05e docs(db): implement direct cluster access strategy for production
- Updated roadmap (03-infra-stack-changes.md) to deprecate database proxies in prod.
- Detailed direct subnet access via WireGuard for production developers.
- Provided multi-host connection parameters for Patroni and MongoDB Replica Sets in setup guide (08-prod-db-cluster-kurulum.md).
- Added environment comparison table to developer access guide.
2026-05-18 14:25:26 +03:00
2202e92c2c docs(roadmap): refactor large code block into smaller snippets
- Split single large bash block into four logical sections (Constants, WS, Auth, Global).
- Improved Obsidian rendering compatibility by removing inline explanatory comments from code blocks.
- Enhanced readability with bold subsection headers.
2026-05-18 10:48:09 +03:00
fb80388b97 docs(roadmap): fix bash syntax in rate limit redis snippet
- Removed unnecessary backslashes from RATE_REDIS string definition.
- Optimized quote structure for valid JSON interpolation.
2026-05-18 10:46:18 +03:00
b230d1575e docs(roadmap): include complete rate and connection limit constants
- Expanded the header constants example to cover Global, Auth, and WebSocket limits.
- Provided detailed snippets for using constants in WS_PLUGINS and AUTH_LIMIT blocks.
- Reinforced maintainability standards for the APISIX initialization script.
2026-05-18 10:34:08 +03:00
0302b2e22a docs(roadmap): standardize rate limit configuration with constants
- Recommended defining rate limit thresholds as header constants in init.sh.
- Updated documentation snippet to use GLOBAL_LIMIT_COUNT and GLOBAL_LIMIT_WINDOW variables.
- Improved maintainability for future rate limit adjustments.
2026-05-18 10:23:57 +03:00
507e3a11a1 docs(roadmap): add rabbitmq network aliases for consistent hashing
- Configured 'iklimco-net' aliases for RabbitMQ nodes in prod overlay documentation.
- Updated Step 6 and Step 8 stack snippets to include network aliases and definitions.
- Added a technical note to Step 7 explaining DNS requirements for sticky sessions.
2026-05-18 10:22:14 +03:00
52bd6a59ac docs(roadmap): add sticky session plan for rabbitmq websocket upstream
- Implemented Consistent Hashing (chash) logic in APISIX upstream configuration for prod.
- Added instructions for real IP detection in APISIX configuration template.
- Documented the bypass of Swarm VIP for better session persistence on RabbitMQ nodes.
2026-05-18 09:46:51 +03:00
4c3b7faad6 docs(roadmap): update production environment roadmap and setup guides
- Documented infrastructure changes for Redis Sentinel and RabbitMQ clustering.
- Updated setup guides for Terraform, Ansible, and Swarm node recovery.
- Clarified APISIX rate limit policy and degradation settings.
2026-05-17 18:54:44 +03:00
fd6a0b4f46 docs: fix roadmap inconsistencies between test-env and prod-env
Corrects six documentation files to match the actual deployed pipeline
behavior and align test/prod approaches where they share the same code.

prod-env/02-godaddy-credentials.md
- Step 1: correct secret file from .env.secrets.shared to .env.secrets.swag;
  add clarifying note that .env.secrets.shared holds AppRole/DB secrets
  and must not be used for GoDaddy credentials.
- Step 4: document that GoDaddy A records are now managed automatically
  by the pipeline's 'Update DNS Records' step via the GoDaddy API;
  reference the Gitea variable PROD_FLOATING_IP that must be set once.

prod-env/08-deploy-pipeline-update.md
- Add Step 2 documenting the new 'Update DNS Records' pipeline step
  (GoDaddy API, idempotent check-before-update, requires jq and
  vars.PROD_FLOATING_IP).
- Renumber subsequent steps 3-8 to accommodate the new step.
- Fix DB hostnames in Step 7 (Run Database Init Scripts) from
  iklimco_postgresql/iklimco_mongodb to postgresql/mongodb, matching
  how Swarm overlay DNS resolves service names inside iklimco-net.
- Update context block: correct DB hostname description, replace
  outdated storagebox path note with env-var approach, list new steps.
- Update final step order to 24 steps including the DNS step and
  Release Deploy Lock; mark Wait for etcd as NEW.

prod-env/09-verify.md
- Insert check #2 for the precipitation image directory
  (/mnt/storagebox/precipitation/images) and iklimco_image-data volume
  bind mount, mirroring the equivalent check in test-env/08-verify.md.
- Renumber all subsequent checks (3-12) to maintain sequential ordering.

test-env/03-infra-stack-changes.md
- Update SWAG service volume snippet: replace hardcoded paths
  (swag-vl:/config, /opt/iklimco/swag/dns-conf, /opt/iklimco/swag/site-confs)
  with env-var forms (${SWAG_CONFIG_DIR:-swag-vl}, ${SWAG_DNS_CONF_DIR:-...},
  ${SWAG_SITE_CONFS_DIR:-...}) to match docker-stack-infra.yml.
- Update cert-reloader volume snippet: replace swag-vl and /opt/iklimco/ssl
  with ${SWAG_CONFIG_DIR:-swag-vl} and ${SWAG_CERT_DIR:-/opt/iklimco/ssl},
  enabling StorageBox override in prod without changing the base file.

test-env/04-swag-nginx-configs.md
- Replace RESTRICTED_IP_1/RESTRICTED_IP_2 individual env vars with
  RESTRICTED_IPS (comma-separated CIDR list) in the required-vars section,
  matching env-test/.env and the actual pipeline.
- Update all three IP-restricted template examples (apigw, rabbitmq,
  grafana) from allow ${RESTRICTED_IP_1}; allow ${RESTRICTED_IP_2}; to
  ${RESTRICTED_IPS_BLOCK}, matching the actual .conf.tpl files in the repo.
- Rewrite the deploy step section to match the real pipeline: docker run
  alpine for file writing, RESTRICTED_IPS_BLOCK generation via sed, and
  envsubst with explicit SWAG_VARS filter to protect nginx $upstream_* vars.

test-env/07-deploy-pipeline-update.md
- Step 2 (Prepare SWAG Directories): replace sudo-tee approach with the
  actual docker-run-alpine method used in deploy-test.yml; add nginx
  reload block; update notes to reflect RESTRICTED_IPS_BLOCK generation.
- Step 4 (Re-order): correct step numbering to match actual pipeline
  (21 steps); mark 'Wait for etcd' as already present in pipeline rather
  than a new addition; add Bootstrap Vault TLS Placeholder which was
  missing from the documented order.
2026-05-16 16:52:48 +03:00
5ddba7eba4 docs: update production roadmap for HA Vault and shared storage
- Refactor production setup documentation to reflect a 3-node Vault Raft cluster starting from launch.
- Update all paths to use StorageBox mounts for shared state (SWAG config, TLS certs, Monitoring data).
- Switch Nginx configuration convention from proxy-confs to site-confs to align with SWAG's auto-include behavior.
- Standardize TLS private key extensions to .pem.
- Update node failover and recovery facts to include monitoring services.
- Align deployment pipeline instructions with the latest environment variable-driven approach.
2026-05-16 16:18:21 +03:00
f4b7f49968 chore: prepare prod ansible and db operations
Add the Ansible README and expand prod bootstrap coverage for StorageBox keys, DB labels, DB stack configuration, and act runner setup. Update MongoDB configuration for replica set support and refresh prod roadmap/setup documentation for Swarm labels, StorageBox-backed cert paths, and recovery guidance.
2026-05-15 20:39:57 +03:00
49ea69d805 feat: provision precipitation storage directory
Create managed StorageBox directories from Ansible and document the precipitation image bind mount required by the test Swarm deployment.
2026-05-14 19:14:53 +03:00
bf64c2964c docs: update firewall facts and roadmap mapping
- Include missing WireGuard port (51820/udp) in firewall documentation.
- Synchronize PROD DB firewall rules with the latest Patroni/Swarm setup requirements.
- Complete the PROD section of setup-vs-roadmap-map.md to cover all transition steps.
- Clarify that infra services (Vault, RabbitMQ, etc.) are restricted to private/overlay networks.
2026-05-14 16:26:05 +03:00
3bf0a513f9 docs(setup): sync 01-05 documentation with actual terraform/ansible/workflow
- 01: Add WireGuard 51820/udp to public ingress table; add 9000/tcp
  (APISIX Dashboard) to admin CIDR row in test private rules
- 02: Fix admin_ssh_public_key_path (id_rsa.pub, not id_ed25519.pub);
  add WireGuard 51820/udp to DB firewall table; clarify 9000/9180 port
  descriptions (app subnet access + SWAG proxy)
- 03: Update file structure with new roles (db_stack, wireguard,
  act_runner) and playbooks (test-app/db-post-stack.yml); add floating
  IP systemd service to base role description; clarify node labels
- 04: Clarify two-phase deployment (Ansible prepares dirs/config,
  Gitea CI/CD deploys stack); add WireGuard setup info
- 05: Add system user column to runner table; fix runner name in
  acceptance criteria (iklim-test-app → test-runner)
2026-05-14 16:13:25 +03:00
39ffd4a33b feat(ansible/base): configure Hetzner floating IP via systemd service
Add hetzner-floating-ip.service systemd unit to base role so that
the floating IP is bound to eth0 on every boot. The task is
conditional (runs only when hetzner_floating_ip is defined in
host_vars). Add 49.12.116.113 as the floating IP for iklim-app-01
in test host_vars.
2026-05-14 16:13:24 +03:00
f150d93161 fix(firewall): open ports 80 and 443 in firewalld drop zone
The docker role only opened Swarm ports (2377, 7946, 4789).
HTTP and HTTPS were missing, making SWAG unreachable from outside.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 10:57:28 +03:00
9d7da80ffb decreased log verbosity 2026-05-14 00:01:58 +03:00
f6fa947281 Remove iklim-db stack deploy; update Harbor push docs
- ansible: db_stack app_node ve test-db-post-stack'ten artık kullanılmayan stack deploy adımları kaldırıldı (DB servisleri iklimco stack'ine taşındı)
- setup/05: push-harbor-custom-images.sh artık config dosyalarını kendisi üretiyor, init-base.sh ön adımı kaldırıldı
2026-05-13 21:21:22 +03:00
a9fc0b1234 feat(vpn): add WireGuard and DB proxy services for secure management
- Add new Ansible role `wireguard` to set up WireGuard VPN server on
  DB node with key generation, firewalld rules, and client peer config.
- Introduce `pg-proxy` and `mongo-proxy` socat containers in db_stack
  to expose PostgreSQL (15432) and MongoDB (17017) on host ports,
  restricted to WireGuard subnet via firewalld.
- Update test environment group_vars with WireGuard client entry for
  `murat-inspiron-15-3525`.
- Modify act_runner config: set `docker_host` to unix socket, remove
  explicit socket mount from options, and change runner label image to
  `catthehacker/ubuntu:act-22.04`.
- Open UDP port 51820 in Hetzner firewall for WireGuard inbound.
- Adjust test-db-post-stack playbook to include wireguard role (tagged).
- Update roadmap document with APISIX init step order.
2026-05-13 18:51:04 +03:00
ed51b6eedd feat(vpn): add WireGuard and DB proxy services for secure management
- Add new Ansible role `wireguard` to set up WireGuard VPN server on
  DB node with key generation, firewalld rules, and client peer config.
- Introduce `pg-proxy` and `mongo-proxy` socat containers in db_stack
  to expose PostgreSQL (15432) and MongoDB (17017) on host ports,
  restricted to WireGuard subnet via firewalld.
- Update test environment group_vars with WireGuard client entry for
  `murat-inspiron-15-3525`.
- Modify act_runner config: set `docker_host` to unix socket, remove
  explicit socket mount from options, and change runner label image to
  `catthehacker/ubuntu:act-22.04`.
- Open UDP port 51820 in Hetzner firewall for WireGuard inbound.
- Adjust test-db-post-stack playbook to include wireguard role (tagged).
- Update roadmap document with APISIX init step order.
2026-05-13 18:50:07 +03:00
5fe57ee108 Implement: Declarative act_runner configuration and Docker integration
Migrates `act_runner` configuration from shell-generated to an Ansible-templated `config.yaml`. This enables:
- Dynamic label provisioning, including `test-runner:docker://ubuntu:22.04`.
- Explicit configuration for joining the `iklimco-net` overlay network.
- Docker socket mounting for CI/CD jobs to interact with the Docker daemon.

Updates `setup/05-test-runner-ve-deploy-onkosullari.md` and other related documentation to reflect the new automated and integrated runner setup.
2026-05-12 19:49:24 +03:00
2198f932cd Implement: Gitea Actions runner, automated DB stack, and Turkish localization
*   Introduces an Ansible role for installing and registering `act_runner` for Gitea Actions.
*   Automates PostgreSQL and MongoDB deployment on Docker Swarm in the test environment, leveraging Docker named volumes for data persistence.
*   Translates core documentation, including `README.md` and `setup/04-test-db-docker-kurulum.md`, to Turkish.
*   Adds comprehensive documentation for firewall architecture (`facts/firewall.md`) and Docker Swarm node recovery (`facts/swarm-node-recovery.md`).
*   Enhances security hardening by ensuring `fail2ban` is enabled and streamlining admin SSH key management via Ansible.
*   Updates Ansible vault structure to support new secret variables and adds `.vault_pass` to `.gitignore`.
2026-05-12 18:34:24 +03:00
bbeaf97815 Implement: Administrative user, keyboard layout, and Ansible variable refactor
This commit introduces several core configurations and structural improvements:

*   **User Management:** Creates a new `iklim` administrative user with a securely hashed password, enabled by `python3-passlib`.
*   **System Configuration:** Sets the system keyboard layout to Turkish Q (`trq`).
*   **Security Hardening:** Refines firewall rules for SSH using a rich rule and ensures `journald` log limits file creation.
*   **Ansible Variable Management:** Restructures `group_vars` by consolidating global variables into `group_vars/all/vars.yml` and sensitive data into a dedicated `group_vars/all/vault.yml`.
*   **Ansible Compatibility:** Adds `!unsafe` to a `docker info` shell command to prevent future warnings.
2026-05-11 19:00:31 +03:00
65443e81e7 Refine: Update documentation to accurately reflect Ansible assets and structure
This commit brings the `README.md` and Ansible setup guides (`03-test-ansible-bootstrap.md`, `07-prod-ansible-bootstrap.md`) in sync with the current state of the Ansible automation.

Key updates include:
- Acknowledging the presence of in-repository Ansible playbooks and shared roles.
- Correcting Ansible inventory output paths and Terraform output commands.
- Detailing the new `group_vars/all/{vars.yml,vault.yml}` structure.
- Updating Ansible prerequisites to include `passlib` for password hashing.
- Adding documentation for `iklim` system user creation, keyboard layout, and refined firewall rules.
- Removing outdated "Known Gaps" related to missing Ansible code.
2026-05-11 19:00:10 +03:00
f73504c0f2 Implement: Initial Ansible environment bootstrapping and core roles
This commit introduces the foundational Ansible playbooks, roles, and configurations for automated provisioning of both production and test environments.

Key capabilities include:
-   **Base System Setup:** Common packages, timezone, chrony, and hostname.
-   **Security Hardening:** SELinux disable, SSH configuration, `dnf-automatic`, `fail2ban`, `firewalld` setup, and `journald` log limits.
-   **Docker & Swarm:** Docker installation and configuration, Docker Swarm initialization/joining for managers and workers, overlay network creation, and node labeling.
-   **Storage:** Hetzner StorageBox integration using `davfs2`.
-   **Directory Structure:** Creation of application and database-specific directories.

This establishes a comprehensive, automated pipeline for infrastructure deployment and initial configuration.
2026-05-11 17:51:43 +03:00
58b6fdc605 Refactor: Rename swarm infrastructure components to app
This commit systematically updates all Terraform configurations, including resources, variables, and labels, to use the more generic `app` designation instead of `swarm`. This improves consistency and decouples the infrastructure naming from a specific container orchestration technology like Docker Swarm.
2026-05-11 17:51:25 +03:00
a25fa37aec Refine Ansible bootstrapping documentation for simplified command execution
Adjusts documentation for test and production Ansible bootstrapping to leverage `ansible.cfg`. Commands are now run from specific environment directories (`ansible/test/` or `ansible/prod/`), eliminating the need to explicitly specify inventory and playbook paths.

Also adds an initial step to install required Ansible collections using `ansible-galaxy`.
2026-05-11 17:51:01 +03:00
bf8f011e43 Restructure setup documentation and refine environment bootstrapping
This commit introduces a reordered and renumbered set of setup documentation files to better reflect the deployment stages for both test and production environments.

Key changes include:
*   A new `setup-vs-roadmap-map.md` file to provide a clear mapping between roadmap tasks and their corresponding setup phases.
*   Significantly expanded Ansible bootstrap documentation for both test and production, detailing Docker, Swarm, security hardening, and StorageBox SSH key management roles.
*   Formalized database Docker and Swarm cluster setup instructions for test and production, including explicit steps for Swarm worker integration of DB nodes.
*   Updated roadmap documentation (`roadmap/prod-env/*`) to align with the refined setup, incorporating correct private IP addresses for Swarm joins, new node labels, and floating IP usage for GoDaddy DNS records.
2026-05-11 17:47:30 +03:00
42dfe76487 Introduce comprehensive README for environment infrastructure
This new README serves as the central documentation for the `iklim.co` Hetzner Cloud environment infrastructure repository. It outlines the project's purpose, scope, and structure, alongside detailed setup guides, security baselines, environment topologies, and usage instructions for Terraform.
2026-05-11 14:55:03 +03:00
03ad812512 Refine Hetzner firewall rules and update server types
Overhaul and expand firewall definitions for both `prod` and `test` environments to enable comprehensive inter-subnet communication.

This includes implementing explicit rules supporting:
- Docker Swarm overlay networks between application and database subnets.
- High-availability database clusters (PostgreSQL replication, MongoDB replica sets, Patroni, etcd).
- Internal access for various infrastructure services (Vault, Redis, RabbitMQ, APISIX, Prometheus, Grafana).

All firewall rule descriptions are standardized in English for improved clarity and consistency.

Additionally, update default `server_type_swarm` and `server_type_db` variables to the recommended `CPX` series for both environments. An initial generated Ansible inventory for the test environment is also added.
2026-05-11 14:54:46 +03:00
b115a4cbdf Implement Hetzner sizing report recommendations and detailed DB setups
- Add `hetzner-sizing-report.md` defining data-driven server type recommendations for test and prod environments.
- Update Terraform configurations to align with the recommended `CPX` server types and refine firewall rules for Docker Swarm and database interactions.
- Introduce comprehensive documentation and stack files for:
    - Single-node PostgreSQL/MongoDB deployment on a test DB worker node.
    - High-availability 3-node MongoDB replica set and Patroni+etcd PostgreSQL cluster for production.
- Enhance Ansible bootstrap roles with SELinux disabling, fail2ban configuration, and StorageBox SSH key management for CI/CD.
- Reorganize and rename setup documentation files for improved structure and clarity.
2026-05-11 14:54:09 +03:00
76f87aa2f9 Integrate DB nodes into Swarm and refine prod service deployment
- Database nodes now join the Docker Swarm as workers with `role=db` labels, allowing Swarm to manage their dedicated services.
- The `docker-stack-infra.yml` has been updated for production to focus solely on application-level infrastructure components.
- Dedicated database services (PostgreSQL, MongoDB, Patroni-etcd) are now explicitly deployed in separate Swarm stacks on `iklim-db-XX` nodes.
- Standardizes node naming conventions (`iklim-app-XX`, `iklim-db-XX`) across the production roadmap documentation.
- Clarifies that the `etcd` service within `docker-stack-infra.yml` is exclusively for APISIX configuration, distinct from Patroni's etcd cluster.
2026-05-11 14:53:21 +03:00
720c79d460 Add Hetzner Cloud production infrastructure with multi-node support
- This commit introduces the Terraform configuration to provision a production environment on Hetzner Cloud, building on the existing test setup.
- Key improvements and new features include:
* **Multi-node clusters:** Scaling to 3-node Swarm application and database clusters for improved resilience.
* **High availability:** Utilizing a Hetzner Floating IP for the application entry point and `spread` placement groups for fault tolerance across physical hosts.
* **Enhanced network security:** Internal management services (RabbitMQ, APISIX, Prometheus, Grafana) are restricted to the application subnet, expected to be accessed via an internal reverse proxy (SWAG).
* **Internal database replication:** New firewall rules enable PostgreSQL replication and MongoDB replica set traffic within the database subnet.
* **Refined test environment:** Updates to align `test` configuration with the new `prod` structure, including a dedicated floating IP and adjusted firewall rules.
* **Configuration standardization:** Environment-specific details moved to `locals.tf` for clarity, with upgraded server types and migration to Rocky Linux as the base image.
- Updates were also made to the latest version of Terraform to ensure consistency in the documentation
2026-05-10 15:43:22 +03:00
2d515f7206 Add initial Terraform infrastructure for Hetzner test environment
This commit introduces the foundational Infrastructure-as-Code for provisioning a test environment on Hetzner Cloud. It defines server nodes, private networking, comprehensive firewalls, and includes documentation on resource lifecycle and safe configuration practices.
2026-05-10 14:09:23 +03:00
81c38e8d39 initial commit 2026-05-09 16:26:06 +03:00
93412c9868 first commit 2026-05-09 15:28:03 +03:00