16 Commits

Author SHA1 Message Date
27f4f83f73 docs(prod): resolve cross-layer inconsistencies and complete prod env implementation
Ansible roles:
- act_runner/defaults: set act_runner_name to inventory_hostname (was
  hardcoded to iklim-test-app); added vault_gitea_runner_token to vault.yml
- prod/group_vars/all: restructured from flat files to all/ directory;
  added act_runner_labels override (prod-runner,ubuntu-24.04,hostname);
  added storagebox_managed_directories; added swarm_manager_ip and other
  prod-specific vars
- prod/roles/db_stack: prod-specific db_node tasks using StorageBox paths
  (/mnt/storagebox/db/...) instead of local paths
- docker/tasks: split firewalld loop into all-nodes (Swarm ports) and
  app-only (80/443) tasks
- swarm/tasks: added --advertise-addr private_ip to join commands for
  correct multi-homed node advertisement
- hardening/tasks: corrected firewalld drop zone configuration
- node_dirs/tasks: added /opt/iklimco/vault/data for Vault Raft volume
- db_stack/tasks/app_node: updated stale comment (removed pg-proxy reference)
- db_stack/templates: removed pg-proxy and mongo-proxy service blocks
- test/host_vars/iklim-app-01: added act_runner_name override to preserve
  existing test runner registration

Roadmap and setup docs:
- roadmap/03-infra-stack-changes: added replicas:0 for etcd/postgresql/
  mongodb/pg-proxy/mongo-proxy in prod overlay; updated placement table;
  fixed grafana/data mkdir (auto-created by Ansible); translated Turkish
  note to English
- roadmap/08-deploy-pipeline-update: updated stale "remains idle" note
  for standalone etcd (now disabled with replicas:0)
- roadmap/01-swarm-init-multinode: consistency fixes
- setup/06: added Outputs section and etcd firewall port documentation
- setup/07: removed prometheus/data from StorageBox acceptance criteria;
  replaced manual StorageBox mkdir section with Ansible auto-creation note;
  updated prod README section with full bootstrap instructions and vault docs;
  added act_runner_labels prod policy
- setup/08: extensive rewrite — aligned with Patroni etcd overlay DNS,
  corrected hcloud_firewall.app reference, updated all StorageBox paths
  from /prod/db/ to /db/
- setup/09: removed prometheus/data from acceptance criteria; updated
  runner label policy (removed docker/swarm-manager labels); added
  acceptance criterion for disabled services absent from docker service ls

Terraform:
- prod/firewall.tf: added missing DB subnet mutual rules (etcd, Patroni)
- prod/outputs.tf: added prod_floating_ip and prod_private_ips outputs
- prod/servers.tf: aligned placement group and naming
- prod/variables.tf: corrected variable descriptions
- prod/terraform.tfvars.example: updated defaults
- terraform/hetzner/README.md: new comprehensive README covering both
  test and prod environments with firewall tables and inventory instructions

ansible/README.md: expanded prod section with inventory groups, bootstrap
  run order, runner label policy, and vault variable documentation
2026-05-18 19:17:56 +03:00
8780c7c05e docs(db): implement direct cluster access strategy for production
- Updated roadmap (03-infra-stack-changes.md) to deprecate database proxies in prod.
- Detailed direct subnet access via WireGuard for production developers.
- Provided multi-host connection parameters for Patroni and MongoDB Replica Sets in setup guide (08-prod-db-cluster-kurulum.md).
- Added environment comparison table to developer access guide.
2026-05-18 14:25:26 +03:00
4c3b7faad6 docs(roadmap): update production environment roadmap and setup guides
- Documented infrastructure changes for Redis Sentinel and RabbitMQ clustering.
- Updated setup guides for Terraform, Ansible, and Swarm node recovery.
- Clarified APISIX rate limit policy and degradation settings.
2026-05-17 18:54:44 +03:00
f4b7f49968 chore: prepare prod ansible and db operations
Add the Ansible README and expand prod bootstrap coverage for StorageBox keys, DB labels, DB stack configuration, and act runner setup. Update MongoDB configuration for replica set support and refresh prod roadmap/setup documentation for Swarm labels, StorageBox-backed cert paths, and recovery guidance.
2026-05-15 20:39:57 +03:00
49ea69d805 feat: provision precipitation storage directory
Create managed StorageBox directories from Ansible and document the precipitation image bind mount required by the test Swarm deployment.
2026-05-14 19:14:53 +03:00
3bf0a513f9 docs(setup): sync 01-05 documentation with actual terraform/ansible/workflow
- 01: Add WireGuard 51820/udp to public ingress table; add 9000/tcp
  (APISIX Dashboard) to admin CIDR row in test private rules
- 02: Fix admin_ssh_public_key_path (id_rsa.pub, not id_ed25519.pub);
  add WireGuard 51820/udp to DB firewall table; clarify 9000/9180 port
  descriptions (app subnet access + SWAG proxy)
- 03: Update file structure with new roles (db_stack, wireguard,
  act_runner) and playbooks (test-app/db-post-stack.yml); add floating
  IP systemd service to base role description; clarify node labels
- 04: Clarify two-phase deployment (Ansible prepares dirs/config,
  Gitea CI/CD deploys stack); add WireGuard setup info
- 05: Add system user column to runner table; fix runner name in
  acceptance criteria (iklim-test-app → test-runner)
2026-05-14 16:13:25 +03:00
f6fa947281 Remove iklim-db stack deploy; update Harbor push docs
- ansible: db_stack app_node ve test-db-post-stack'ten artık kullanılmayan stack deploy adımları kaldırıldı (DB servisleri iklimco stack'ine taşındı)
- setup/05: push-harbor-custom-images.sh artık config dosyalarını kendisi üretiyor, init-base.sh ön adımı kaldırıldı
2026-05-13 21:21:22 +03:00
5fe57ee108 Implement: Declarative act_runner configuration and Docker integration
Migrates `act_runner` configuration from shell-generated to an Ansible-templated `config.yaml`. This enables:
- Dynamic label provisioning, including `test-runner:docker://ubuntu:22.04`.
- Explicit configuration for joining the `iklimco-net` overlay network.
- Docker socket mounting for CI/CD jobs to interact with the Docker daemon.

Updates `setup/05-test-runner-ve-deploy-onkosullari.md` and other related documentation to reflect the new automated and integrated runner setup.
2026-05-12 19:49:24 +03:00
2198f932cd Implement: Gitea Actions runner, automated DB stack, and Turkish localization
*   Introduces an Ansible role for installing and registering `act_runner` for Gitea Actions.
*   Automates PostgreSQL and MongoDB deployment on Docker Swarm in the test environment, leveraging Docker named volumes for data persistence.
*   Translates core documentation, including `README.md` and `setup/04-test-db-docker-kurulum.md`, to Turkish.
*   Adds comprehensive documentation for firewall architecture (`facts/firewall.md`) and Docker Swarm node recovery (`facts/swarm-node-recovery.md`).
*   Enhances security hardening by ensuring `fail2ban` is enabled and streamlining admin SSH key management via Ansible.
*   Updates Ansible vault structure to support new secret variables and adds `.vault_pass` to `.gitignore`.
2026-05-12 18:34:24 +03:00
65443e81e7 Refine: Update documentation to accurately reflect Ansible assets and structure
This commit brings the `README.md` and Ansible setup guides (`03-test-ansible-bootstrap.md`, `07-prod-ansible-bootstrap.md`) in sync with the current state of the Ansible automation.

Key updates include:
- Acknowledging the presence of in-repository Ansible playbooks and shared roles.
- Correcting Ansible inventory output paths and Terraform output commands.
- Detailing the new `group_vars/all/{vars.yml,vault.yml}` structure.
- Updating Ansible prerequisites to include `passlib` for password hashing.
- Adding documentation for `iklim` system user creation, keyboard layout, and refined firewall rules.
- Removing outdated "Known Gaps" related to missing Ansible code.
2026-05-11 19:00:10 +03:00
a25fa37aec Refine Ansible bootstrapping documentation for simplified command execution
Adjusts documentation for test and production Ansible bootstrapping to leverage `ansible.cfg`. Commands are now run from specific environment directories (`ansible/test/` or `ansible/prod/`), eliminating the need to explicitly specify inventory and playbook paths.

Also adds an initial step to install required Ansible collections using `ansible-galaxy`.
2026-05-11 17:51:01 +03:00
bf8f011e43 Restructure setup documentation and refine environment bootstrapping
This commit introduces a reordered and renumbered set of setup documentation files to better reflect the deployment stages for both test and production environments.

Key changes include:
*   A new `setup-vs-roadmap-map.md` file to provide a clear mapping between roadmap tasks and their corresponding setup phases.
*   Significantly expanded Ansible bootstrap documentation for both test and production, detailing Docker, Swarm, security hardening, and StorageBox SSH key management roles.
*   Formalized database Docker and Swarm cluster setup instructions for test and production, including explicit steps for Swarm worker integration of DB nodes.
*   Updated roadmap documentation (`roadmap/prod-env/*`) to align with the refined setup, incorporating correct private IP addresses for Swarm joins, new node labels, and floating IP usage for GoDaddy DNS records.
2026-05-11 17:47:30 +03:00
b115a4cbdf Implement Hetzner sizing report recommendations and detailed DB setups
- Add `hetzner-sizing-report.md` defining data-driven server type recommendations for test and prod environments.
- Update Terraform configurations to align with the recommended `CPX` server types and refine firewall rules for Docker Swarm and database interactions.
- Introduce comprehensive documentation and stack files for:
    - Single-node PostgreSQL/MongoDB deployment on a test DB worker node.
    - High-availability 3-node MongoDB replica set and Patroni+etcd PostgreSQL cluster for production.
- Enhance Ansible bootstrap roles with SELinux disabling, fail2ban configuration, and StorageBox SSH key management for CI/CD.
- Reorganize and rename setup documentation files for improved structure and clarity.
2026-05-11 14:54:09 +03:00
720c79d460 Add Hetzner Cloud production infrastructure with multi-node support
- This commit introduces the Terraform configuration to provision a production environment on Hetzner Cloud, building on the existing test setup.
- Key improvements and new features include:
* **Multi-node clusters:** Scaling to 3-node Swarm application and database clusters for improved resilience.
* **High availability:** Utilizing a Hetzner Floating IP for the application entry point and `spread` placement groups for fault tolerance across physical hosts.
* **Enhanced network security:** Internal management services (RabbitMQ, APISIX, Prometheus, Grafana) are restricted to the application subnet, expected to be accessed via an internal reverse proxy (SWAG).
* **Internal database replication:** New firewall rules enable PostgreSQL replication and MongoDB replica set traffic within the database subnet.
* **Refined test environment:** Updates to align `test` configuration with the new `prod` structure, including a dedicated floating IP and adjusted firewall rules.
* **Configuration standardization:** Environment-specific details moved to `locals.tf` for clarity, with upgraded server types and migration to Rocky Linux as the base image.
- Updates were also made to the latest version of Terraform to ensure consistency in the documentation
2026-05-10 15:43:22 +03:00
2d515f7206 Add initial Terraform infrastructure for Hetzner test environment
This commit introduces the foundational Infrastructure-as-Code for provisioning a test environment on Hetzner Cloud. It defines server nodes, private networking, comprehensive firewalls, and includes documentation on resource lifecycle and safe configuration practices.
2026-05-10 14:09:23 +03:00
81c38e8d39 initial commit 2026-05-09 16:26:06 +03:00