23 Commits

Author SHA1 Message Date
51933afea6 feat(infra): Refactor Swarm networking for reliable DNS and stack ownership
Moves `iklimco-net` overlay network creation to be managed by the Docker Swarm stack, ensuring reliable embedded DNS resolution for inter-service communication. This resolves issues where services on external overlay networks failed to discover each other via Docker DNS.

This refactoring includes:
*   Removing the manual `iklimco-net` creation from the Ansible `swarm` role.
*   Adjusting `act_runner` configuration to connect job containers to `iklimco-net` only after the stack has deployed and created the network.
*   Setting `storagebox_file_mode` to `0600` for DB nodes to prevent "too open" errors with MongoDB keyfiles.
*   Provisioning dedicated bind mount directories for MongoDB and PostgreSQL on DB nodes with correct ownership and permissions.
*   Updating documentation to reflect the consolidated stack and network changes.
2026-05-26 01:08:12 +03:00
6798426841 feat(infra): Implement multi-user admin SSH key management
Centralize and manage multiple administrator SSH public keys for server access and streamline administrative tasks.

This change:
- Allows provisioning of multiple admin SSH keys to the `iklim` user for human access.
- Adds the same admin SSH keys to the `root` user for emergency or bootstrap scenarios.
- Grants the `iklim` user passwordless sudo privileges to simplify administrative operations.
- Replaces the single `admin_ssh_public_key_path` variable with a list of keys, accommodating multiple administrators.
2026-05-24 21:01:54 +03:00
28ce381059 add murat home ip to server firewalls
2026-05-24 19:24:36 +03:00
3641f1a87e feat(infra): Improve StorageBox mounting reliability and directory management
Refactor StorageBox mount logic for greater stability and consistent remounts by utilizing shell commands. Enable `user_allow_other` for davfs2 mounts in `/etc/fuse.conf` and `fstab`, ensuring non-root container access to mounted files.

Standardize SWAG configuration directory provisioning to include specific subdirectories for DNS, proxy, and Certbot files. Streamline local `/opt/iklimco` directory creation on app and db nodes, removing obsolete paths and consolidating relevant service directories.
2026-05-24 16:27:00 +03:00
6f9d0d1588 feat(infra): Standardize StorageBox permissions and refactor DB stack name
- Ensure consistent directory and file permissions on StorageBox mounts for improved container access across application and database services.
- Introduce application-specific `storagebox_uid`/`gid` variables for more granular ownership control.
- Enhance StorageBox mount reliability by adding systemd reload and remount handlers for configuration changes.
- Add root credentials to Patroni's etcd configuration for authenticated communication.
- Update all relevant documentation and deployment scripts to use the `iklimco` Docker stack name for database services.
- Re-encrypt production vault secrets to include the new etcd password.
2026-05-23 18:11:01 +03:00
ff9837ec54 feat(infra): update environment infrastructure configurations
- Synchronized environment-specific settings with the new isolated architecture.
- Updated network and storage definitions to match the latest Swarm stack requirements.
- Harmonized configuration templates for consistent cross-environment deployment.
2026-05-22 21:40:21 +03:00
c568e31515 Finalize production database bootstrap automation
Add DB-specific StorageBox ownership variables and make the davfs mount role honor configurable uid and gid values so database containers can access mounted files.

Extend the prod DB node role to sync StorageBox writes, generate and distribute the MongoDB replica set keyfile, wait for the keyfile on each node, and enforce keyfile permissions.

Tune MongoDB and Patroni templates for quieter logging, correct secret variable names, local bootstrap trust, and production network pg_hba coverage.

Refresh the production setup history with the current bootstrap sequence, DB stack deployment workflow, MongoDB replica set initialization, Patroni validation, and completed DB cluster status.
2026-05-21 21:48:11 +03:00
e3787d80f6 docs(infra): align DB stack and APISIX production guidance
Update Environment_Infrastructure to match the current root stack conventions for database images, shared secret names, and APISIX real IP handling.

- update test Ansible DB image defaults to PostGIS 18/PostGIS 3.6 and MongoDB 8.3.2

- align Patroni configuration with DATABASE_POSTGRES_* secret variable names

- document APISIX real IP template configuration and Harbor rebuild workflow

- replace the separate DB stack env file guidance with the shared .env.secrets.shared flow

- update production setup and roadmap snippets to use current PostGIS, MongoDB, and APISIX rebuild commands
2026-05-20 19:55:49 +03:00
9e20f2fcf8 chore(prod): capture production bootstrap access configuration
Document and commit the production bootstrap state after the initial Hetzner and Ansible rollout.

- switch Ansible prod runbooks to use the shared vault password file

- record production admin CIDRs, SSH key path, encrypted group vault, and encrypted per-host vault files

- add generated production inventory and the prod setup history notes from the first bootstrap

- keep root password login disabled while preserving key-based root access for Ansible bootstrap continuity

- document separate Hetzner projects and tokens for test/prod and commit the prod provider lock file

- remove the private Redis firewall allowance from the prod Terraform firewall and matching setup docs
2026-05-19 17:49:59 +03:00
17be81a66e feat(db): align WireGuard DB access with standard ports
- switch WireGuard DB access defaults from proxy ports to 5432/27017

- remove obsolete db stack template for proxy-based DB access

- clean roadmap wording around deprecated DB proxy services
2026-05-19 17:47:23 +03:00
27f4f83f73 docs(prod): resolve cross-layer inconsistencies and complete prod env implementation
Ansible roles:
- act_runner/defaults: set act_runner_name to inventory_hostname (was
  hardcoded to iklim-test-app); added vault_gitea_runner_token to vault.yml
- prod/group_vars/all: restructured from flat files to all/ directory;
  added act_runner_labels override (prod-runner,ubuntu-24.04,hostname);
  added storagebox_managed_directories; added swarm_manager_ip and other
  prod-specific vars
- prod/roles/db_stack: prod-specific db_node tasks using StorageBox paths
  (/mnt/storagebox/db/...) instead of local paths
- docker/tasks: split firewalld loop into all-nodes (Swarm ports) and
  app-only (80/443) tasks
- swarm/tasks: added --advertise-addr private_ip to join commands for
  correct multi-homed node advertisement
- hardening/tasks: corrected firewalld drop zone configuration
- node_dirs/tasks: added /opt/iklimco/vault/data for Vault Raft volume
- db_stack/tasks/app_node: updated stale comment (removed pg-proxy reference)
- db_stack/templates: removed pg-proxy and mongo-proxy service blocks
- test/host_vars/iklim-app-01: added act_runner_name override to preserve
  existing test runner registration

Roadmap and setup docs:
- roadmap/03-infra-stack-changes: added replicas:0 for etcd/postgresql/
  mongodb/pg-proxy/mongo-proxy in prod overlay; updated placement table;
  fixed grafana/data mkdir (auto-created by Ansible); translated Turkish
  note to English
- roadmap/08-deploy-pipeline-update: updated stale "remains idle" note
  for standalone etcd (now disabled with replicas:0)
- roadmap/01-swarm-init-multinode: consistency fixes
- setup/06: added Outputs section and etcd firewall port documentation
- setup/07: removed prometheus/data from StorageBox acceptance criteria;
  replaced manual StorageBox mkdir section with Ansible auto-creation note;
  updated prod README section with full bootstrap instructions and vault docs;
  added act_runner_labels prod policy
- setup/08: extensive rewrite — aligned with Patroni etcd overlay DNS,
  corrected hcloud_firewall.app reference, updated all StorageBox paths
  from /prod/db/ to /db/
- setup/09: removed prometheus/data from acceptance criteria; updated
  runner label policy (removed docker/swarm-manager labels); added
  acceptance criterion for disabled services absent from docker service ls

Terraform:
- prod/firewall.tf: added missing DB subnet mutual rules (etcd, Patroni)
- prod/outputs.tf: added prod_floating_ip and prod_private_ips outputs
- prod/servers.tf: aligned placement group and naming
- prod/variables.tf: corrected variable descriptions
- prod/terraform.tfvars.example: updated defaults
- terraform/hetzner/README.md: new comprehensive README covering both
  test and prod environments with firewall tables and inventory instructions

ansible/README.md: expanded prod section with inventory groups, bootstrap
  run order, runner label policy, and vault variable documentation
2026-05-18 19:17:56 +03:00
f4b7f49968 chore: prepare prod ansible and db operations
Add the Ansible README and expand prod bootstrap coverage for StorageBox keys, DB labels, DB stack configuration, and act runner setup. Update MongoDB configuration for replica set support and refresh prod roadmap/setup documentation for Swarm labels, StorageBox-backed cert paths, and recovery guidance.
2026-05-15 20:39:57 +03:00
49ea69d805 feat: provision precipitation storage directory
Create managed StorageBox directories from Ansible and document the precipitation image bind mount required by the test Swarm deployment.
2026-05-14 19:14:53 +03:00
39ffd4a33b feat(ansible/base): configure Hetzner floating IP via systemd service
Add hetzner-floating-ip.service systemd unit to base role so that
the floating IP is bound to eth0 on every boot. The task is
conditional (runs only when hetzner_floating_ip is defined in
host_vars). Add 49.12.116.113 as the floating IP for iklim-app-01
in test host_vars.
2026-05-14 16:13:24 +03:00
f150d93161 fix(firewall): open ports 80 and 443 in firewalld drop zone
The docker role only opened Swarm ports (2377, 7946, 4789).
HTTP and HTTPS were missing, making SWAG unreachable from outside.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 10:57:28 +03:00
9d7da80ffb decreased log verbosity
2026-05-14 00:01:58 +03:00
f6fa947281 Remove iklim-db stack deploy; update Harbor push docs
- ansible: removed the no-longer-used stack deploy steps from db_stack app_node and test-db-post-stack (DB services were moved to the iklimco stack)
- setup/05: push-harbor-custom-images.sh now generates the config files itself; the init-base.sh preliminary step was removed
2026-05-13 21:21:22 +03:00
ed51b6eedd feat(vpn): add WireGuard and DB proxy services for secure management
- Add new Ansible role `wireguard` to set up WireGuard VPN server on
  DB node with key generation, firewalld rules, and client peer config.
- Introduce `pg-proxy` and `mongo-proxy` socat containers in db_stack
  to expose PostgreSQL (15432) and MongoDB (17017) on host ports,
  restricted to WireGuard subnet via firewalld.
- Update test environment group_vars with WireGuard client entry for
  `murat-inspiron-15-3525`.
- Modify act_runner config: set `docker_host` to unix socket, remove
  explicit socket mount from options, and change runner label image to
  `catthehacker/ubuntu:act-22.04`.
- Open UDP port 51820 in Hetzner firewall for WireGuard inbound.
- Adjust test-db-post-stack playbook to include wireguard role (tagged).
- Update roadmap document with APISIX init step order.
2026-05-13 18:50:07 +03:00
5fe57ee108 Implement: Declarative act_runner configuration and Docker integration
Migrates `act_runner` configuration from shell-generated to an Ansible-templated `config.yaml`. This enables:
- Dynamic label provisioning, including `test-runner:docker://ubuntu:22.04`.
- Explicit configuration for joining the `iklimco-net` overlay network.
- Docker socket mounting for CI/CD jobs to interact with the Docker daemon.

Updates `setup/05-test-runner-ve-deploy-onkosullari.md` and other related documentation to reflect the new automated and integrated runner setup.
2026-05-12 19:49:24 +03:00
2198f932cd Implement: Gitea Actions runner, automated DB stack, and Turkish localization
*   Introduces an Ansible role for installing and registering `act_runner` for Gitea Actions.
*   Automates PostgreSQL and MongoDB deployment on Docker Swarm in the test environment, leveraging Docker named volumes for data persistence.
*   Translates core documentation, including `README.md` and `setup/04-test-db-docker-kurulum.md`, to Turkish.
*   Adds comprehensive documentation for firewall architecture (`facts/firewall.md`) and Docker Swarm node recovery (`facts/swarm-node-recovery.md`).
*   Enhances security hardening by ensuring `fail2ban` is enabled and streamlining admin SSH key management via Ansible.
*   Updates Ansible vault structure to support new secret variables and adds `.vault_pass` to `.gitignore`.
2026-05-12 18:34:24 +03:00
bbeaf97815 Implement: Administrative user, keyboard layout, and Ansible variable refactor
This commit introduces several core configurations and structural improvements:

*   **User Management:** Creates a new `iklim` administrative user with a securely hashed password, enabled by `python3-passlib`.
*   **System Configuration:** Sets the system keyboard layout to Turkish Q (`trq`).
*   **Security Hardening:** Refines firewall rules for SSH using a rich rule and ensures the `journald` log limits file is created.
*   **Ansible Variable Management:** Restructures `group_vars` by consolidating global variables into `group_vars/all/vars.yml` and sensitive data into a dedicated `group_vars/all/vault.yml`.
*   **Ansible Compatibility:** Adds `!unsafe` to a `docker info` shell command to prevent future warnings.
2026-05-11 19:00:31 +03:00
f73504c0f2 Implement: Initial Ansible environment bootstrapping and core roles
This commit introduces the foundational Ansible playbooks, roles, and configurations for automated provisioning of both production and test environments.

Key capabilities include:
-   **Base System Setup:** Common packages, timezone, chrony, and hostname.
-   **Security Hardening:** Disabling SELinux, SSH configuration, `dnf-automatic`, `fail2ban`, `firewalld` setup, and `journald` log limits.
-   **Docker & Swarm:** Docker installation and configuration, Docker Swarm initialization/joining for managers and workers, overlay network creation, and node labeling.
-   **Storage:** Hetzner StorageBox integration using `davfs2`.
-   **Directory Structure:** Creation of application and database-specific directories.

This establishes a comprehensive, automated pipeline for infrastructure deployment and initial configuration.
2026-05-11 17:51:43 +03:00
03ad812512 Refine Hetzner firewall rules and update server types
Overhaul and expand firewall definitions for both `prod` and `test` environments to enable comprehensive inter-subnet communication.

This includes implementing explicit rules supporting:
- Docker Swarm overlay networks between application and database subnets.
- High-availability database clusters (PostgreSQL replication, MongoDB replica sets, Patroni, etcd).
- Internal access for various infrastructure services (Vault, Redis, RabbitMQ, APISIX, Prometheus, Grafana).

All firewall rule descriptions are standardized in English for improved clarity and consistency.

Additionally, update default `server_type_swarm` and `server_type_db` variables to the recommended `CPX` series for both environments. An initial generated Ansible inventory for the test environment is also added.
2026-05-11 14:54:46 +03:00