26 Commits

Author SHA1 Message Date
8875af8e8a docs: fix roadmap and setup reference direction
Remove setup runbook references from prod roadmap docs so roadmap remains design intent only. Keep setup-to-roadmap links, but normalize them to explicit relative paths.
2026-06-15 19:57:21 +03:00
67f4c10c93 docs(roadmap): update various roadmap docs to align with latest infrastructure setup
- Synchronized swarm initialization, pipeline update, and certificate reloader instructions with the new monolithic stack logic and Ansible roles.
2026-06-15 16:48:04 +03:00
67dc2986dd docs(infra): restructure and update infrastructure setup documentation
- Anglicized setup and facts markdown file names for better consistency.

- Updated 01-swarm-init-multinode.md to highlight Ansible automation of Swarm initialization and labeling.

- Overhauled 03-infra-stack-changes.md to describe the single monolithic file strategy and reflect current Redis, RabbitMQ, and etcd cluster configurations.

- Fixed minor overrides and typos in Patroni templates and Ansible bootstrap documents.

- Restructured README and roadmap mapping to align with the renamed setup documents.
2026-06-15 16:42:18 +03:00
6f9d0d1588 feat(infra): Standardize StorageBox permissions and refactor DB stack name
- Ensure consistent directory and file permissions on StorageBox mounts for improved container access across application and database services.
- Introduce application-specific `storagebox_uid`/`gid` variables for more granular ownership control.
- Enhance StorageBox mount reliability by adding systemd reload and remount handlers for configuration changes.
- Add root credentials to Patroni's etcd configuration for authenticated communication.
- Update all relevant documentation and deployment scripts to use the `iklimco` Docker stack name for database services.
- Re-encrypt production vault secrets to include the new etcd password.
2026-05-23 18:11:01 +03:00
f23835a30a docs(config): Update template file paths to use 'template/' subdirectory
Reflects a clearer organization for SWAG configuration templates across all roadmap and setup documentation. This standardizes references to template files by explicitly including the `template/` subdirectory, improving clarity and distinction from generated configuration files.
2026-05-23 14:43:04 +03:00
ff9837ec54 feat(infra): update environment infrastructure configurations
- Synchronized environment-specific settings with the new isolated architecture.
- Updated network and storage definitions to match the latest Swarm stack requirements.
- Harmonized configuration templates for consistent cross-environment deployment.
2026-05-22 21:40:21 +03:00
e3787d80f6 docs(infra): align DB stack and APISIX production guidance
Update Environment_Infrastructure to match the current root stack conventions for database images, shared secret names, and APISIX real IP handling.

- update test Ansible DB image defaults to PostGIS 18/PostGIS 3.6 and MongoDB 8.3.2

- align Patroni configuration with DATABASE_POSTGRES_* secret variable names

- document APISIX real IP template configuration and Harbor rebuild workflow

- replace the separate DB stack env file guidance with the shared .env.secrets.shared flow

- update production setup and roadmap snippets to use current PostGIS, MongoDB, and APISIX rebuild commands
2026-05-20 19:55:49 +03:00
17be81a66e feat(db): align WireGuard DB access with standard ports
- switch WireGuard DB access defaults from proxy ports to 5432/27017

- remove obsolete db stack template for proxy-based DB access

- clean roadmap wording around deprecated DB proxy services
2026-05-19 17:47:23 +03:00
27f4f83f73 docs(prod): resolve cross-layer inconsistencies and complete prod env implementation
Ansible roles:
- act_runner/defaults: set act_runner_name to inventory_hostname (was
  hardcoded to iklim-test-app); added vault_gitea_runner_token to vault.yml
- prod/group_vars/all: restructured from flat files to all/ directory;
  added act_runner_labels override (prod-runner,ubuntu-24.04,hostname);
  added storagebox_managed_directories; added swarm_manager_ip and other
  prod-specific vars
- prod/roles/db_stack: prod-specific db_node tasks using StorageBox paths
  (/mnt/storagebox/db/...) instead of local paths
- docker/tasks: split firewalld loop into all-nodes (Swarm ports) and
  app-only (80/443) tasks
- swarm/tasks: added --advertise-addr private_ip to join commands for
  correct multi-homed node advertisement
- hardening/tasks: corrected firewalld drop zone configuration
- node_dirs/tasks: added /opt/iklimco/vault/data for Vault Raft volume
- db_stack/tasks/app_node: updated stale comment (removed pg-proxy reference)
- db_stack/templates: removed pg-proxy and mongo-proxy service blocks
- test/host_vars/iklim-app-01: added act_runner_name override to preserve
  existing test runner registration

Roadmap and setup docs:
- roadmap/03-infra-stack-changes: added replicas:0 for etcd/postgresql/
  mongodb/pg-proxy/mongo-proxy in prod overlay; updated placement table;
  fixed grafana/data mkdir (auto-created by Ansible); translated Turkish
  note to English
- roadmap/08-deploy-pipeline-update: updated stale "remains idle" note
  for standalone etcd (now disabled with replicas:0)
- roadmap/01-swarm-init-multinode: consistency fixes
- setup/06: added Outputs section and etcd firewall port documentation
- setup/07: removed prometheus/data from StorageBox acceptance criteria;
  replaced manual StorageBox mkdir section with Ansible auto-creation note;
  updated prod README section with full bootstrap instructions and vault docs;
  added act_runner_labels prod policy
- setup/08: extensive rewrite — aligned with Patroni etcd overlay DNS,
  corrected hcloud_firewall.app reference, updated all StorageBox paths
  from /prod/db/ to /db/
- setup/09: removed prometheus/data from acceptance criteria; updated
  runner label policy (removed docker/swarm-manager labels); added
  acceptance criterion for disabled services absent from docker service ls

Terraform:
- prod/firewall.tf: added missing DB subnet mutual rules (etcd, Patroni)
- prod/outputs.tf: added prod_floating_ip and prod_private_ips outputs
- prod/servers.tf: aligned placement group and naming
- prod/variables.tf: corrected variable descriptions
- prod/terraform.tfvars.example: updated defaults
- terraform/hetzner/README.md: new comprehensive README covering both
  test and prod environments with firewall tables and inventory instructions

ansible/README.md: expanded prod section with inventory groups, bootstrap
  run order, runner label policy, and vault variable documentation
2026-05-18 19:17:56 +03:00
8780c7c05e docs(db): implement direct cluster access strategy for production
- Updated roadmap (03-infra-stack-changes.md) to deprecate database proxies in prod.
- Detailed direct subnet access via WireGuard for production developers.
- Provided multi-host connection parameters for Patroni and MongoDB Replica Sets in setup guide (08-prod-db-cluster-kurulum.md).
- Added environment comparison table to developer access guide.
2026-05-18 14:25:26 +03:00
2202e92c2c docs(roadmap): refactor large code block into smaller snippets
- Split single large bash block into four logical sections (Constants, WS, Auth, Global).
- Improved Obsidian rendering compatibility by removing inline explanatory comments from code blocks.
- Enhanced readability with bold subsection headers.
2026-05-18 10:48:09 +03:00
fb80388b97 docs(roadmap): fix bash syntax in rate limit redis snippet
- Removed unnecessary backslashes from RATE_REDIS string definition.
- Optimized quote structure for valid JSON interpolation.
2026-05-18 10:46:18 +03:00
b230d1575e docs(roadmap): include complete rate and connection limit constants
- Expanded the header constants example to cover Global, Auth, and WebSocket limits.
- Provided detailed snippets for using constants in WS_PLUGINS and AUTH_LIMIT blocks.
- Reinforced maintainability standards for the APISIX initialization script.
2026-05-18 10:34:08 +03:00
0302b2e22a docs(roadmap): standardize rate limit configuration with constants
- Recommended defining rate limit thresholds as header constants in init.sh.
- Updated documentation snippet to use GLOBAL_LIMIT_COUNT and GLOBAL_LIMIT_WINDOW variables.
- Improved maintainability for future rate limit adjustments.
2026-05-18 10:23:57 +03:00
507e3a11a1 docs(roadmap): add rabbitmq network aliases for consistent hashing
- Configured 'iklimco-net' aliases for RabbitMQ nodes in prod overlay documentation.
- Updated Step 6 and Step 8 stack snippets to include network aliases and definitions.
- Added a technical note to Step 7 explaining DNS requirements for sticky sessions.
2026-05-18 10:22:14 +03:00
52bd6a59ac docs(roadmap): add sticky session plan for rabbitmq websocket upstream
- Implemented Consistent Hashing (chash) logic in APISIX upstream configuration for prod.
- Added instructions for real IP detection in APISIX configuration template.
- Documented the bypass of Swarm VIP for better session persistence on RabbitMQ nodes.
2026-05-18 09:46:51 +03:00
4c3b7faad6 docs(roadmap): update production environment roadmap and setup guides
- Documented infrastructure changes for Redis Sentinel and RabbitMQ clustering.
- Updated setup guides for Terraform, Ansible, and Swarm node recovery.
- Clarified APISIX rate limit policy and degradation settings.
2026-05-17 18:54:44 +03:00
fd6a0b4f46 docs: fix roadmap inconsistencies between test-env and prod-env
Corrects six documentation files to match the actual deployed pipeline
behavior and align test/prod approaches where they share the same code.

prod-env/02-godaddy-credentials.md
- Step 1: correct secret file from .env.secrets.shared to .env.secrets.swag;
  add clarifying note that .env.secrets.shared holds AppRole/DB secrets
  and must not be used for GoDaddy credentials.
- Step 4: document that GoDaddy A records are now managed automatically
  by the pipeline's 'Update DNS Records' step via the GoDaddy API;
  reference the Gitea variable PROD_FLOATING_IP that must be set once.

prod-env/08-deploy-pipeline-update.md
- Add Step 2 documenting the new 'Update DNS Records' pipeline step
  (GoDaddy API, idempotent check-before-update, requires jq and
  vars.PROD_FLOATING_IP).
- Renumber subsequent steps 3-8 to accommodate the new step.
- Fix DB hostnames in Step 7 (Run Database Init Scripts) from
  iklimco_postgresql/iklimco_mongodb to postgresql/mongodb, matching
  how Swarm overlay DNS resolves service names inside iklimco-net.
- Update context block: correct DB hostname description, replace
  outdated storagebox path note with env-var approach, list new steps.
- Update final step order to 24 steps including the DNS step and
  Release Deploy Lock; mark Wait for etcd as NEW.

prod-env/09-verify.md
- Insert check #2 for the precipitation image directory
  (/mnt/storagebox/precipitation/images) and iklimco_image-data volume
  bind mount, mirroring the equivalent check in test-env/08-verify.md.
- Renumber all subsequent checks (3-12) to maintain sequential ordering.

test-env/03-infra-stack-changes.md
- Update SWAG service volume snippet: replace hardcoded paths
  (swag-vl:/config, /opt/iklimco/swag/dns-conf, /opt/iklimco/swag/site-confs)
  with env-var forms (${SWAG_CONFIG_DIR:-swag-vl}, ${SWAG_DNS_CONF_DIR:-...},
  ${SWAG_SITE_CONFS_DIR:-...}) to match docker-stack-infra.yml.
- Update cert-reloader volume snippet: replace swag-vl and /opt/iklimco/ssl
  with ${SWAG_CONFIG_DIR:-swag-vl} and ${SWAG_CERT_DIR:-/opt/iklimco/ssl},
  enabling StorageBox override in prod without changing the base file.

test-env/04-swag-nginx-configs.md
- Replace RESTRICTED_IP_1/RESTRICTED_IP_2 individual env vars with
  RESTRICTED_IPS (comma-separated CIDR list) in the required-vars section,
  matching env-test/.env and the actual pipeline.
- Update all three IP-restricted template examples (apigw, rabbitmq,
  grafana) from allow ${RESTRICTED_IP_1}; allow ${RESTRICTED_IP_2}; to
  ${RESTRICTED_IPS_BLOCK}, matching the actual .conf.tpl files in the repo.
- Rewrite the deploy step section to match the real pipeline: docker run
  alpine for file writing, RESTRICTED_IPS_BLOCK generation via sed, and
  envsubst with explicit SWAG_VARS filter to protect nginx $upstream_* vars.

test-env/07-deploy-pipeline-update.md
- Step 2 (Prepare SWAG Directories): replace sudo-tee approach with the
  actual docker-run-alpine method used in deploy-test.yml; add nginx
  reload block; update notes to reflect RESTRICTED_IPS_BLOCK generation.
- Step 4 (Re-order): correct step numbering to match actual pipeline
  (21 steps); mark 'Wait for etcd' as already present in pipeline rather
  than a new addition; add Bootstrap Vault TLS Placeholder which was
  missing from the documented order.
2026-05-16 16:52:48 +03:00
5ddba7eba4 docs: update production roadmap for HA Vault and shared storage
- Refactor production setup documentation to reflect a 3-node Vault Raft cluster starting from launch.
- Update all paths to use StorageBox mounts for shared state (SWAG config, TLS certs, Monitoring data).
- Switch Nginx configuration convention from proxy-confs to site-confs to align with SWAG's auto-include behavior.
- Standardize TLS private key extensions to .pem.
- Update node failover and recovery facts to include monitoring services.
- Align deployment pipeline instructions with the latest environment variable-driven approach.
2026-05-16 16:18:21 +03:00
f4b7f49968 chore: prepare prod ansible and db operations
Add the Ansible README and expand prod bootstrap coverage for StorageBox keys, DB labels, DB stack configuration, and act runner setup. Update MongoDB configuration for replica set support and refresh prod roadmap/setup documentation for Swarm labels, StorageBox-backed cert paths, and recovery guidance.
2026-05-15 20:39:57 +03:00
49ea69d805 feat: provision precipitation storage directory
Create managed StorageBox directories from Ansible and document the precipitation image bind mount required by the test Swarm deployment.
2026-05-14 19:14:53 +03:00
ed51b6eedd feat(vpn): add WireGuard and DB proxy services for secure management
- Add new Ansible role `wireguard` to set up WireGuard VPN server on
  DB node with key generation, firewalld rules, and client peer config.
- Introduce `pg-proxy` and `mongo-proxy` socat containers in db_stack
  to expose PostgreSQL (15432) and MongoDB (17017) on host ports,
  restricted to WireGuard subnet via firewalld.
- Update test environment group_vars with WireGuard client entry for
  `murat-inspiron-15-3525`.
- Modify act_runner config: set `docker_host` to unix socket, remove
  explicit socket mount from options, and change runner label image to
  `catthehacker/ubuntu:act-22.04`.
- Open UDP port 51820 in Hetzner firewall for WireGuard inbound.
- Adjust test-db-post-stack playbook to include wireguard role (tagged).
- Update roadmap document with APISIX init step order.
2026-05-13 18:50:07 +03:00
5fe57ee108 Implement: Declarative act_runner configuration and Docker integration
Migrates `act_runner` configuration from shell-generated to an Ansible-templated `config.yaml`. This enables:
- Dynamic label provisioning, including `test-runner:docker://ubuntu:22.04`.
- Explicit configuration for joining the `iklimco-net` overlay network.
- Docker socket mounting for CI/CD jobs to interact with the Docker daemon.

Updates `setup/05-test-runner-ve-deploy-onkosullari.md` and other related documentation to reflect the new automated and integrated runner setup.
2026-05-12 19:49:24 +03:00
bf8f011e43 Restructure setup documentation and refine environment bootstrapping
This commit introduces a reordered and renumbered set of setup documentation files to better reflect the deployment stages for both test and production environments.

Key changes include:
*   A new `setup-vs-roadmap-map.md` file to provide a clear mapping between roadmap tasks and their corresponding setup phases.
*   Significantly expanded Ansible bootstrap documentation for both test and production, detailing Docker, Swarm, security hardening, and StorageBox SSH key management roles.
*   Formalized database Docker and Swarm cluster setup instructions for test and production, including explicit steps for Swarm worker integration of DB nodes.
*   Updated roadmap documentation (`roadmap/prod-env/*`) to align with the refined setup, incorporating correct private IP addresses for Swarm joins, new node labels, and floating IP usage for GoDaddy DNS records.
2026-05-11 17:47:30 +03:00
76f87aa2f9 Integrate DB nodes into Swarm and refine prod service deployment
- Database nodes now join the Docker Swarm as workers with `role=db` labels, allowing Swarm to manage their dedicated services.
- The `docker-stack-infra.yml` has been updated for production to focus solely on application-level infrastructure components.
- Dedicated database services (PostgreSQL, MongoDB, Patroni-etcd) are now explicitly deployed in separate Swarm stacks on `iklim-db-XX` nodes.
- Standardizes node naming conventions (`iklim-app-XX`, `iklim-db-XX`) across the production roadmap documentation.
- Clarifies that the `etcd` service within `docker-stack-infra.yml` is exclusively for APISIX configuration, distinct from Patroni's etcd cluster.
2026-05-11 14:53:21 +03:00
81c38e8d39 initial commit 2026-05-09 16:26:06 +03:00