Murat ÖZDEMİR fd6a0b4f46 docs: fix roadmap inconsistencies between test-env and prod-env
Corrects six documentation files to match the actual deployed pipeline
behavior and align test/prod approaches where they share the same code.

prod-env/02-godaddy-credentials.md
- Step 1: correct secret file from .env.secrets.shared to .env.secrets.swag;
  add clarifying note that .env.secrets.shared holds AppRole/DB secrets
  and must not be used for GoDaddy credentials.
- Step 4: document that GoDaddy A records are now managed automatically
  by the pipeline's 'Update DNS Records' step via the GoDaddy API;
  reference the Gitea variable PROD_FLOATING_IP that must be set once.

prod-env/08-deploy-pipeline-update.md
- Add Step 2 documenting the new 'Update DNS Records' pipeline step
  (GoDaddy API, idempotent check-before-update, requires jq and
  vars.PROD_FLOATING_IP).
- Renumber subsequent steps 3-8 to accommodate the new step.
- Fix DB hostnames in Step 7 (Run Database Init Scripts) from
  iklimco_postgresql/iklimco_mongodb to postgresql/mongodb, matching
  how Swarm overlay DNS resolves service names inside iklimco-net.
- Update context block: correct DB hostname description, replace
  outdated storagebox path note with env-var approach, list new steps.
- Update final step order to 24 steps including the DNS step and
  Release Deploy Lock; mark Wait for etcd as NEW.

prod-env/09-verify.md
- Insert check #2 for the precipitation image directory
  (/mnt/storagebox/precipitation/images) and iklimco_image-data volume
  bind mount, mirroring the equivalent check in test-env/08-verify.md.
- Renumber all subsequent checks (3-12) to maintain sequential ordering.

test-env/03-infra-stack-changes.md
- Update SWAG service volume snippet: replace hardcoded paths
  (swag-vl:/config, /opt/iklimco/swag/dns-conf, /opt/iklimco/swag/site-confs)
  with env-var forms (${SWAG_CONFIG_DIR:-swag-vl}, ${SWAG_DNS_CONF_DIR:-...},
  ${SWAG_SITE_CONFS_DIR:-...}) to match docker-stack-infra.yml.
- Update cert-reloader volume snippet: replace swag-vl and /opt/iklimco/ssl
  with ${SWAG_CONFIG_DIR:-swag-vl} and ${SWAG_CERT_DIR:-/opt/iklimco/ssl},
  enabling StorageBox override in prod without changing the base file.

test-env/04-swag-nginx-configs.md
- Replace RESTRICTED_IP_1/RESTRICTED_IP_2 individual env vars with
  RESTRICTED_IPS (comma-separated CIDR list) in the required-vars section,
  matching env-test/.env and the actual pipeline.
- Update all three IP-restricted template examples (apigw, rabbitmq,
  grafana) from allow ${RESTRICTED_IP_1}; allow ${RESTRICTED_IP_2}; to
  ${RESTRICTED_IPS_BLOCK}, matching the actual .conf.tpl files in the repo.
- Rewrite the deploy step section to match the real pipeline: docker run
  alpine for file writing, RESTRICTED_IPS_BLOCK generation via sed, and
  envsubst with explicit SWAG_VARS filter to protect nginx $upstream_* vars.

test-env/07-deploy-pipeline-update.md
- Step 2 (Prepare SWAG Directories): replace sudo-tee approach with the
  actual docker-run-alpine method used in deploy-test.yml; add nginx
  reload block; update notes to reflect RESTRICTED_IPS_BLOCK generation.
- Step 4 (Re-order): correct step numbering to match actual pipeline
  (21 steps); mark 'Wait for etcd' as already present in pipeline rather
  than a new addition; add Bootstrap Vault TLS Placeholder which was
  missing from the documented order.
2026-05-16 16:52:48 +03:00


09 — Verification Checklist (Prod)

Context

Run after a successful prod pipeline deployment.

1 — Swarm cluster health

docker node ls

Expected: 3 managers (Leader + 2 Reachable) for iklim-app-01/02/03, 3 workers (Ready) for iklim-db-01/02/03.
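The expected layout above can also be checked mechanically. The following is a minimal sketch, not part of the pipeline: `check_nodes` is a hypothetical helper that reads one node per line as `<hostname> <status> <manager-status>` (manager status is empty for workers) and assumes the 3 managers + 3 workers layout described above.

```shell
# Sketch only: check_nodes is a hypothetical helper name.
# Input on stdin: "<hostname> <status> <manager-status>", one node per line.
# Workers have an empty manager-status column.
check_nodes() {
  awk '
    $2 == "Ready" && ($3 == "Leader" || $3 == "Reachable") { managers++ }
    $2 == "Ready" && $3 == ""                              { workers++ }
    $3 == "Leader"                                         { leaders++ }
    END {
      printf "managers=%d workers=%d leaders=%d\n", managers, workers, leaders
      # Exit 0 only for 3 ready managers (exactly one Leader) and 3 ready workers.
      exit !(managers == 3 && workers == 3 && leaders == 1)
    }'
}
```

Possible usage on a manager node: `docker node ls --format '{{.Hostname}} {{.Status}} {{.ManagerStatus}}' | check_nodes`.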

docker service ls --filter label=project=co.iklim

Expected: all services show REPLICAS X/X (target met).
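To avoid reading the REPLICAS column by eye, a small filter can list only the services that have not reached their target. This is a sketch under the assumption that input lines look like `<name> <current>/<desired>`; `unmet_replicas` is a hypothetical helper name.

```shell
# Sketch only: unmet_replicas is a hypothetical helper name.
# Input on stdin: "<service-name> <current>/<desired>", one service per line.
# Prints the services whose current count differs from the desired count
# and exits non-zero if there is at least one.
unmet_replicas() {
  awk '{
    split($2, r, "/")
    if (r[1] != r[2]) { print $1; bad = 1 }
  } END { exit bad + 0 }'
}
```

Possible usage: `docker service ls --filter label=project=co.iklim --format '{{.Name}} {{.Replicas}}' | unmet_replicas`. No output and exit status 0 means every target is met.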

2 — Precipitation image directory exists

ls -ld /mnt/storagebox/precipitation/images

Expected: the directory exists. It must be created before iklimco_precipitation-service is deployed.

docker volume inspect iklimco_image-data

Expected: Options.device is /mnt/storagebox/precipitation/images.
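Both conditions in this check can be combined into one step. The sketch below assumes the directory path and volume name given above; `check_image_dir` is a hypothetical helper that takes the expected directory and the device reported by the volume.

```shell
# Sketch only: check_image_dir is a hypothetical helper name.
# $1 = directory that must exist, $2 = device reported by the volume.
check_image_dir() {
  dir=$1
  device=$2
  [ -d "$dir" ] || { echo "missing directory: $dir"; return 1; }
  [ "$device" = "$dir" ] || { echo "volume device is '$device', expected '$dir'"; return 1; }
  echo "ok"
}
```

Possible usage: `check_image_dir /mnt/storagebox/precipitation/images "$(docker volume inspect --format '{{ index .Options "device" }}' iklimco_image-data)"`.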

3 — SWAG cert is valid

docker exec $(docker ps -q -f name=iklimco_swag) certbot certificates

Expected: *.iklim.co, VALID: XX days (Let's Encrypt, not the old manual cert).

TLS check from outside:

echo | openssl s_client -connect api.iklim.co:443 -servername api.iklim.co 2>/dev/null \
  | openssl x509 -noout -subject -dates

Expected: CN=*.iklim.co, notAfter later than 2026-07-15 (the Let's Encrypt cert, not the expiring manual one).
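The date comparison can be scripted instead of read off the output. This is a minimal sketch that assumes GNU `date` (for the `-d` option) and input in the `notAfter=...` form printed by `openssl x509 -noout -enddate`; `expires_after` is a hypothetical helper name.

```shell
# Sketch only: expires_after is a hypothetical helper name.
# Assumes GNU date (-d). Reads one "notAfter=<date>" line on stdin and
# succeeds only if that date is strictly later than the cutoff in $1.
expires_after() {
  cutoff=$1
  read -r line
  not_after=${line#notAfter=}
  [ "$(date -u -d "$not_after" +%s)" -gt "$(date -u -d "$cutoff" +%s)" ]
}
```

Possible usage: pipe the `openssl s_client` output above into `openssl x509 -noout -enddate | expires_after 2026-07-15`.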

4 — Public API

curl -si https://api.iklim.co/health

Expected: HTTP 2xx, no TLS errors.

5 — IP restriction working

From a non-whitelisted IP:

curl -si https://grafana.iklim.co
curl -si https://apigw.iklim.co
curl -si https://rabbitmq.iklim.co

Expected: HTTP 403 for all three.

From a whitelisted IP (78.187.87.109 or 95.70.151.248):

curl -si https://grafana.iklim.co    # HTTP 200 Grafana
curl -si https://apigw.iklim.co      # HTTP 200 APISIX Dashboard
curl -si https://rabbitmq.iklim.co   # HTTP 200 RabbitMQ Management
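The two runs above differ only in the status code they expect, so the same loop can serve both. The following is a sketch, with `http_status` and `expect_status` as hypothetical helper names; it only looks at the status code and does not verify which application answered.

```shell
# Sketch only: http_status and expect_status are hypothetical helper names.
# Print just the HTTP status code for a URL ("000" if the request failed).
http_status() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# $1 = expected status code, remaining arguments = URLs to probe.
# Returns non-zero if any URL answers with a different code.
expect_status() {
  want=$1
  shift
  rc=0
  for url in "$@"; do
    got=$(http_status "$url")
    if [ "$got" = "$want" ]; then
      echo "ok   $url $got"
    else
      echo "FAIL $url got $got, want $want"
      rc=1
    fi
  done
  return $rc
}
```

Possible usage: run `expect_status 403 https://grafana.iklim.co https://apigw.iklim.co https://rabbitmq.iklim.co` from a non-whitelisted IP, and the same command with `200` from a whitelisted one.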

6 — Vault not reachable externally

# From outside — must fail
curl -sk --connect-timeout 5 https://<iklim-app-01-public-ip>:8200/v1/sys/health
# Expected: connection refused or timeout
# From inside overlay — must succeed
docker exec $(docker ps -q -f name=iklimco_apisix | head -1) \
  curl -sk https://vault.iklim.co:8200/v1/sys/health
# Expected: {"sealed":false,...}
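For the inside-overlay half of this check, the response body can be tested rather than inspected by eye. This is a minimal sketch that uses only `grep`, so it does not depend on a JSON tool being installed; `vault_unsealed` is a hypothetical helper name.

```shell
# Sketch only: vault_unsealed is a hypothetical helper name.
# Reads the /v1/sys/health response body on stdin and succeeds only
# if it reports "sealed": false.
vault_unsealed() {
  grep -q '"sealed"[[:space:]]*:[[:space:]]*false'
}
```

Possible usage: append `| vault_unsealed && echo "vault unsealed"` to the `docker exec ... curl` command above.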

7 — cert-reloader watching

docker service logs iklimco_cert-reloader --tail 5

Expected: [cert-reloader] started, no errors.

8 — No unexpected published ports

docker service ls --format "{{.Name}}\t{{.Ports}}" \
  --filter label=project=co.iklim

Only iklimco_swag should show *:80->80/tcp, *:443->443/tcp.
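The tab-separated output of the command above lends itself to a filter that prints only the violations. This sketch assumes iklimco_swag is the single service allowed to publish ports, as stated above; `unexpected_ports` is a hypothetical helper name.

```shell
# Sketch only: unexpected_ports is a hypothetical helper name.
# Input on stdin: "<service-name><TAB><ports>", one service per line.
# Prints every service other than iklimco_swag that publishes ports
# and exits non-zero if there is at least one.
unexpected_ports() {
  awk -F '\t' '
    $2 != "" && $1 != "iklimco_swag" { print $1 ": " $2; bad = 1 }
    END { exit bad + 0 }'
}
```

Possible usage: pipe the `docker service ls --format "{{.Name}}\t{{.Ports}}"` command above into `unexpected_ports`; empty output means the check passes.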

9 — DB nodes running correct services

# Patroni (PostgreSQL HA) stack
docker stack services iklim-patroni
docker service ps iklim-patroni_patroni-01
docker service ps iklim-patroni_patroni-02
docker service ps iklim-patroni_patroni-03

# etcd cluster (for Patroni)
docker stack services iklim-etcd

# MongoDB replica set
docker stack services iklim-db
docker service ps iklim-db_mongodb-01
docker service ps iklim-db_mongodb-02
docker service ps iklim-db_mongodb-03

All tasks should show node names matching iklim-db-01, iklim-db-02, or iklim-db-03 with placement constraint role=db.
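The node-name condition can be checked per service with a short filter. This is a sketch that assumes the three DB hostnames listed above; `all_on_db_nodes` is a hypothetical helper name, and it checks node names only, not the placement constraint itself.

```shell
# Sketch only: all_on_db_nodes is a hypothetical helper name.
# Input on stdin: one node name per task.
# Prints any node that is not iklim-db-01/02/03 and exits non-zero
# if such a node is found or if there are no tasks at all.
all_on_db_nodes() {
  awk '
    !/^iklim-db-0[1-3]$/ { print; bad = 1 }
    END { exit (bad || NR == 0) }'
}
```

Possible usage: `docker service ps --filter desired-state=running --format '{{.Node}}' iklim-patroni_patroni-01 | all_on_db_nodes`, repeated for each Patroni and MongoDB service.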

10 — APISIX replicas

docker service ps iklimco_apisix

Expected: 3 tasks, all Running, on different nodes.
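Counting running tasks and distinct nodes can be done in one pass. The sketch below assumes input lines of the form `<node> <current-state...>`, where the state starts with `Running` for healthy tasks; `apisix_spread` is a hypothetical helper name.

```shell
# Sketch only: apisix_spread is a hypothetical helper name.
# Input on stdin: "<node> <current-state...>", one task per line.
# Succeeds only if exactly 3 tasks are Running on 3 different nodes.
apisix_spread() {
  awk '$2 == "Running" { print $1 }' | sort | uniq -c | awk '
    { tasks += $1; nodes++ }
    END {
      printf "tasks=%d nodes=%d\n", tasks, nodes
      exit !(tasks == 3 && nodes == 3)
    }'
}
```

Possible usage: `docker service ps --format '{{.Node}} {{.CurrentState}}' iklimco_apisix | apisix_spread`.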

11 — fail2ban active

docker exec $(docker ps -q -f name=iklimco_swag) fail2ban-client status

Expected: multiple jails listed.
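"Multiple jails" can be turned into a numeric check by reading the jail count from the status output. This is a sketch that assumes the usual `Number of jail:` summary line printed by `fail2ban-client status`; `jail_count` is a hypothetical helper name.

```shell
# Sketch only: jail_count is a hypothetical helper name.
# Reads `fail2ban-client status` output on stdin and prints the number
# taken from the "Number of jail:" line.
jail_count() {
  awk -F ':' '/Number of jail/ { gsub(/[^0-9]/, "", $2); print $2 }'
}
```

Possible usage: `[ "$(docker exec $(docker ps -q -f name=iklimco_swag) fail2ban-client status | jail_count)" -ge 2 ]`.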

12 — Microservice health (post-deploy)

After microservices are deployed (separate pipeline), verify via the public API:

# Quote the URL: an unquoted & would end the command and drop lon=35
curl -si 'https://api.iklim.co/v1/weather/current?lat=39&lon=35'

Expected: valid JSON weather response.
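Whether the body is valid JSON can be tested directly. This is a minimal sketch that assumes `python3` is available on the machine running the check; `is_json` is a hypothetical helper name, and it validates syntax only, not the weather fields.

```shell
# Sketch only: is_json is a hypothetical helper name.
# Assumes python3 is installed. Succeeds only if stdin parses as JSON.
is_json() {
  python3 -m json.tool >/dev/null 2>&1
}
```

Possible usage, without `-i` so that response headers are not mixed into the body: `curl -s 'https://api.iklim.co/v1/weather/current?lat=39&lon=35' | is_json && echo "valid JSON"`.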

⚠️ Old cert expiry reminder

The manually managed *.iklim.co cert expires on 2026-07-15. SWAG's Let's Encrypt cert auto-renews roughly every 60 days. Once the first SWAG cert is confirmed valid, the manual cert in storagebox is no longer used and can be archived.