Murat ÖZDEMİR 5ddba7eba4 docs: update production roadmap for HA Vault and shared storage
- Refactor production setup documentation to reflect a 3-node Vault Raft cluster starting from launch.
- Update all paths to use StorageBox mounts for shared state (SWAG config, TLS certs, Monitoring data).
- Switch Nginx configuration convention from proxy-confs to site-confs to align with SWAG's auto-include behavior.
- Standardize TLS private key extensions to .pem.
- Update node failover and recovery facts to include monitoring services.
- Align deployment pipeline instructions with the latest environment variable-driven approach.
2026-05-16 16:18:21 +03:00


08 — Verification Checklist (Test)

Context

Run these checks after a successful pipeline deployment to the test environment.

1 — Swarm services are up

docker service ls --filter label=project=co.iklim

All services should show REPLICAS 1/1.
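The replica check can also be scripted rather than read by eye. The following is a minimal sketch, assuming the listing is produced with `--format '{{.Name}} {{.Replicas}}'`; the helper name `replica_mismatches` is hypothetical and not part of the deployment pipeline.

```shell
# Hypothetical helper: reads "NAME RUNNING/DESIRED" lines on stdin and prints
# every service whose running count differs from its desired count.
# Exits non-zero if at least one mismatch was found.
replica_mismatches() {
  awk '
    {
      split($2, r, "/")            # r[1] = running, r[2] = desired
      if (r[1] != r[2]) { print $1 " " $2; bad = 1 }
    }
    END { exit (bad ? 1 : 0) }
  '
}
```

Possible usage: `docker service ls --filter label=project=co.iklim --format '{{.Name}} {{.Replicas}}' | replica_mismatches`. No output and exit status 0 would mean every service is at its desired count.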

docker service ps iklimco_swag
docker service ps iklimco_cert-reloader
docker service ps iklimco_vault
docker service ps iklimco_apisix

Expected: no tasks in the Failed or Rejected state.

2 — Precipitation image directory exists

ls -ld /mnt/storagebox/precipitation/images

Expected: the directory exists with 0755 permissions (or stricter permissions approved for the service) before iklimco_precipitation-service is deployed.

docker volume inspect iklimco_image-data

Expected: Options.device is /mnt/storagebox/precipitation/images.
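Both expectations in this step can be checked mechanically. The sketch below covers the "0755 or stricter" rule; it assumes the mode is read with GNU `stat -c '%a'`, and the helper name `mode_within_0755` is hypothetical.

```shell
# Hypothetical helper: succeeds when MODE (octal, as printed by `stat -c '%a'`)
# grants no permission bit beyond 0755, i.e. it is 0755 or stricter.
mode_within_0755() {
  mode=$1
  # Keep only the rwx bits, then test for any bit that 0755 does not allow.
  [ $(( 0$mode & 0777 & ~0755 )) -eq 0 ]
}
```

Possible usage: `mode_within_0755 "$(stat -c '%a' /mnt/storagebox/precipitation/images)"`. For the volume, `docker volume inspect --format '{{ .Options.device }}' iklimco_image-data` should print only the device path, which can then be compared with the expected string.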

3 — SWAG obtained the cert

docker exec $(docker ps -q -f name=iklimco_swag) \
  certbot certificates

Expected: certificate for *.iklim.co, VALID: XX days.

docker exec $(docker ps -q -f name=iklimco_swag) \
  ls /config/etc/letsencrypt/live/iklim.co/

Expected: fullchain.pem, privkey.pem, cert.pem, chain.pem.
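To avoid comparing the listing by hand, the four expected file names can be checked in a loop. This is only a sketch; `missing_cert_files` is a hypothetical helper name, and it assumes the listing arrives one name per line (which is how `ls` prints when its output is piped).

```shell
# Hypothetical helper: reads a directory listing on stdin (one name per line)
# and prints each expected certificate file that is absent from it.
missing_cert_files() {
  listing=$(cat)
  for f in fullchain.pem privkey.pem cert.pem chain.pem; do
    # -x matches whole lines only, -F treats the name as a literal string.
    printf '%s\n' "$listing" | grep -qxF "$f" || echo "$f"
  done
}
```

Possible usage: pipe the output of the `ls` command above into `missing_cert_files`; empty output would mean all four files are present.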

4 — Nginx config is valid

docker exec $(docker ps -q -f name=iklimco_swag) nginx -t

Expected: syntax is ok and test is successful.

5 — Public API endpoint

curl -si https://api-test.iklim.co/health

Expected: an HTTP 2xx status, or at least a response generated by APISIX (not a certificate error and not a 502).
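The pass/fail rule for this endpoint can be written down as a small classifier. This is a sketch, assuming the status code is captured with `curl -s -o /dev/null -w '%{http_code}'`; the helper name `health_status_ok` is hypothetical.

```shell
# Hypothetical helper: classifies the status code printed by
#   curl -s -o /dev/null -w '%{http_code}' https://api-test.iklim.co/health
# curl prints 000 when no HTTP response was received, for example after a
# TLS or connection failure.
health_status_ok() {
  case "$1" in
    000) echo "no HTTP response (TLS or connection error)"; return 1 ;;
    502) echo "502 Bad Gateway: the proxy could not reach its upstream"; return 1 ;;
    2??) echo "ok: $1"; return 0 ;;
    *)   echo "non-2xx response, confirm it was generated by APISIX: $1"; return 0 ;;
  esac
}
```

Possible usage: `health_status_ok "$(curl -s -o /dev/null -w '%{http_code}' https://api-test.iklim.co/health)"`.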

TLS cert check:

echo | openssl s_client -connect api-test.iklim.co:443 -servername api-test.iklim.co 2>/dev/null \
  | openssl x509 -noout -subject -dates

Expected: the subject CN is *.iklim.co, notBefore is in the past, and notAfter is in the future.
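The "notAfter is later than today" comparison can be turned into a number of remaining days. The sketch below assumes GNU `date`, which can parse the timestamp format OpenSSL prints; `days_until_expiry` is a hypothetical helper name.

```shell
# Hypothetical helper: converts a "notAfter=..." line, as printed by
# `openssl x509 -noout -enddate`, into whole days from now.
# A negative result means the certificate has already expired.
# Assumes GNU date for the -d option.
days_until_expiry() {
  end=${1#notAfter=}                       # strip the "notAfter=" prefix
  end_s=$(date -d "$end" +%s) || return 1  # expiry as seconds since the epoch
  now_s=$(date +%s)
  echo $(( (end_s - now_s) / 86400 ))
}
```

Possible usage: replace `-subject -dates` with `-enddate` in the command above and pass the resulting line as the first argument.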

6 — IP-restricted subdomains block non-whitelisted IPs

From a non-whitelisted IP:

curl -si https://grafana-test.iklim.co

Expected: HTTP 403.

From a whitelisted IP (78.187.87.109 or 95.70.151.248):

curl -si https://grafana-test.iklim.co

Expected: HTTP 200 (Grafana login page).

7 — Vault is reachable internally (not externally)

From outside the server:

curl -sk https://vault.iklim.co:8200/v1/sys/health
# or
curl -sk https://<server-public-ip>:8200/v1/sys/health

Expected: connection refused or timeout — Vault must not be reachable externally.
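Because `curl -s` prints nothing when the connection fails, the exit status is the useful signal here. A minimal sketch, in which the helper name `vault_port_blocked` and the 10-second limit are assumptions:

```shell
# Hypothetical helper: interprets the exit status of a curl probe run from
# outside the server. curl exits with 7 when it fails to connect (for example
# connection refused) and with 28 when the operation times out; either one
# means the port is not reachable.
vault_port_blocked() {
  case "$1" in
    7|28) return 0 ;;
    *)    return 1 ;;
  esac
}
```

Possible usage: `curl -sk --max-time 10 -o /dev/null https://vault.iklim.co:8200/v1/sys/health; vault_port_blocked $? && echo blocked`. Adding `--max-time` keeps a silently dropped connection from hanging the check.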

From inside the Swarm (exec into any service container):

docker exec $(docker ps -q -f name=iklimco_apisix | head -1) \
  curl -sk https://vault.iklim.co:8200/v1/sys/health

Expected: JSON response {"sealed":false,...}.
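The seal state can be tested without a JSON parser. This is a sketch only; `vault_unsealed` is a hypothetical helper name, and a JSON-aware tool would be more robust if one is available in the container.

```shell
# Hypothetical helper: reads the /v1/sys/health JSON body on stdin and
# succeeds only if it reports "sealed": false.
vault_unsealed() {
  grep -Eq '"sealed"[[:space:]]*:[[:space:]]*false'
}
```

Possible usage: pipe the output of the `docker exec ... curl` command above into `vault_unsealed`. Note that by default Vault answers this endpoint with a non-200 status on standby nodes (429) and sealed nodes (503), so the body is worth inspecting even when the status code is not 200.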

8 — cert-reloader is watching

docker service logs iklimco_cert-reloader --tail 10

Expected: [cert-reloader] started — no errors.

9 — Vault cert path is correct

VAULT_CTR=$(docker ps -q -f name=iklimco_vault)
docker exec "$VAULT_CTR" ls /vault/certs/

Expected: STAR.iklim.co.full.crt and STAR.iklim.co_key.pem.

10 — fail2ban is active (SWAG)

docker exec $(docker ps -q -f name=iklimco_swag) \
  fail2ban-client status

Expected: a jail list that includes nginx-http-auth and nginx-botsearch, among others.
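A specific jail can be looked up in the status output as follows. This is a sketch that assumes the usual `Jail list:` line with comma-separated names; `has_jail` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `fail2ban-client status` output on stdin and
# succeeds if the jail named in $1 appears in the "Jail list:" line.
has_jail() {
  awk -v want="$1" '
    /Jail list:/ {
      sub(/^.*Jail list:[ \t]*/, "")          # drop everything up to the names
      n = split($0, jails, /,[ \t]*/)         # names are comma-separated
      for (i = 1; i <= n; i++) {
        gsub(/[ \t]+$/, "", jails[i])         # trim trailing whitespace
        if (jails[i] == want) found = 1
      }
    }
    END { exit (found ? 0 : 1) }
  '
}
```

Possible usage: pipe the `fail2ban-client status` command above into `has_jail nginx-http-auth`.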

11 — No services have published unexpected ports

docker service ls --format "{{.Name}}\t{{.Ports}}" \
  --filter label=project=co.iklim

Only iklimco_swag should have published ports (*:80->80, *:443->443). All other services should show an empty ports field.
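The tab-separated output of the command above lends itself to an automated check. A minimal sketch, with `unexpected_ports` as a hypothetical helper name:

```shell
# Hypothetical helper: reads "NAME<TAB>PORTS" lines on stdin and prints every
# service other than iklimco_swag that publishes at least one port.
# Exits non-zero if any such service was found.
unexpected_ports() {
  awk -F '\t' '
    $1 != "iklimco_swag" && $2 != "" { print; bad = 1 }
    END { exit (bad ? 1 : 0) }
  '
}
```

Possible usage: pipe the `docker service ls --format ...` command above into `unexpected_ports`; no output and exit status 0 would mean only iklimco_swag publishes ports.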