2026-05-09 16:26:06 +03:00

# 09 — Verification Checklist (Prod)
## Context
Run after a successful prod pipeline deployment.
## 1 — Swarm cluster health
```bash
docker node ls
```
Expected: 3 managers (MANAGER STATUS: one `Leader`, two `Reachable`) and 3 workers, all with STATUS `Ready`.
```bash
docker service ls --filter label=project=co.iklim
```
Expected: every service shows `REPLICAS X/X` (running count equals the target).
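The replica check can also be scripted rather than read by eye. A minimal sketch, assuming the output is reformatted to `<name> <running>/<desired>` per line; `check_replicas` is a hypothetical helper name, not part of the deployment:

```bash
# Reads "<name> <running>/<desired>" lines on stdin; prints every service whose
# running count differs from the target and returns non-zero if any are found.
check_replicas() {
  awk '
    {
      split($2, r, "/")
      if (r[1] != r[2]) { print "NOT CONVERGED: " $1 " (" $2 ")"; bad = 1 }
    }
    END { exit bad + 0 }
  '
}

# Usage (on a manager node):
#   docker service ls --filter label=project=co.iklim \
#     --format '{{.Name}} {{.Replicas}}' | check_replicas
```

A non-zero exit status makes the helper usable as a gate in a post-deploy script.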
## 2 — SWAG cert is valid
```bash
docker exec $(docker ps -q -f name=iklimco_swag) certbot certificates
```
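Run this on the node hosting the SWAG task, since `docker ps` only lists containers on the local node.
Expected: `*.iklim.co`, `VALID: XX days` (Let's Encrypt, not the old manual cert).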
TLS check from outside:
```bash
echo | openssl s_client -connect api.iklim.co:443 -servername api.iklim.co 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates
```
Expected: `CN=*.iklim.co`, a Let's Encrypt issuer, and `notAfter` later than 2026-07-15 (the expiry date of the old manual cert).
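The `notAfter` comparison can be automated as well. A rough sketch, assuming GNU `date` (for `-d` parsing) is available; `cert_outlives` is a hypothetical helper name, and the default cutoff is the old cert's expiry date taken from this checklist:

```bash
# Reads `openssl x509 -noout -dates` output on stdin and succeeds only if the
# notAfter day is strictly later than the cutoff (YYYY-MM-DD).
# Assumption: GNU date, which can parse OpenSSL's "Mon DD HH:MM:SS YYYY GMT".
cert_outlives() {
  cutoff="${1:-2026-07-15}"
  not_after=$(sed -n 's/^notAfter=//p')
  [ -n "$not_after" ] || return 2          # no notAfter line found
  expiry_day=$(date -u -d "$not_after" +%Y%m%d) || return 2
  [ "$expiry_day" -gt "$(printf '%s' "$cutoff" | tr -d '-')" ]
}

# Usage:
#   echo | openssl s_client -connect api.iklim.co:443 -servername api.iklim.co 2>/dev/null \
#     | openssl x509 -noout -dates | cert_outlives && echo "cert OK"
```

Comparing whole days rather than timestamps means a cert expiring on the cutoff day itself is rejected, which matches the intent of ruling out the old cert.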
## 3 — Public API
```bash
curl -si https://api.iklim.co/health
```
Expected: HTTP 2xx, no TLS errors.
## 4 — IP restriction working
From a non-whitelisted IP:
```bash
curl -si https://grafana.iklim.co
curl -si https://apigw.iklim.co
curl -si https://rabbitmq.iklim.co
```
Expected: HTTP 403 for all three.
From a whitelisted IP (78.187.87.109 or 95.70.151.248):
```bash
curl -si https://grafana.iklim.co # HTTP 200 Grafana
curl -si https://apigw.iklim.co # HTTP 200 APISIX Dashboard
curl -si https://rabbitmq.iklim.co # HTTP 200 RabbitMQ Management
```
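To compare status codes across the three hosts without scanning full headers, the status line can be extracted. A small sketch; `http_status` is a hypothetical helper name and the loop in the comment is only one way to use it:

```bash
# Extracts the numeric status code from the first line of `curl -si` output
# (works for both "HTTP/1.1 403 Forbidden" and "HTTP/2 403" status lines).
http_status() {
  tr -d '\r' | awk 'NR == 1 { print $2 }'
}

# Usage, from a non-whitelisted IP (expect 403 for each host):
#   for host in grafana apigw rabbitmq; do
#     code=$(curl -si "https://$host.iklim.co" | http_status)
#     if [ "$code" = 403 ]; then echo "ok   $host"; else echo "FAIL $host ($code)"; fi
#   done
```

From a whitelisted IP the same loop applies with 200 as the expected code.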
## 5 — Vault not reachable externally
```bash
# From outside — must fail
curl -sk --connect-timeout 5 https://<service-1-public-ip>:8200/v1/sys/health
# Expected: connection refused or timeout
```
```bash
# From inside overlay — must succeed
docker exec $(docker ps -q -f name=iklimco_apisix | head -1) \
  curl -sk https://vault.iklim.co:8200/v1/sys/health
# Expected: {"sealed":false,...}
```
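The seal state can be checked mechanically from the health response. A minimal sketch that uses `grep` rather than a JSON parser, on the assumption that `jq` may not be present in the container or on the host; `vault_unsealed` is a hypothetical helper name:

```bash
# Reads the /v1/sys/health JSON body on stdin and succeeds only if it
# contains "sealed": false (whitespace around the colon is tolerated).
vault_unsealed() {
  grep -Eq '"sealed"[[:space:]]*:[[:space:]]*false'
}

# Usage (from inside the overlay network):
#   docker exec "$(docker ps -q -f name=iklimco_apisix | head -1)" \
#     curl -sk https://vault.iklim.co:8200/v1/sys/health | vault_unsealed \
#     && echo "vault unsealed"
```

An empty body (for example when the request fails) also makes the helper return non-zero, so a connection failure is not mistaken for a healthy Vault.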
## 6 — cert-reloader watching
```bash
docker service logs iklimco_cert-reloader --tail 5
```
Expected: `[cert-reloader] started`, no errors.
## 7 — No unexpected published ports
```bash
docker service ls --format "{{.Name}}\t{{.Ports}}" \
  --filter label=project=co.iklim
```
Only `iklimco_swag` should show `*:80->80/tcp, *:443->443/tcp`.
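The tab-separated output of the command above lends itself to a filter. A sketch under the assumption that `iklimco_swag` is the only service allowed to publish ports; `unexpected_ports` is a hypothetical helper name:

```bash
# Reads "<name><TAB><ports>" lines on stdin; prints any service other than the
# allowed one that publishes ports and returns non-zero if any are found.
unexpected_ports() {
  awk -F '\t' -v allowed="${1:-iklimco_swag}" '
    $2 != "" && $1 != allowed { print "UNEXPECTED: " $1 " " $2; bad = 1 }
    END { exit bad + 0 }
  '
}

# Usage:
#   docker service ls --format '{{.Name}}\t{{.Ports}}' \
#     --filter label=project=co.iklim | unexpected_ports
```

The helper only checks which services publish ports, not which ports SWAG publishes; the `80` and `443` mappings still need a visual check.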
## 8 — DB nodes running correct services
```bash
docker service ps iklimco_postgres
docker service ps iklimco_mongo
```
Tasks should show node names matching `db-1`, `db-2`, or `db-3`.
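Placement can be verified by listing only the node names of running tasks. A minimal sketch, assuming node hostnames contain `db-1`, `db-2`, or `db-3` as stated above; `off_db_nodes` is a hypothetical helper name:

```bash
# Reads task node names on stdin (one per line); prints any name that does not
# contain db-1, db-2, or db-3 and returns non-zero if any are found.
off_db_nodes() {
  # grep -v exits 0 when it printed at least one non-matching line
  if grep -Ev 'db-[123]'; then return 1; fi
  return 0
}

# Usage:
#   docker service ps iklimco_postgres --filter desired-state=running \
#     --format '{{.Node}}' | off_db_nodes
#   docker service ps iklimco_mongo --filter desired-state=running \
#     --format '{{.Node}}' | off_db_nodes
```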
## 9 — APISIX replicas
```bash
docker service ps iklimco_apisix
```
Expected: 2 tasks, both `Running`, on different nodes.
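The three conditions (task count, state, distinct nodes) can be checked in one pass. A sketch assuming the task list is reformatted to `<node> <current state>` per line; `spread_ok` is a hypothetical helper name:

```bash
# Reads "<node> <current state...>" lines on stdin; succeeds only if there are
# exactly `want` tasks, all in state Running, each on a distinct node.
spread_ok() {
  awk -v want="${1:-2}" '
    {
      tasks++
      if ($2 == "Running") running++
      if (!seen[$1]++) nodes++
    }
    END { exit !(tasks == want && running == want && nodes == want) }
  '
}

# Usage:
#   docker service ps iklimco_apisix --filter desired-state=running \
#     --format '{{.Node}} {{.CurrentState}}' | spread_ok 2
```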
## 10 — fail2ban active
```bash
docker exec $(docker ps -q -f name=iklimco_swag) fail2ban-client status
```
Expected: multiple jails listed.
## 11 — Microservice health (post-deploy)
After microservices are deployed (separate pipeline), verify via the public API:
```bash
# Quote the URL: an unquoted & would background curl and drop the lon parameter
curl -si 'https://api.iklim.co/v1/weather/current?lat=39&lon=35'
```
Expected: valid JSON weather response.
## ⚠️ Old cert expiry reminder
The manually managed `*.iklim.co` cert expires **2026-07-15**.
SWAG's Let's Encrypt cert auto-renews every ~60 days.
Once the first SWAG-issued cert is confirmed valid, the manual cert in storagebox is no longer used
and can be archived.