Environment_Infrastructure/roadmap/prod-env/08-deploy-pipeline-update.md
Murat ÖZDEMİR 5ddba7eba4 docs: update production roadmap for HA Vault and shared storage
- Refactor production setup documentation to reflect a 3-node Vault Raft cluster starting from launch.
- Update all paths to use StorageBox mounts for shared state (SWAG config, TLS certs, Monitoring data).
- Switch Nginx configuration convention from proxy-confs to site-confs to align with SWAG's auto-include behavior.
- Standardize TLS private key extensions to .pem.
- Update node failover and recovery facts to include monitoring services.
- Align deployment pipeline instructions with the latest environment variable-driven approach.
2026-05-16 16:18:21 +03:00

245 lines
9.9 KiB
Markdown

# 08 — Deploy Pipeline Update (Prod)
## Context
- **File:** `.gitea/workflows/deploy-prod.yml`
- Same changes as test pipeline (`test-env-setup/07-deploy-pipeline-update.md`),
adapted for prod paths and prod runner.
- **Prod-specific differences from test:**
- `SPRING_PROFILES_ACTIVE=prod` (not `test`) in Run APISIX Init
- DB hostnames use Swarm VIP prefixes: `iklimco_postgresql`, `iklimco_mongodb`
- Storagebox paths use `prod/` instead of `test/`
## Step 1 — Remove manual cert scp lines from `Initialize Servers`
```yaml
# DELETE from "Initialize Servers" step:
scp -P 23 ${{ vars.STORAGEBOX_USER }}@${{ vars.STORAGEBOX_USER }}.your-storagebox.de:prod/app/iklim.co/ssl/STAR.iklim.co.full.crt ./STAR.iklim.co.full.crt
scp -P 23 ${{ vars.STORAGEBOX_USER }}@${{ vars.STORAGEBOX_USER }}.your-storagebox.de:prod/app/iklim.co/ssl/STAR.iklim.co_key.pem ./STAR.iklim.co_key.pem
```
Also remove from `Prepare Init Files`:
```yaml
# DELETE or make conditional:
sudo cp STAR.iklim.co.full.crt STAR.iklim.co_key.pem /opt/iklimco/ssl/
```
## Step 2 — Add `Prepare SWAG Directories` step
Insert **before** `Bootstrap Vault TLS Placeholder`:
```yaml
- name: Prepare SWAG Directories
run: |
set -a; . ./.env; . ./.env.secrets.swag; set +a
mkdir -p "$SWAG_CONFIG_DIR" "$SWAG_DNS_CONF_DIR" "$SWAG_SITE_CONFS_DIR"
envsubst < swag/dns-conf/godaddy.ini.tpl | docker run --rm -i \
-v "${SWAG_DNS_CONF_DIR}:/output" \
alpine sh -c "cat > /output/godaddy.ini && chmod 600 /output/godaddy.ini"
echo "✅ godaddy.ini written"
export RESTRICTED_IPS_BLOCK="$(echo "$RESTRICTED_IPS" | tr ',' '\n' | sed 's|.*| allow &;|')"
SWAG_VARS='${API_SUBDOMAIN}${APIGW_SUBDOMAIN}${GRAFANA_SUBDOMAIN}${RABBITMQ_SUBDOMAIN}${RESTRICTED_IPS_BLOCK}'
for tpl in swag/site-confs/*.conf.tpl; do
fname=$(basename "${tpl%.tpl}")
envsubst "$SWAG_VARS" < "$tpl" | docker run --rm -i \
-v "${SWAG_SITE_CONFS_DIR}:/output" \
alpine sh -c "cat > /output/${fname}"
echo "✅ ${fname}"
done
cat swag/site-confs/default.conf | docker run --rm -i \
-v "${SWAG_SITE_CONFS_DIR}:/output" \
alpine sh -c "cat > /output/default.conf"
echo "✅ SWAG directories ready"
SWAG_CTR=$(docker ps -q -f name=iklimco_swag 2>/dev/null | head -1)
if [ -n "$SWAG_CTR" ]; then
docker exec "$SWAG_CTR" nginx -t && docker exec "$SWAG_CTR" nginx -s reload
echo "✅ SWAG nginx reloaded"
fi
working-directory: /workspace/iklim.co
```
> `.env` is sourced first so `API_SUBDOMAIN=api.iklim.co` (prod values) are used.
> Ensure these vars are in `prod/secrets/iklim.co/.env.prod` on storagebox.
## Step 3 — Add `Wait for etcd` step
Insert **after** `Deploy Swarm Stack` and **before** `Run APISIX Init`.
APISIX reads its entire configuration from etcd; init script will fail silently if etcd is not ready.
```yaml
- name: Wait for etcd
run: |
echo "⏳ Waiting for etcd..."
for i in $(seq 1 30); do
if docker run --rm --network iklimco-net alpine \
sh -c "wget -qO- http://etcd:2379/health 2>/dev/null | grep -q '\"health\":\"true\"'"; then
echo "✅ etcd ready"
break
fi
[ "$i" -eq 30 ] && echo "❌ etcd did not become ready in time" && exit 1
echo " attempt $i/30 — waiting 5s..."
sleep 5
done
```
> **Note:** In prod, the standalone `etcd` service from `docker-stack-infra.yml` still runs (Docker Compose overlay files cannot remove services). APISIX currently uses this etcd; the Patroni etcd migration happens via `docker-stack-infra.prod.yml`. The `http://etcd:2379/health` check targets this standalone service and is correct for the current setup.
## Step 4 — Add `Run APISIX Init` step
Insert **after** `Wait for etcd` and **before** `Bootstrap SWAG Certificate`.
```yaml
- name: Run APISIX Init
run: |
set -a; . ./.env; . ./.env.secrets.shared; set +a
echo "⏳ Waiting for Swarm APISIX..."
until curl -sf -o /dev/null \
-H "X-API-KEY: ${APISIX_ADMIN_KEY}" \
"http://apisix:9180/apisix/admin/upstreams" 2>/dev/null; do
sleep 5
done
export SPRING_PROFILES_ACTIVE=prod
/bin/bash init/apisix-core/init.sh
echo "✅ APISIX routes configured"
working-directory: /workspace/iklim.co
```
> **Prod-specific:** `SPRING_PROFILES_ACTIVE=prod` — test pipeline uses `test`.
> `APISIX_ADMIN_KEY` is sourced from `.env.secrets.shared`.
> The init script is idempotent (PUT semantics); safe to re-run on subsequent deploys.
> With `replicas: 3` in prod, all APISIX instances read the same etcd state — no per-replica init needed.
## Step 5 — Add `Bootstrap SWAG Certificate` step
Insert **after** `Run APISIX Init`:
```yaml
- name: Bootstrap SWAG Certificate
run: |
set -a; . ./.env; set +a
echo "Waiting for SWAG container to start..."
SWAG_CTR=""
for i in $(seq 1 24); do
SWAG_CTR=$(docker ps -q -f name=iklimco_swag 2>/dev/null | head -1)
[ -n "$SWAG_CTR" ] && break
sleep 10
done
if [ -z "$SWAG_CTR" ]; then
echo "❌ SWAG container did not start"
exit 1
fi
CERT_PATH="/config/etc/letsencrypt/live/iklim.co/fullchain.pem"
echo "Waiting for cert (up to 10 min)..."
for i in $(seq 1 20); do
if docker exec "$SWAG_CTR" test -f "$CERT_PATH" 2>/dev/null; then
echo "✅ Cert obtained"
break
fi
echo " attempt $i/20 — waiting 30s..."
sleep 30
done
if ! docker exec "$SWAG_CTR" test -f "$CERT_PATH" 2>/dev/null; then
echo "❌ SWAG did not obtain cert. Logs:"
docker service logs iklimco_swag --tail 50
exit 1
fi
docker exec "$SWAG_CTR" cat "$CERT_PATH" | \
docker run --rm -i -v "${SWAG_CERT_DIR}:/output" alpine \
sh -c "cat > /output/STAR.iklim.co.full.crt && chmod 644 /output/STAR.iklim.co.full.crt"
docker exec "$SWAG_CTR" cat "/config/etc/letsencrypt/live/iklim.co/privkey.pem" | \
docker run --rm -i -v "${SWAG_CERT_DIR}:/output" alpine \
sh -c "cat > /output/STAR.iklim.co_key.pem && chmod 644 /output/STAR.iklim.co_key.pem"
echo "✅ Cert bootstrapped to ${SWAG_CERT_DIR}/"
working-directory: /workspace/iklim.co
```
## Step 6 — Add `Run Database Init Scripts` step
Insert **after** `Bootstrap SWAG Certificate` and **before** `Review Environment`.
```yaml
- name: Run Database Init Scripts
run: |
set -a; . ./.env; . ./.env.secrets.shared; set +a
echo "⏳ Waiting for PostgreSQL..."
until docker run --rm --network iklimco-net \
-e PGPASSWORD="${DATABASE_POSTGRES_ROOT_PASSWD}" \
postgis/postgis:17-3.5 \
pg_isready -h iklimco_postgresql -U "${DATABASE_POSTGRES_ROOT_USER}" -q 2>/dev/null; do
sleep 5
done
for sql_file in $(ls ./init/postgresql/*.sql 2>/dev/null | sort); do
echo "▶ $(basename "$sql_file")"
docker run --rm -i --network iklimco-net \
-e PGPASSWORD="${DATABASE_POSTGRES_ROOT_PASSWD}" \
postgis/postgis:17-3.5 \
psql -h iklimco_postgresql -U "${DATABASE_POSTGRES_ROOT_USER}" < "$sql_file"
done
echo "⏳ Waiting for MongoDB..."
until docker run --rm --network iklimco-net mongo:8 \
mongosh "mongodb://${DATABASE_MONGODB_ROOT_USER}:${DATABASE_MONGODB_ROOT_PASSWD}@iklimco_mongodb/admin" \
--eval "db.runCommand({ping:1})" --quiet 2>/dev/null; do
sleep 5
done
for js_file in $(ls ./init/mongodb/*.js 2>/dev/null | sort); do
echo "▶ $(basename "$js_file")"
docker run --rm -i --network iklimco-net mongo:8 \
mongosh "mongodb://${DATABASE_MONGODB_ROOT_USER}:${DATABASE_MONGODB_ROOT_PASSWD}@iklimco_mongodb/admin" \
--quiet < "$js_file"
done
echo "✅ Database init scripts completed"
working-directory: /workspace/iklim.co
```
> **Prod-specific:** DB hostnames are `iklimco_postgresql` and `iklimco_mongodb` (Swarm VIP service names).
> Test pipeline uses `postgresql` / `mongodb` (unqualified aliases within the same stack).
> SQL and JS files are generated by `Prepare Init Files` step via `init_postgresql` / `init_mongodb` functions in `common-functions.sh`.
> Step is idempotent — scripts use `CREATE IF NOT EXISTS` / `createCollection` semantics.
## Step 7 — Ensure subdomain env vars are in prod `.env`
Add to `prod/secrets/iklim.co/.env.prod` on storagebox:
```bash
API_SUBDOMAIN=api.iklim.co
APIGW_SUBDOMAIN=apigw.iklim.co
RABBITMQ_SUBDOMAIN=rabbitmq.iklim.co
GRAFANA_SUBDOMAIN=grafana.iklim.co
```
## Step 8 — Final step order for prod pipeline
1. Checkout Branch
2. Prepare Folders
3. Set up SSH Key and Add to known_hosts
4. Update Apt Repository and Install Required Tools
5. Fetch Service Secret Files
6. Initialize Servers ← cert scp lines removed
7. Upload Updated Secrets to Storagebox
8. Provision Vault AppRole IDs and Docker Secrets
9. Upload Updated Env to Storagebox
10. Prepare Init Files ← cert copy lines removed
11. Initialize Docker Swarm
12. Stop Docker Compose Services
13. Docker Login to Harbor
14. **Prepare SWAG Directories** ← NEW
15. Bootstrap Vault TLS Placeholder
16. Deploy Swarm Stack
17. **Wait for etcd** ← NEW
18. **Run APISIX Init** ← NEW (`SPRING_PROFILES_ACTIVE=prod`)
19. **Bootstrap SWAG Certificate** ← NEW
20. **Run Database Init Scripts** ← NEW (`iklimco_postgresql`, `iklimco_mongodb`)
21. Review Environment