Environment_Infrastructure/roadmap/prod-env/08-deploy-pipeline-update.md
Murat ÖZDEMİR 5ddba7eba4 docs: update production roadmap for HA Vault and shared storage
- Refactor production setup documentation to reflect a 3-node Vault Raft cluster starting from launch.
- Update all paths to use StorageBox mounts for shared state (SWAG config, TLS certs, Monitoring data).
- Switch Nginx configuration convention from proxy-confs to site-confs to align with SWAG's auto-include behavior.
- Standardize TLS private key extensions to .pem.
- Update node failover and recovery facts to include monitoring services.
- Align deployment pipeline instructions with the latest environment variable-driven approach.
2026-05-16 16:18:21 +03:00

08 — Deploy Pipeline Update (Prod)

Context

  • File: .gitea/workflows/deploy-prod.yml
  • Same changes as test pipeline (test-env-setup/07-deploy-pipeline-update.md), adapted for prod paths and prod runner.
  • Prod-specific differences from test:
    • SPRING_PROFILES_ACTIVE=prod (not test) in Run APISIX Init
    • DB hostnames use stack-prefixed Swarm service names (VIPs): iklimco_postgresql, iklimco_mongodb
    • Storagebox paths use prod/ instead of test/

Step 1 — Remove manual cert scp lines from Initialize Servers

# DELETE from "Initialize Servers" step:
          scp -P 23 ${{ vars.STORAGEBOX_USER }}@${{ vars.STORAGEBOX_USER }}.your-storagebox.de:prod/app/iklim.co/ssl/STAR.iklim.co.full.crt ./STAR.iklim.co.full.crt
          scp -P 23 ${{ vars.STORAGEBOX_USER }}@${{ vars.STORAGEBOX_USER }}.your-storagebox.de:prod/app/iklim.co/ssl/STAR.iklim.co_key.pem ./STAR.iklim.co_key.pem

Also remove from Prepare Init Files:

# DELETE or make conditional:
          sudo cp STAR.iklim.co.full.crt STAR.iklim.co_key.pem /opt/iklimco/ssl/

Step 2 — Add Prepare SWAG Directories step

Insert before Bootstrap Vault TLS Placeholder:

      - name: Prepare SWAG Directories
        run: |
          set -a; . ./.env; . ./.env.secrets.swag; set +a

          mkdir -p "$SWAG_CONFIG_DIR" "$SWAG_DNS_CONF_DIR" "$SWAG_SITE_CONFS_DIR"

          envsubst < swag/dns-conf/godaddy.ini.tpl | docker run --rm -i \
            -v "${SWAG_DNS_CONF_DIR}:/output" \
            alpine sh -c "cat > /output/godaddy.ini && chmod 600 /output/godaddy.ini"
          echo "✅ godaddy.ini written"

          export RESTRICTED_IPS_BLOCK="$(echo "$RESTRICTED_IPS" | tr ',' '\n' | sed 's|.*|        allow &;|')"

          SWAG_VARS='${API_SUBDOMAIN}${APIGW_SUBDOMAIN}${GRAFANA_SUBDOMAIN}${RABBITMQ_SUBDOMAIN}${RESTRICTED_IPS_BLOCK}'
          for tpl in swag/site-confs/*.conf.tpl; do
            fname=$(basename "${tpl%.tpl}")
            envsubst "$SWAG_VARS" < "$tpl" | docker run --rm -i \
              -v "${SWAG_SITE_CONFS_DIR}:/output" \
              alpine sh -c "cat > /output/${fname}"
            echo "✅ ${fname}"
          done

          # default.conf is not a template; copy it verbatim
          docker run --rm -i \
            -v "${SWAG_SITE_CONFS_DIR}:/output" \
            alpine sh -c "cat > /output/default.conf" < swag/site-confs/default.conf

          echo "✅ SWAG directories ready"

          SWAG_CTR=$(docker ps -q -f name=iklimco_swag 2>/dev/null | head -1)
          if [ -n "$SWAG_CTR" ]; then
            docker exec "$SWAG_CTR" nginx -t && docker exec "$SWAG_CTR" nginx -s reload
            echo "✅ SWAG nginx reloaded"
          fi
        working-directory: /workspace/iklim.co

The step sources .env first so that the prod values (for example API_SUBDOMAIN=api.iklim.co) are substituted into the templates. Ensure these variables are present in prod/secrets/iklim.co/.env.prod on the storagebox (see Step 7).
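
To make the RESTRICTED_IPS_BLOCK line concrete, here is a small sketch of what the tr/sed pipeline produces. The two addresses are hypothetical documentation-range values, not the real allow-list.

```shell
# Hypothetical sample; in the pipeline RESTRICTED_IPS comes from the sourced env files.
RESTRICTED_IPS="203.0.113.10,198.51.100.0/24"

# Same transformation as in the step: one comma-separated entry per line,
# each wrapped as an indented nginx "allow" directive.
RESTRICTED_IPS_BLOCK="$(echo "$RESTRICTED_IPS" | tr ',' '\n' | sed 's|.*|        allow &;|')"

printf '%s\n' "$RESTRICTED_IPS_BLOCK"
# prints:
#         allow 203.0.113.10;
#         allow 198.51.100.0/24;
```

Because envsubst is called with an explicit variable list (SWAG_VARS), only those five placeholders are replaced; any nginx runtime variables (such as $host) that the templates may contain are left untouched.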

Step 3 — Add Wait for etcd step

Insert after Deploy Swarm Stack and before Run APISIX Init. APISIX reads its entire configuration from etcd, so the init script fails silently if etcd is not ready.

      - name: Wait for etcd
        run: |
          echo "⏳ Waiting for etcd..."
          for i in $(seq 1 30); do
            if docker run --rm --network iklimco-net alpine \
                sh -c "wget -qO- http://etcd:2379/health 2>/dev/null | grep -q '\"health\":\"true\"'"; then
              echo "✅ etcd ready"
              break
            fi
            [ "$i" -eq 30 ] && echo "❌ etcd did not become ready in time" && exit 1
            echo "  attempt $i/30 — waiting 5s..."
            sleep 5
          done

Note: In prod, the standalone etcd service from docker-stack-infra.yml keeps running, because Docker Compose overlay files can add or override services but cannot remove them. APISIX currently uses this etcd; the Patroni etcd migration happens via docker-stack-infra.prod.yml. The http://etcd:2379/health check therefore targets the standalone service, which is correct for the current setup.
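
The polling loop in this step follows a pattern that could be factored into a small helper. The following is a minimal sketch, not part of the current pipeline; the function name retry and its argument order are assumptions made for illustration. A bounded helper like this could also replace the unbounded until loops in Run APISIX Init and Run Database Init Scripts.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  retry_max=$1
  retry_delay=$2
  shift 2
  retry_n=1
  while [ "$retry_n" -le "$retry_max" ]; do
    if "$@"; then
      return 0
    fi
    echo "  attempt $retry_n/$retry_max failed" >&2
    # Sleep only between attempts, not after the last one.
    if [ "$retry_n" -lt "$retry_max" ]; then
      sleep "$retry_delay"
    fi
    retry_n=$((retry_n + 1))
  done
  return 1
}

# Self-contained demo: a check that succeeds on its third call.
calls=0
flaky_check() {
  calls=$((calls + 1))
  [ "$calls" -ge 3 ]
}

if retry 5 0 flaky_check; then
  echo "ready after $calls attempts"
else
  echo "did not become ready" >&2
fi
```

In the pipeline, the docker run ... wget health probe would be wrapped in a function of its own (say, a hypothetical etcd_healthy) and called as retry 30 5 etcd_healthy, which keeps the same 30 x 5s budget as the loop in this step.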

Step 4 — Add Run APISIX Init step

Insert after Wait for etcd and before Bootstrap SWAG Certificate.

      - name: Run APISIX Init
        run: |
          set -a; . ./.env; . ./.env.secrets.shared; set +a
          echo "⏳ Waiting for Swarm APISIX..."
          until curl -sf -o /dev/null \
            -H "X-API-KEY: ${APISIX_ADMIN_KEY}" \
            "http://apisix:9180/apisix/admin/upstreams" 2>/dev/null; do
            sleep 5
          done
          export SPRING_PROFILES_ACTIVE=prod
          /bin/bash init/apisix-core/init.sh
          echo "✅ APISIX routes configured"
        working-directory: /workspace/iklim.co

Prod-specific: SPRING_PROFILES_ACTIVE=prod (the test pipeline uses test). APISIX_ADMIN_KEY is sourced from .env.secrets.shared. The init script is idempotent (PUT semantics), so it is safe to re-run on subsequent deploys. With replicas: 3 in prod, all APISIX instances read the same etcd state, so no per-replica init is needed.

Step 5 — Add Bootstrap SWAG Certificate step

Insert after Run APISIX Init:

      - name: Bootstrap SWAG Certificate
        run: |
          set -a; . ./.env; set +a
          echo "Waiting for SWAG container to start..."
          SWAG_CTR=""
          for i in $(seq 1 24); do
            SWAG_CTR=$(docker ps -q -f name=iklimco_swag 2>/dev/null | head -1)
            [ -n "$SWAG_CTR" ] && break
            sleep 10
          done

          if [ -z "$SWAG_CTR" ]; then
            echo "❌ SWAG container did not start"
            exit 1
          fi

          CERT_PATH="/config/etc/letsencrypt/live/iklim.co/fullchain.pem"
          echo "Waiting for cert (up to 10 min)..."
          for i in $(seq 1 20); do
            if docker exec "$SWAG_CTR" test -f "$CERT_PATH" 2>/dev/null; then
              echo "✅ Cert obtained"
              break
            fi
            echo "  attempt $i/20 — waiting 30s..."
            sleep 30
          done

          if ! docker exec "$SWAG_CTR" test -f "$CERT_PATH" 2>/dev/null; then
            echo "❌ SWAG did not obtain cert. Logs:"
            docker service logs iklimco_swag --tail 50
            exit 1
          fi

          docker exec "$SWAG_CTR" cat "$CERT_PATH" | \
            docker run --rm -i -v "${SWAG_CERT_DIR}:/output" alpine \
              sh -c "cat > /output/STAR.iklim.co.full.crt && chmod 644 /output/STAR.iklim.co.full.crt"
          docker exec "$SWAG_CTR" cat "/config/etc/letsencrypt/live/iklim.co/privkey.pem" | \
            docker run --rm -i -v "${SWAG_CERT_DIR}:/output" alpine \
              sh -c "cat > /output/STAR.iklim.co_key.pem && chmod 644 /output/STAR.iklim.co_key.pem"
          echo "✅ Cert bootstrapped to ${SWAG_CERT_DIR}/"
        working-directory: /workspace/iklim.co
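
The two docker exec ... | docker run ... pipelines in this step write whatever arrives on stdin, so an empty file is possible if the cat inside the SWAG container fails and the shell is not running with pipefail. A post-copy sanity check is one way to catch this. The sketch below is an assumption about how such a check might look; the function name looks_like_pem is hypothetical, and it only inspects the PEM header line rather than validating the certificate or key.

```shell
# looks_like_pem FILE
# Hypothetical helper: succeeds if FILE is non-empty and its first line is a
# PEM "-----BEGIN ...-----" header. It does not parse or verify the contents.
looks_like_pem() {
  [ -s "$1" ] || return 1
  head -n 1 "$1" | grep -q '^-----BEGIN [A-Z ]*-----$'
}

# Demo with throwaway files (not real key material).
demo_dir=$(mktemp -d)
printf '%s\n' '-----BEGIN CERTIFICATE-----' 'placeholder' '-----END CERTIFICATE-----' > "$demo_dir/ok.crt"
: > "$demo_dir/empty.pem"

looks_like_pem "$demo_dir/ok.crt" && echo "ok.crt: PEM header found"
looks_like_pem "$demo_dir/empty.pem" || echo "empty.pem: rejected"

rm -rf "$demo_dir"
```

If adopted, the check would presumably run against the two files under ${SWAG_CERT_DIR} right after the copy, failing the step when either one is rejected.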

Step 6 — Add Run Database Init Scripts step

Insert after Bootstrap SWAG Certificate and before Review Environment.

      - name: Run Database Init Scripts
        run: |
          set -a; . ./.env; . ./.env.secrets.shared; set +a

          echo "⏳ Waiting for PostgreSQL..."
          until docker run --rm --network iklimco-net \
            -e PGPASSWORD="${DATABASE_POSTGRES_ROOT_PASSWD}" \
            postgis/postgis:17-3.5 \
            pg_isready -h iklimco_postgresql -U "${DATABASE_POSTGRES_ROOT_USER}" -q 2>/dev/null; do
            sleep 5
          done
          for sql_file in ./init/postgresql/*.sql; do
            [ -e "$sql_file" ] || continue  # glob matched nothing
            echo "▶ $(basename "$sql_file")"
            docker run --rm -i --network iklimco-net \
              -e PGPASSWORD="${DATABASE_POSTGRES_ROOT_PASSWD}" \
              postgis/postgis:17-3.5 \
              psql -h iklimco_postgresql -U "${DATABASE_POSTGRES_ROOT_USER}" < "$sql_file"
          done

          echo "⏳ Waiting for MongoDB..."
          until docker run --rm --network iklimco-net mongo:8 \
            mongosh "mongodb://${DATABASE_MONGODB_ROOT_USER}:${DATABASE_MONGODB_ROOT_PASSWD}@iklimco_mongodb/admin" \
            --eval "db.runCommand({ping:1})" --quiet 2>/dev/null; do
            sleep 5
          done
          for js_file in ./init/mongodb/*.js; do
            [ -e "$js_file" ] || continue  # glob matched nothing
            echo "▶ $(basename "$js_file")"
            docker run --rm -i --network iklimco-net mongo:8 \
              mongosh "mongodb://${DATABASE_MONGODB_ROOT_USER}:${DATABASE_MONGODB_ROOT_PASSWD}@iklimco_mongodb/admin" \
              --quiet < "$js_file"
          done
          echo "✅ Database init scripts completed"
        working-directory: /workspace/iklim.co

Prod-specific: DB hostnames are iklimco_postgresql and iklimco_mongodb (Swarm VIP service names). The test pipeline uses postgresql / mongodb (unqualified aliases within the same stack). The SQL and JS files are generated by the Prepare Init Files step via the init_postgresql / init_mongodb functions in common-functions.sh. The step is idempotent: the scripts use CREATE IF NOT EXISTS / createCollection semantics.
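
Because the init scripts are applied in sorted file-name order, numeric prefixes only behave as expected when they are zero-padded. The sketch below illustrates this with hypothetical file names (the real names are produced by init_postgresql and are not shown in this document); LC_ALL=C is set only to make the demo's ordering deterministic.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/2-schema.sql" "$demo_dir/10-seed.sql"

# Unpadded prefixes: "10-" sorts before "2-" because names compare per character.
unpadded=$(cd "$demo_dir" && printf '%s\n' *.sql | LC_ALL=C sort | tr '\n' ' ')
echo "unpadded: $unpadded"
# prints: unpadded: 10-seed.sql 2-schema.sql

# Zero-padding restores the intended order.
mv "$demo_dir/2-schema.sql" "$demo_dir/02-schema.sql"
padded=$(cd "$demo_dir" && printf '%s\n' *.sql | LC_ALL=C sort | tr '\n' ' ')
echo "padded:   $padded"
# prints: padded:   02-schema.sql 10-seed.sql

rm -rf "$demo_dir"
```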

Step 7 — Ensure subdomain env vars are in prod .env

Add to prod/secrets/iklim.co/.env.prod on the storagebox:

API_SUBDOMAIN=api.iklim.co
APIGW_SUBDOMAIN=apigw.iklim.co
RABBITMQ_SUBDOMAIN=rabbitmq.iklim.co
GRAFANA_SUBDOMAIN=grafana.iklim.co
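
Since envsubst substitutes an empty string for any unset variable, a missing subdomain would silently render into the SWAG site configs. A preflight check before Prepare SWAG Directories may be worthwhile. This is a sketch under the assumption that such a check is wanted; require_vars is a hypothetical helper, not something the pipeline currently defines.

```shell
# require_vars NAME...
# Hypothetical helper: reports every listed variable that is unset or empty
# and returns non-zero if any is missing.
require_vars() {
  rv_missing=0
  for rv_name in "$@"; do
    # Indirect lookup of the variable named by $rv_name (names are hard-coded by the caller).
    eval "rv_value=\${$rv_name:-}"
    if [ -z "$rv_value" ]; then
      echo "❌ missing env var: $rv_name" >&2
      rv_missing=1
    fi
  done
  return "$rv_missing"
}

# Demo values copied from the list above.
API_SUBDOMAIN=api.iklim.co
APIGW_SUBDOMAIN=apigw.iklim.co
RABBITMQ_SUBDOMAIN=rabbitmq.iklim.co
GRAFANA_SUBDOMAIN=grafana.iklim.co

if require_vars API_SUBDOMAIN APIGW_SUBDOMAIN RABBITMQ_SUBDOMAIN GRAFANA_SUBDOMAIN; then
  echo "✅ all subdomain vars set"
fi
```

In the pipeline the assignments would come from sourcing .env rather than being set inline, and a failing check would end the step with exit 1.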

Step 8 — Final step order for prod pipeline

  1. Checkout Branch
  2. Prepare Folders
  3. Set up SSH Key and Add to known_hosts
  4. Update Apt Repository and Install Required Tools
  5. Fetch Service Secret Files
  6. Initialize Servers ← cert scp lines removed
  7. Upload Updated Secrets to Storagebox
  8. Provision Vault AppRole IDs and Docker Secrets
  9. Upload Updated Env to Storagebox
  10. Prepare Init Files ← cert copy lines removed
  11. Initialize Docker Swarm
  12. Stop Docker Compose Services
  13. Docker Login to Harbor
  14. Prepare SWAG Directories ← NEW
  15. Bootstrap Vault TLS Placeholder
  16. Deploy Swarm Stack
  17. Wait for etcd ← NEW
  18. Run APISIX Init ← NEW (SPRING_PROFILES_ACTIVE=prod)
  19. Bootstrap SWAG Certificate ← NEW
  20. Run Database Init Scripts ← NEW (iklimco_postgresql, iklimco_mongodb)
  21. Review Environment