Environment_Infrastructure/roadmap/prod-env/08-deploy-pipeline-update.md
Murat ÖZDEMİR e3787d80f6 docs(infra): align DB stack and APISIX production guidance
Update Environment_Infrastructure to match the current root stack conventions for database images, shared secret names, and APISIX real IP handling.

- update test Ansible DB image defaults to PostGIS 18/PostGIS 3.6 and MongoDB 8.3.2

- align Patroni configuration with DATABASE_POSTGRES_* secret variable names

- document APISIX real IP template configuration and Harbor rebuild workflow

- replace the separate DB stack env file guidance with the shared .env.secrets.shared flow

- update production setup and roadmap snippets to use current PostGIS, MongoDB, and APISIX rebuild commands
2026-05-20 19:55:49 +03:00


08 — Deploy Pipeline Update (Prod)

Context

  • File: .gitea/workflows/deploy-prod.yml
  • Same changes as test pipeline (test-env-setup/07-deploy-pipeline-update.md), adapted for prod paths and prod runner.
  • Prod-specific differences from test:
    • SPRING_PROFILES_ACTIVE=prod (not test) in Run APISIX Init
    • DB hostnames: postgresql, mongodb (Swarm overlay DNS — same as test)
    • Storagebox paths via env vars (SWAG_CERT_DIR, SWAG_CONFIG_DIR, etc.) instead of local host paths
    • Extra steps: Update DNS Records (GoDaddy API), Wait for etcd

Step 1 — Remove manual cert scp lines from Initialize Workspace

# DELETE from "Initialize Workspace" step:
          scp -P 23 ${{ vars.STORAGEBOX_USER }}@${{ vars.STORAGEBOX_USER }}.your-storagebox.de:prod/app/iklim.co/ssl/STAR.iklim.co.full.crt ./STAR.iklim.co.full.crt
          scp -P 23 ${{ vars.STORAGEBOX_USER }}@${{ vars.STORAGEBOX_USER }}.your-storagebox.de:prod/app/iklim.co/ssl/STAR.iklim.co_key.pem ./STAR.iklim.co_key.pem

Also remove from Prepare Init Files:

# DELETE or make conditional:
          sudo cp STAR.iklim.co.full.crt STAR.iklim.co_key.pem /opt/iklimco/ssl/

Step 2 — Add Update DNS Records step

Insert after Docker Login to Harbor and before Prepare SWAG Directories.

      - name: Update DNS Records
        run: |
          set -a; . ./.env; . ./.env.secrets.swag; set +a
          FLOATING_IP="${{ vars.PROD_FLOATING_IP }}"
          DOMAIN="iklim.co"

          for record in api apigw rabbitmq grafana; do
            CURRENT=$(curl -s \
              -H "Authorization: sso-key ${GODADDY_KEY}:${GODADDY_SECRET}" \
              "https://api.godaddy.com/v1/domains/${DOMAIN}/records/A/${record}" \
              2>/dev/null | jq -r '.[0].data // empty' 2>/dev/null || true)

            if [ "$CURRENT" = "$FLOATING_IP" ]; then
              echo "✅ ${record}.${DOMAIN} → ${FLOATING_IP} (exists, skipping)"
            else
              curl -sf -X PUT \
                -H "Authorization: sso-key ${GODADDY_KEY}:${GODADDY_SECRET}" \
                -H "Content-Type: application/json" \
                "https://api.godaddy.com/v1/domains/${DOMAIN}/records/A/${record}" \
                -d "[{\"data\":\"${FLOATING_IP}\",\"ttl\":600}]"
              echo "✅ ${record}.${DOMAIN} → ${FLOATING_IP} (added/updated)"
            fi
          done
        working-directory: /workspace/iklim.co

GODADDY_KEY and GODADDY_SECRET are read from .env.secrets.swag. PROD_FLOATING_IP must be defined as a Gitea project variable; its value comes from terraform output prod_floating_ip. jq is required, so the Update Apt Repository step must install it: apt-get install -y gettext tree jq. The step runs on every deploy and is idempotent: records that already point to the floating IP are skipped.
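The skip-or-update decision that makes this step idempotent can be isolated in a small helper. The following is a minimal sketch only: `dns_action` is a hypothetical name that does not exist in the pipeline, and the IP addresses are documentation placeholders.

```shell
# Hypothetical helper (not part of the pipeline): decide what the DNS step
# should do for a single A record.
#   $1 = IP currently returned by the DNS API (empty if the record is missing)
#   $2 = desired floating IP
dns_action() {
  if [ "$1" = "$2" ]; then
    echo "skip"      # record already points at the floating IP
  else
    echo "update"    # record is missing or points elsewhere
  fi
}

dns_action "203.0.113.10" "203.0.113.10"   # prints: skip
dns_action ""             "203.0.113.10"   # prints: update
```

Keeping the comparison separate from the curl calls makes it possible to check the decision without touching the GoDaddy API.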

Step 3 — Add Prepare SWAG Directories step

Insert before Bootstrap Vault TLS Placeholder:

      - name: Prepare SWAG Directories
        run: |
          set -a; . ./.env; . ./.env.secrets.swag; set +a

          mkdir -p "$SWAG_CONFIG_DIR/dns-conf" "$SWAG_SITE_CONFS_DIR"

          envsubst < swag/dns-conf/godaddy.ini.tpl | docker run --rm -i \
            -v "${SWAG_CONFIG_DIR}/dns-conf:/output" \
            alpine sh -c "cat > /output/godaddy.ini && chmod 600 /output/godaddy.ini"
          echo "✅ godaddy.ini written"

          export RESTRICTED_IPS_BLOCK="$(echo "$RESTRICTED_IPS" | tr ',' '\n' | sed 's|.*|        allow &;|')"

          SWAG_VARS='${API_SUBDOMAIN}${APIGW_SUBDOMAIN}${GRAFANA_SUBDOMAIN}${RABBITMQ_SUBDOMAIN}${RESTRICTED_IPS_BLOCK}'
          for tpl in swag/site-confs/*.conf.tpl; do
            fname=$(basename "${tpl%.tpl}")
            envsubst "$SWAG_VARS" < "$tpl" | docker run --rm -i \
              -v "${SWAG_SITE_CONFS_DIR}:/output" \
              alpine sh -c "cat > /output/${fname}"
            echo "✅ ${fname}"
          done

          cat swag/site-confs/default.conf | docker run --rm -i \
            -v "${SWAG_SITE_CONFS_DIR}:/output" \
            alpine sh -c "cat > /output/default.conf"

          echo "✅ SWAG directories ready"

          SWAG_CTR=$(docker ps -q -f name=iklimco_swag 2>/dev/null | head -1)
          if [ -n "$SWAG_CTR" ]; then
            docker exec "$SWAG_CTR" nginx -t && docker exec "$SWAG_CTR" nginx -s reload
            echo "✅ SWAG nginx reloaded"
          fi
        working-directory: /workspace/iklim.co

.env is sourced first so that the prod values (for example API_SUBDOMAIN=api.iklim.co) are used. Ensure these variables are present in prod/secrets/iklim.co/.env.prod on the storagebox.
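To see what the RESTRICTED_IPS_BLOCK pipeline in this step produces, the same `tr` and `sed` commands can be wrapped in a function and run on sample input. This is a sketch under assumptions: `render_allow_block` is a hypothetical name, the addresses are placeholders, and the empty-input guard is an addition. The original pipeline renders `allow ;` for an empty list, which nginx would likely reject.

```shell
# Hypothetical helper mirroring the RESTRICTED_IPS_BLOCK pipeline above.
#   $1 = comma-separated list of IPs or CIDR ranges
render_allow_block() {
  # Assumption: an empty list should render nothing instead of "allow ;".
  [ -n "$1" ] || return 0
  # One entry per line, each wrapped as an indented nginx allow directive.
  printf '%s\n' "$1" | tr ',' '\n' | sed 's|.*|        allow &;|'
}

render_allow_block "203.0.113.10,198.51.100.0/24"
# prints:
#         allow 203.0.113.10;
#         allow 198.51.100.0/24;
```

Because `sed` uses `|` as its delimiter, CIDR entries containing `/` pass through unchanged.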

Step 4 — Add Wait for etcd step

Insert after Deploy Swarm Stack and before Run APISIX Init. APISIX reads its entire configuration from etcd; the init script will fail silently if etcd is not ready.

      - name: Wait for etcd
        run: |
          echo "⏳ Waiting for Patroni etcd..."
          for i in $(seq 1 30); do
            if docker run --rm --network iklimco-net alpine \
                sh -c "wget -qO- http://etcd-01:2379/health 2>/dev/null | grep -q '\"health\":\"true\"'"; then
              echo "✅ Patroni etcd ready"
              break
            fi
            [ "$i" -eq 30 ] && echo "❌ Patroni etcd did not become ready in time" && exit 1
            echo "  attempt $i/30 — waiting 5s..."
            sleep 5
          done

Note: In prod, APISIX uses the 3-node Patroni etcd cluster on DB nodes (etcd-01/02/03:2379) via the /apisix prefix — resolved through iklimco-net overlay DNS aliases defined in docker-stack-db.prod.yml. The standalone etcd service from the base stack is disabled (replicas: 0 in the prod overlay) and removed from the service list by a post-deploy step. This step waits for Patroni etcd (etcd-01:2379) to be healthy before running the APISIX init script.
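The bounded polling loop in this step is a pattern that recurs elsewhere in the pipeline. One way to factor it out is a small retry function. The sketch below is illustrative only: `wait_for` is a hypothetical helper, not something defined in common-functions.sh.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
#   $1 = maximum attempts, $2 = seconds between attempts, rest = the command
wait_for() {
  wf_attempts="$1"; wf_delay="$2"; shift 2
  wf_i=1
  while [ "$wf_i" -le "$wf_attempts" ]; do
    if "$@"; then
      return 0                      # command succeeded
    fi
    echo "  attempt $wf_i/$wf_attempts failed" >&2
    if [ "$wf_i" -lt "$wf_attempts" ]; then
      sleep "$wf_delay"             # no sleep after the final attempt
    fi
    wf_i=$((wf_i + 1))
  done
  return 1                          # never succeeded
}

# Possible use in the etcd step (same check as above, 30 attempts, 5s apart):
#   wait_for 30 5 docker run --rm --network iklimco-net alpine \
#     sh -c "wget -qO- http://etcd-01:2379/health 2>/dev/null | grep -q '\"health\":\"true\"'" \
#     || { echo "etcd did not become ready in time"; exit 1; }
```

A non-zero return lets the caller decide how to fail, so the same function could also bound loops that currently wait indefinitely.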

Step 5 — Add Run APISIX Init step

Insert after Wait for etcd and before Bootstrap SWAG Certificate.

      - name: Run APISIX Init
        run: |
          set -a; . ./.env; . ./.env.secrets.shared; set +a
          echo "⏳ Waiting for Swarm APISIX..."
          until curl -sf -o /dev/null \
            -H "X-API-KEY: ${APISIX_ADMIN_KEY}" \
            "http://apisix:9180/apisix/admin/upstreams" 2>/dev/null; do
            sleep 5
          done
          export SPRING_PROFILES_ACTIVE=prod
          /bin/bash init/apisix-core/init.sh
          echo "✅ APISIX routes configured"
        working-directory: /workspace/iklim.co

Prod-specific: SPRING_PROFILES_ACTIVE=prod — test pipeline uses test. APISIX_ADMIN_KEY is sourced from .env.secrets.shared. The init script is idempotent (PUT semantics); safe to re-run on subsequent deploys. With replicas: 3 in prod, all APISIX instances read the same etcd state — no per-replica init needed.

Step 6 — Add Bootstrap SWAG Certificate step

Insert after Run APISIX Init:

      - name: Bootstrap SWAG Certificate
        run: |
          set -a; . ./.env; set +a
          echo "Waiting for SWAG container to start..."
          SWAG_CTR=""
          for i in $(seq 1 24); do
            SWAG_CTR=$(docker ps -q -f name=iklimco_swag 2>/dev/null | head -1)
            [ -n "$SWAG_CTR" ] && break
            sleep 10
          done

          if [ -z "$SWAG_CTR" ]; then
            echo "❌ SWAG container did not start"
            exit 1
          fi

          CERT_PATH="/config/etc/letsencrypt/live/iklim.co/fullchain.pem"
          echo "Waiting for cert (up to 10 min)..."
          for i in $(seq 1 20); do
            if docker exec "$SWAG_CTR" test -f "$CERT_PATH" 2>/dev/null; then
              echo "✅ Cert obtained"
              break
            fi
            echo "  attempt $i/20 — waiting 30s..."
            sleep 30
          done

          if ! docker exec "$SWAG_CTR" test -f "$CERT_PATH" 2>/dev/null; then
            echo "❌ SWAG did not obtain cert. Logs:"
            docker service logs iklimco_swag --tail 50
            exit 1
          fi

          docker exec "$SWAG_CTR" cat "$CERT_PATH" | \
            docker run --rm -i -v "${SWAG_CERT_DIR}:/output" alpine \
              sh -c "cat > /output/STAR.iklim.co.full.crt && chmod 644 /output/STAR.iklim.co.full.crt"
          docker exec "$SWAG_CTR" cat "/config/etc/letsencrypt/live/iklim.co/privkey.pem" | \
            docker run --rm -i -v "${SWAG_CERT_DIR}:/output" alpine \
              sh -c "cat > /output/STAR.iklim.co_key.pem && chmod 644 /output/STAR.iklim.co_key.pem"
          echo "✅ Cert bootstrapped to ${SWAG_CERT_DIR}/"
        working-directory: /workspace/iklim.co
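The last two commands of this step write both files with mode 644, including the private key. If every consumer of the key can read it as the file owner (an assumption that needs checking against the services mounting SWAG_CERT_DIR), a stricter mode may be preferable. A minimal sketch, with `write_private` as a hypothetical helper name:

```shell
# Hypothetical helper: write stdin to a file that only its owner can access.
#   $1 = destination path
write_private() {
  (
    umask 077            # files created in this subshell get mode 0600
    cat > "$1"
  )
  chmod 600 "$1"         # also tighten a file that already existed
}

# Possible use inside the alpine container from the step above:
#   sh -c "umask 077; cat > /output/STAR.iklim.co_key.pem; chmod 600 /output/STAR.iklim.co_key.pem"
```

Setting the umask before the write avoids a short window in which the key exists with looser permissions.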

Step 7 — Add Run Database Init Scripts step

Insert after Bootstrap SWAG Certificate and before Review Environment.

      - name: Run Database Init Scripts
        run: |
          set -a; . ./.env; . ./.env.secrets.shared; set +a

          echo "⏳ Waiting for PostgreSQL..."
          until docker run --rm --network iklimco-net \
            -e PGPASSWORD="${DATABASE_POSTGRES_ROOT_PASSWD}" \
            postgis/postgis:18-3.6 \
            pg_isready -h postgresql -U "${DATABASE_POSTGRES_ROOT_USER}" -q 2>/dev/null; do
            sleep 5
          done
          for sql_file in $(ls ./init/postgresql/*.sql 2>/dev/null | sort); do
            echo "▶ $(basename "$sql_file")"
            docker run --rm -i --network iklimco-net \
              -e PGPASSWORD="${DATABASE_POSTGRES_ROOT_PASSWD}" \
              postgis/postgis:18-3.6 \
              psql -h postgresql -U "${DATABASE_POSTGRES_ROOT_USER}" < "$sql_file"
          done

          echo "⏳ Waiting for MongoDB..."
          until docker run --rm --network iklimco-net mongo:8.3.2 \
            mongosh "mongodb://${DATABASE_MONGODB_ROOT_USER}:${DATABASE_MONGODB_ROOT_PASSWD}@mongodb/admin" \
            --eval "db.runCommand({ping:1})" --quiet 2>/dev/null; do
            sleep 5
          done
          for js_file in $(ls ./init/mongodb/*.js 2>/dev/null | sort); do
            echo "▶ $(basename "$js_file")"
            docker run --rm -i --network iklimco-net mongo:8.3.2 \
              mongosh "mongodb://${DATABASE_MONGODB_ROOT_USER}:${DATABASE_MONGODB_ROOT_PASSWD}@mongodb/admin" \
              --quiet < "$js_file"
          done
          echo "✅ Database init scripts completed"
        working-directory: /workspace/iklim.co

DB hostnames are postgresql and mongodb in both environments. In prod they are Swarm VIP service names; in the test pipeline they are unqualified aliases within the same stack. The SQL and JS files are generated by the Prepare Init Files step via the init_postgresql and init_mongodb functions in common-functions.sh. The step is idempotent: the scripts use CREATE IF NOT EXISTS / createCollection semantics.
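The init loops above iterate over `$(ls ... | sort)`, which splits file names on whitespace. Shell globs already expand in sorted order, so a glob-based loop is a possible alternative. The sketch below uses a hypothetical helper, `list_init_files`, to show the iteration order without running any database client.

```shell
# Hypothetical helper: print init scripts in the order they would be applied.
#   $1 = directory, $2 = extension without the dot (for example: sql)
list_init_files() {
  for lif_file in "$1"/*."$2"; do
    # When nothing matches, the pattern stays literal; skip it.
    [ -e "$lif_file" ] || continue
    printf '%s\n' "$lif_file"
  done
}

# Possible use in the step above:
#   for sql_file in ./init/postgresql/*.sql; do
#     [ -e "$sql_file" ] || continue
#     ...
#   done
```

Numeric prefixes such as 01-, 02- keep the execution order explicit under either approach.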

Step 8 — Microservice prod deploy overlay

Each microservice has its own docker-stack-service.prod.yml overlay file, which sets the prod-specific values replicas: 3 and max_replicas_per_node: 1.

In microservice deploy pipelines (deploy-prod.yml), the docker stack deploy command should be:

docker stack deploy \
  -c BE-<ServiceName>/docker-stack-service.yml \
  -c BE-<ServiceName>/docker-stack-service.prod.yml \
  iklimco

For example, for BE-Authentication:

docker stack deploy \
  -c BE-Authentication/docker-stack-service.yml \
  -c BE-Authentication/docker-stack-service.prod.yml \
  iklimco

When a new microservice is added, BE-<ServiceName>/docker-stack-service.prod.yml must be created and the pipeline must include this overlay.
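A pipeline could guard against a missing overlay before calling docker stack deploy. The following sketch assumes the directory layout shown above; `prod_stack_args` is a hypothetical helper, not an existing function.

```shell
# Hypothetical helper: print the -c arguments for a service's prod deploy,
# or fail when either compose file is missing.
#   $1 = service directory, for example BE-Authentication
prod_stack_args() {
  psa_base="$1/docker-stack-service.yml"
  psa_prod="$1/docker-stack-service.prod.yml"
  for psa_file in "$psa_base" "$psa_prod"; do
    if [ ! -f "$psa_file" ]; then
      echo "missing: $psa_file" >&2
      return 1
    fi
  done
  printf '%s\n' "-c $psa_base -c $psa_prod"
}

# Possible use (relies on word splitting, so paths must not contain spaces):
#   args="$(prod_stack_args BE-Authentication)" || exit 1
#   docker stack deploy $args iklimco
```

Failing before the deploy call avoids rolling out a service with only the base file, which would skip the prod replica settings.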

Step 9 — Ensure subdomain env vars are in prod .env

Add to prod/secrets/iklim.co/.env.prod on storagebox:

API_SUBDOMAIN=api.iklim.co
APIGW_SUBDOMAIN=apigw.iklim.co
RABBITMQ_SUBDOMAIN=rabbitmq.iklim.co
GRAFANA_SUBDOMAIN=grafana.iklim.co
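Since envsubst replaces an unset variable with an empty string, a missing subdomain would only surface later as a broken nginx config. A fail-fast check after sourcing .env is one option. This is a sketch; `require_vars` is a hypothetical helper.

```shell
# Hypothetical helper: fail if any of the named variables is unset or empty.
#   $@ = variable names (trusted input, because eval is used for the lookup)
require_vars() {
  rv_missing=0
  for rv_name in "$@"; do
    eval "rv_value=\${$rv_name:-}"
    if [ -z "$rv_value" ]; then
      echo "missing variable: $rv_name" >&2
      rv_missing=1
    fi
  done
  return "$rv_missing"
}

# Possible use right after sourcing .env:
#   require_vars API_SUBDOMAIN APIGW_SUBDOMAIN RABBITMQ_SUBDOMAIN GRAFANA_SUBDOMAIN || exit 1
```

The function reports every missing name before returning, so a single failed run lists all gaps at once.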

Step 10 — Final step order for prod pipeline

To prevent concurrent deploys, a Gitea Actions concurrency block is added per pipeline:

concurrency:
  group: prod-deploy
  cancel-in-progress: false

With cancel-in-progress: false, a new run waits in the queue until the previous one finishes; the Gitea UI shows it as "queued" and does not return an error.

The final step order is:

  1. Checkout Branch
  2. Prepare Folders
  3. Set up SSH Key and Add to known_hosts
  4. Update Apt Repository and Install Required Tools (gettext tree jq)
  5. Fetch Service Secret Files
  6. Initialize Workspace ← cert scp lines removed
  7. Upload Updated Secrets to Storagebox
  8. Provision Vault AppRole IDs and Docker Secrets
  9. Upload Updated Env to Storagebox
  10. Prepare Init Files ← cert copy lines removed
  11. Initialize Docker Swarm
  12. Docker Login to Harbor
  13. Update DNS Records ← NEW (GoDaddy API, idempotent)
  14. Prepare SWAG Directories ← NEW ($SWAG_CONFIG_DIR/dns-conf; renders nginx conf templates)
  15. Bootstrap Vault TLS Placeholder
  16. Deploy Swarm Stack
  17. Wait for etcd ← NEW (Patroni etcd etcd-01:2379 overlay DNS)
  18. Run APISIX Init ← NEW (SPRING_PROFILES_ACTIVE=prod)
  19. Bootstrap SWAG Certificate ← NEW
  20. Run Database Init Scripts ← NEW (postgresql, mongodb)
  21. Review Environment
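The ordering constraints in this list (for example, Wait for etcd before Run APISIX Init) could be checked mechanically against the workflow file. The sketch below is an assumption-laden illustration: `step_line` and `step_before` are hypothetical helpers, and they expect each step to be declared on its own line as `- name: <step name>`.

```shell
# Hypothetical helper: print the line number of the first step with this name.
#   $1 = workflow file, $2 = exact step name
step_line() {
  awk -v want="$2" '
    /^[ \t]*- name:/ {
      name = $0
      sub(/^[ \t]*- name:[ \t]*/, "", name)   # drop the "- name:" prefix
      sub(/[ \t]+$/, "", name)                # drop trailing whitespace
      if (name == want) { print NR; exit }
    }
  ' "$1"
}

# Hypothetical helper: succeed only if both steps exist and $2 precedes $3.
step_before() {
  sb_first="$(step_line "$1" "$2")"
  sb_second="$(step_line "$1" "$3")"
  [ -n "$sb_first" ] && [ -n "$sb_second" ] && [ "$sb_first" -lt "$sb_second" ]
}

# Possible use:
#   step_before .gitea/workflows/deploy-prod.yml "Wait for etcd" "Run APISIX Init" \
#     || echo "step order violated"
```

A check like this only compares positions in the file; it does not validate the YAML or the contents of each step.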