Environment_Monitoring/.gitea/workflows/deploy-monitoring-test.yml
Murat ÖZDEMİR 58d5c24f41
Deploy Environment Monitoring to Production Environment / deploy (push) Failing after 10s
feat(health-agent): add CI/CD pipeline, Uptime Kuma setup, and runtime configuration
Deploy workflows:
- Integrate health-agent build (test) and image promotion (prod) into monitoring stack workflows
- Add storagebox download of health-agent runtime (.env.monitoring.health-agent-runtime → health-agent/.env) and setup (.env.monitoring.health-agent-setup → health-agent/.env.setup) env files
- Add "Run Uptime Kuma Setup" step: runs setup_uptime_kuma.py inside the built image only when uk_tokens.yml is missing, writes tokens to HEALTH_AGENT_CONFIG_GENERATED_DIR (/mnt/storagebox/monitoring/uk_generated)
- Add health-agent/** and health-agent/deploy/prod.env path triggers to test and prod workflows respectively
- Add HARBOR_CI_TOKEN login and HARBOR_PULL_TOKEN login before stack deploy in both workflows
- Source health-agent/.env before docker stack deploy to expose HEALTH_AGENT_CONFIG_GENERATED_DIR
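
The test workflow's build step derives the release-candidate image reference from the `version` field in health-agent/pyproject.toml. The following is a minimal sketch of that derivation, not the pipeline's actual code: the helper names `read_project_version` and `rc_image_ref` and the sample version `1.2.3` are hypothetical, and the empty-version guard is an added assumption that the workflow itself does not perform.

```shell
# Sketch: derive the release-candidate image reference from pyproject.toml.
# Helper names and the sample version below are illustrative assumptions.

# Print the value of the first `version = "..."` line in the given file.
read_project_version() {
  sed -n 's/^version = "\(.*\)"/\1/p' "$1" | head -n 1
}

# Build the full image reference. Registry and project mirror the values
# used in the workflow; the empty-version guard is an added safety check.
rc_image_ref() {
  if [ -z "$1" ]; then
    echo "version is empty; refusing to build an image tag" >&2
    return 1
  fi
  printf 'registry.tarla.io/iklimco/health-agent:%s-rc\n' "$1"
}

# Demo against a throwaway pyproject.toml (the version 1.2.3 is made up).
demo_dir=$(mktemp -d)
printf '[project]\nname = "health-agent"\nversion = "1.2.3"\n' > "$demo_dir/pyproject.toml"
rc_image_ref "$(read_project_version "$demo_dir/pyproject.toml")"
rm -rf "$demo_dir"
```

Failing early on an empty version avoids pushing an image tagged `health-agent:-rc` when the `version` line is missing or formatted differently.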

Dockerfile:
- Copy config/ and scripts/ into image so setup_uptime_kuma.py can run inside the container

setup_uptime_kuma.py:
- Load .env and .env.setup automatically via python-dotenv (no manual export needed)
- Write uk_tokens.yml to config/generated/ (aligned with container volume mount)

Health checks:
- PATRONI_HOSTS and VAULT_HOSTS are now configurable via env vars (comma-separated host:port); no code change needed when node count changes
- REDIS_SENTINEL_HOSTS now correctly parses host:port format; default updated to redis-sentinel:26379
- Fix NameError in check_patroni_cluster() caused by leftover node variable after loop refactor
- Remove verify_ssl=False from Vault check; vault.iklim.co has a valid certificate
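
The comma-separated host:port format used by PATRONI_HOSTS, VAULT_HOSTS, and REDIS_SENTINEL_HOSTS can be illustrated with a short shell sketch. This is not the health-agent's parser, which lives in the Python code and is not shown here. The function name `parse_host_ports`, the fallback to a default port when an entry has no `:port`, and the tolerance for spaces after commas are all assumptions made for illustration. IPv6 literals are not handled.

```shell
# Sketch: split a comma-separated host:port list into "host port" lines.
# The function name and the default-port fallback are assumptions.

# Usage: parse_host_ports "<list>" <default_port>
# The body runs in a subshell so the IFS and globbing changes stay local.
parse_host_ports() (
  list=$1
  default_port=$2
  IFS=', '   # split on commas; also swallow spaces around entries
  set -f     # disable globbing while the list is expanded unquoted
  for entry in $list; do
    # Skip empty entries such as those produced by ",,".
    if [ -z "$entry" ]; then
      continue
    fi
    case "$entry" in
      *:*) host=${entry%%:*}; port=${entry##*:} ;;
      *)   host=$entry; port=$default_port ;;
    esac
    printf '%s %s\n' "$host" "$port"
  done
)

# Demo with the documented default for Redis Sentinel.
parse_host_ports "${REDIS_SENTINEL_HOSTS:-redis-sentinel:26379}" 26379
```

Because the node list is data rather than code, adding or removing a node only requires changing the variable's value.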

Ops:
- Add ops/build-and-push-health-agent.sh for manual bypass of CI pipeline
- Add health-agent/deploy/prod.env template for prod image promotion manifest

Project structure:
- Move .env.example and .env.setup.example to health-agent/env-example/ (root .gitignore excludes health-agent/.env*)
- Add root .gitignore: excludes uk_tokens.yml, __pycache__, .venv, and env files
- Remove health-agent/.gitignore (superseded by root .gitignore)
2026-06-26 18:45:17 +03:00


name: Deploy Environment Monitoring to Test Environment

on:
  push:
    branches:
      - test
    paths:
      - 'docker-stack-monitoring.yml'
      - 'health-agent/**'
      - 'swag/**'
      - '.gitea/workflows/deploy-monitoring-test.yml'

jobs:
  deploy:
    runs-on: test-runner
    steps:
      - name: Checkout Branch
        uses: actions/checkout@v4

      - name: Connect Runner to Overlay Network
        run: |
          docker network connect iklimco-net $(hostname) || true

      - name: Install Required Tools
        run: |
          sudo sed -i 's|http://archive.ubuntu.com/ubuntu|http://mirror.hetzner.com/ubuntu/packages|g' /etc/apt/sources.list.d/ubuntu.sources || true
          sudo sed -i 's|http://archive.ubuntu.com/ubuntu|http://mirror.hetzner.com/ubuntu/packages|g' /etc/apt/sources.list || true
          sudo sed -i 's|http://security.ubuntu.com/ubuntu|http://mirror.hetzner.com/ubuntu/packages|g' /etc/apt/sources.list.d/ubuntu.sources || true
          sudo sed -i 's|http://security.ubuntu.com/ubuntu|http://mirror.hetzner.com/ubuntu/packages|g' /etc/apt/sources.list || true
          sudo rm -f /etc/apt/sources.list.d/microsoft-prod.list
          sudo rm -f /etc/apt/sources.list.d/git-core-ubuntu-ppa*.list
          sudo rm -f /etc/apt/sources.list.d/github_git-lfs.list
          sudo apt-get update
          sudo apt-get install -y gettext jq

      - name: Set up SSH Key and Add to known_hosts
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.STORAGEBOX_SSH_PRIV }}" > ~/.ssh/id_ed25519
          chmod 600 ~/.ssh/id_ed25519
          ssh-keyscan -p 23 ${{ vars.STORAGEBOX_USER }}.your-storagebox.de >> ~/.ssh/known_hosts

      - name: Download Deploy Inputs
        run: |
          source ./common-functions-base.sh
          export SPRING_PROFILES_ACTIVE=TEST
          rm -f .env .env.secrets.swag health-agent/.env health-agent/.env.setup
          scp -P 23 ${{ vars.STORAGEBOX_USER }}@${{ vars.STORAGEBOX_USER }}.your-storagebox.de:test/secrets/iklim.co/.env ./.env
          scp -P 23 ${{ vars.STORAGEBOX_USER }}@${{ vars.STORAGEBOX_USER }}.your-storagebox.de:test/secrets/iklim.co/.env.secrets.swag ./.env.secrets.swag
          scp -P 23 ${{ vars.STORAGEBOX_USER }}@${{ vars.STORAGEBOX_USER }}.your-storagebox.de:test/secrets/iklim.co/.env.monitoring.health-agent-runtime ./health-agent/.env
          scp -P 23 ${{ vars.STORAGEBOX_USER }}@${{ vars.STORAGEBOX_USER }}.your-storagebox.de:test/secrets/iklim.co/.env.monitoring.health-agent-setup ./health-agent/.env.setup
          require_env_file ./.env "Main env file"
          require_env_file ./.env.secrets.swag "SWAG secrets"
          require_env_file ./health-agent/.env "Health-agent runtime env"
          require_env_file ./health-agent/.env.setup "Health-agent setup env"

      - name: Build and Push Health Agent
        run: |
          source ./common-functions-base.sh
          export SPRING_PROFILES_ACTIVE=TEST
          VERSION=$(sed -n 's/^version = "\(.*\)"/\1/p' health-agent/pyproject.toml)
          IMAGE_TAG="health-agent:${VERSION}-rc"
          IMAGE_FULL="registry.tarla.io/iklimco/${IMAGE_TAG}"
          echo "${{ secrets.HARBOR_CI_TOKEN }}" | \
            docker login registry.tarla.io -u robot-ci-push-iklimco --password-stdin
          docker build -t "${IMAGE_FULL}" health-agent/
          docker push "${IMAGE_FULL}"
          docker pull -q "${IMAGE_FULL}"
          DIGEST=$(docker image inspect "${IMAGE_FULL}" --format '{{index .RepoDigests 0}}')
          if grep -q "^IMAGE_HEALTH_AGENT=" .env; then
            sed -i "s|^IMAGE_HEALTH_AGENT=.*$|IMAGE_HEALTH_AGENT=${IMAGE_TAG}|" .env
          else
            echo "IMAGE_HEALTH_AGENT=${IMAGE_TAG}" >> .env
          fi
          echo "HEALTH_AGENT_IMAGE=${IMAGE_FULL}" >> $GITHUB_ENV
          log_message "SUCCESS" "Pushed: ${IMAGE_FULL}"
          log_message "INFO" "Promotion manifest - write to health-agent/deploy/prod.env on prod-env branch:"
          echo " SOURCE_IMAGE_DIGEST=${DIGEST}"
          echo " PROD_IMAGE_TAG=${VERSION}"

      - name: Run Uptime Kuma Setup
        run: |
          source ./common-functions-base.sh
          export SPRING_PROFILES_ACTIVE=TEST
          source_env_file ./health-agent/.env
          mkdir -p "${HEALTH_AGENT_CONFIG_GENERATED_DIR}"
          if [ ! -f "${HEALTH_AGENT_CONFIG_GENERATED_DIR}/uk_tokens.yml" ]; then
            docker run --rm \
              -v "${HEALTH_AGENT_CONFIG_GENERATED_DIR}:/app/config/generated" \
              --env-file "$(pwd)/health-agent/.env" \
              --env-file "$(pwd)/health-agent/.env.setup" \
              "${HEALTH_AGENT_IMAGE}" \
              python scripts/setup_uptime_kuma.py
            log_message "SUCCESS" "Uptime Kuma setup complete, tokens written to ${HEALTH_AGENT_CONFIG_GENERATED_DIR}"
          else
            log_message "INFO" "uk_tokens.yml already exists, skipping Uptime Kuma setup"
          fi

      - name: Deploy Monitoring Stack
        run: |
          source ./common-functions-base.sh
          export SPRING_PROFILES_ACTIVE=TEST
          source_env_file ./.env
          source_env_file ./health-agent/.env
          export HEALTH_AGENT_ENV_FILE="$(pwd)/health-agent/.env"
          echo "${{ secrets.HARBOR_PULL_TOKEN }}" | \
            docker login registry.tarla.io -u robot-swarm-pull-iklimco --password-stdin
          docker stack deploy \
            --with-registry-auth \
            --resolve-image changed \
            -c docker-stack-monitoring.yml \
            iklimco-monitoring

      - name: Wait for Loki
        run: |
          source ./common-functions-base.sh
          export SPRING_PROFILES_ACTIVE=TEST
          for i in $(seq 1 36); do
            REPLICAS=$(docker service ls --filter name=iklimco-monitoring_loki --format "{{.Replicas}}" | head -1)
            if echo "$REPLICAS" | awk -F'[/ ]' '$1>0 && $1==$2{found=1} END{exit !found}'; then
              log_message "SUCCESS" "Loki is ready: $REPLICAS"
              exit 0
            fi
            log_message "INFO" "Loki not ready yet (${REPLICAS:-missing}), waiting 5s..."
            sleep 5
          done
          docker service ps iklimco-monitoring_loki || true
          exit 1

      - name: Configure SWAG Reverse Proxy
        run: |
          source ./common-functions-base.sh
          export SPRING_PROFILES_ACTIVE=TEST
          source_env_file ./.env
          source_env_file ./.env.secrets.swag
          export PORTAINER_SUBDOMAIN="${PORTAINER_SUBDOMAIN:-portainer-test.iklim.co}"
          export RESTRICTED_IPS_BLOCK="$(echo "$RESTRICTED_IPS" | tr ',' '\n' | sed 's|.*| allow &;|')"
          SWAG_VARS='${PORTAINER_SUBDOMAIN}${RESTRICTED_IPS_BLOCK}'
          for tpl in swag/site-confs/*.conf.tpl; do
            fname=$(basename "${tpl%.tpl}")
            envsubst "$SWAG_VARS" < "$tpl" | docker run --rm -i \
              -v "${SWAG_SITE_CONFS_DIR}:/output" \
              alpine sh -c "cat > /output/${fname}"
            log_message "SUCCESS" "${fname} written"
          done
          SWAG_CTR=$(docker ps -q -f name=iklimco_swag 2>/dev/null | head -1)
          if [ -n "$SWAG_CTR" ]; then
            docker exec "$SWAG_CTR" nginx -t && docker exec "$SWAG_CTR" nginx -s reload
            log_message "SUCCESS" "SWAG nginx reloaded"
          fi

      - name: Update DNS Records
        run: |
          source ./common-functions-base.sh
          export SPRING_PROFILES_ACTIVE=TEST
          source_env_file ./.env
          source_env_file ./.env.secrets.swag
          FLOATING_IP="${{ vars.TEST_FLOATING_IP }}"
          DOMAIN="iklim.co"
          for record in portainer-test; do
            CURRENT=$(curl -s \
              -H "Authorization: sso-key ${GODADDY_KEY}:${GODADDY_SECRET}" \
              "https://api.godaddy.com/v1/domains/${DOMAIN}/records/A/${record}" \
              2>/dev/null | jq -r '.[0].data // empty' 2>/dev/null || true)
            if [ "$CURRENT" = "$FLOATING_IP" ]; then
              log_message "INFO" "${record}.${DOMAIN} -> ${FLOATING_IP} exists, skipping"
            else
              curl -sf -X PUT \
                -H "Authorization: sso-key ${GODADDY_KEY}:${GODADDY_SECRET}" \
                -H "Content-Type: application/json" \
                "https://api.godaddy.com/v1/domains/${DOMAIN}/records/A/${record}" \
                -d "[{\"data\":\"${FLOATING_IP}\",\"ttl\":600}]"
              log_message "SUCCESS" "${record}.${DOMAIN} -> ${FLOATING_IP} added/updated"
            fi
          done

      - name: Verify Deployment
        run: |
          docker service ps iklimco-monitoring_loki \
            --filter "desired-state=running" \
            --format "table {{.Name}}\t{{.Node}}\t{{.CurrentState}}\t{{.Image}}" | head -20
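
The "Wait for Loki" step treats a service as ready when the `Replicas` column reads `N/N` with `N` greater than zero. The sketch below isolates that check so it can be tried without a running Swarm. The wrapper name `replicas_ready` and the sample strings are assumptions for illustration; the `awk` expression is the one used in the workflow.

```shell
# Sketch: decide whether a Swarm "Replicas" string such as "1/1" means ready.
# The function name and sample inputs are illustrative assumptions.

replicas_ready() {
  # Split on "/" or space so a trailing annotation after "N/N" is ignored.
  # Ready means: running count > 0 and running count == desired count.
  echo "$1" | awk -F'[/ ]' '$1>0 && $1==$2{found=1} END{exit !found}'
}

for sample in "1/1" "0/1" "2/3" "3/3 (max 1 per node)" ""; do
  if replicas_ready "$sample"; then
    echo "'$sample' -> ready"
  else
    echo "'$sample' -> not ready"
  fi
done
```

The empty-string case matters in practice: when `docker service ls` returns no row because the service does not exist yet, the check reports "not ready" and the polling loop keeps waiting instead of failing on a parse error.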