Murat ÖZDEMİR 392a015b8d fix(vault): Stable Raft cluster formation and reliable multi-node unseal on Docker Swarm
Root cause: Docker Swarm assigns a new random container ID as $HOSTNAME on every
task restart, making node_id, api_addr, and cluster_addr change with each restart.
Vault could not recognize its own Raft data → cluster never reformed after restart.

Fixes:
- docker-stack-vault.yml: add hostname: "vault-{{.Task.Slot}}.iklim.co" so each
  replica gets a stable, slot-based hostname covered by the *.iklim.co wildcard cert.
  Replace STABLE_ID/NODE_ID_PLACEHOLDER logic with a single HOSTNAME_PLACEHOLDER sed.
  Replace single unseal attempt with a retry loop (90×2s) so peer nodes unseal as
  soon as they join Raft, without needing external intervention.
- vault-bootstrap.sh: add ADIM 6b — after rolling restart, wait for Raft leader to
  unseal, wait for all peers to join Raft (vault operator raft list-peers), then
  attempt explicit per-peer unseal via overlay network (best-effort).
  ADIM 4 early-exit now fires N requests to the shared alias; all must return
  Sealed: false before declaring the cluster healthy.
  ADIM 7 polls up to 4 minutes via check_cluster_unsealed (9 shared-alias requests)
  and retries peer unseal on each iteration.
- deploy-prod.yml: health check now fires 9 requests to the shared alias; all must
  return Sealed: false (single-node check was masking partially-sealed clusters).
2026-06-10 18:17:59 +03:00

STEP 1: Create placeholder secrets (manager node)

# Optional: reset shell history
history -w && > ~/.bash_history && history -c

echo "bootstrap" | docker secret create vault_transit_unseal_key -
echo "bootstrap" | docker secret create transit_master_token -

STEP 2: Deploy the stack

docker node update --label-add vault_transit=true iklim-app-03
docker stack deploy --with-registry-auth -c docker-stack-vault.yml iklimco

The main Vault nodes go into a crash loop because transit is not ready yet; this is expected.

STEP 3: Initialize the transit Vault

# Find which node the transit Vault is running on:
docker service ps iklimco_vault-transit

# SSH into that node, then:
docker exec -it $(docker ps -q -f name=iklimco_vault-transit) \
  sh -c 'VAULT_ADDR=http://127.0.0.1:8200 vault operator init -key-shares=1 -key-threshold=1'

# Record Unseal Key 1 and the Initial Root Token.

# Unseal Key 1: ........
# 
# Initial Root Token: hvs.xxxxxxxxxx
# 
# Vault initialized with 1 key shares and a key threshold of 1. Please securely
# distribute the key shares printed above. When the Vault is re-sealed,
# restarted, or stopped, you must supply at least 1 of these keys to unseal it
# before it can start servicing requests.
# 
# Vault does not store the generated root key. Without at least 1 keys to
# reconstruct the root key, Vault will remain permanently sealed!
# 
# It is possible to generate new unseal keys, provided you have a quorum of
# existing unseal keys shares. See "vault operator rekey" for more information.

Unseal Key 1: <redacted>
Initial Root Token: hvs.<redacted>
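
The two values to record can also be pulled out of the init output mechanically. The following is a minimal sketch, assuming the plain-text output layout shown above; `parse_init_output` is a hypothetical helper name and is not part of this repository.

```shell
# Hypothetical helper: read `vault operator init` text output on stdin and
# print the first unseal key and the initial root token as KEY=VALUE lines.
# Assumes the "Unseal Key 1: ..." / "Initial Root Token: ..." layout shown above.
parse_init_output() {
  # Strip carriage returns first: `docker exec -t` allocates a TTY, which
  # turns line endings into CRLF.
  tr -d '\r' | awk '
    /^Unseal Key 1: /       { print "UNSEAL_KEY_1=" $4 }
    /^Initial Root Token: / { print "ROOT_TOKEN=" $4 }
  '
}
```

It would be used by piping the init command from this step into it, for example `docker exec ... vault operator init ... | parse_init_output`. The printed values are secrets, so the output should go to a secure store rather than to the terminal scrollback.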

STEP 4: Manually unseal the transit Vault (this one time only)

docker exec $(docker ps -q -f name=iklimco_vault-transit) \
  sh -c 'VAULT_ADDR=http://127.0.0.1:8200 vault operator unseal UNSEAL_KEY_1'
Key             Value
---             -----
Seal Type       shamir
Initialized     true
Sealed          false
Total Shares    1
Threshold       1
Version         2.0.1
Build Date      2026-05-19T17:20:48Z
Storage Type    file
Cluster Name    vault-cluster-5bd8a332
Cluster ID      b03a2f93-53b0-d32b-9762-c36a9d45df90
HA Enabled      false
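
The row that matters in this output is `Sealed`. As a rough sketch, the check can be scripted by reading the status table from stdin; `status_is_unsealed` is a hypothetical helper name, and the sketch assumes the two-column text format shown above.

```shell
# Hypothetical helper: succeed (exit 0) only if the `vault status` table read
# from stdin has a "Sealed" row whose value is "false".
# The "Seal Type" row does not match, because its first field is "Seal".
status_is_unsealed() {
  awk '
    $1 == "Sealed" { found = 1; ok = ($2 == "false") }
    END            { exit !(found && ok) }
  '
}
```

For example, `docker exec ... sh -c 'VAULT_ADDR=http://127.0.0.1:8200 vault status' | status_is_unsealed && echo "transit is unsealed"`.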

STEP 5: Set up the transit engine

# Create the policy file on the host, then copy it into the container:
cat > /tmp/autounseal-policy.hcl << 'EOF'
path "transit/encrypt/autounseal" { capabilities = ["update"] }
path "transit/decrypt/autounseal" { capabilities = ["update"] }
EOF

docker cp /tmp/autounseal-policy.hcl \
  $(docker ps -q -f name=iklimco_vault-transit):/tmp/
# Successfully copied 128B (transferred 2.05kB) to 61a136a1c04e:/tmp/

docker exec $(docker ps -q -f name=iklimco_vault-transit) \
  sh -c 'VAULT_ADDR=http://127.0.0.1:8200 vault login ROOT_TOKEN'

Success! You are now authenticated. The token information displayed below
is already stored in the token helper. You do NOT need to run "vault login"
again. Future Vault requests will automatically use this token.

(token -> root token)

Key                  Value
---                  -----
token                hvs.<redacted>
token_accessor       6w5ZKxbSSP3S5kz4D6luAmjv
token_duration       ∞
token_renewable      false
token_policies       ["root"]
identity_policies    []
policies             ["root"]

docker exec $(docker ps -q -f name=iklimco_vault-transit) \
  sh -c 'VAULT_ADDR=http://127.0.0.1:8200 vault secrets enable transit'
# Success! Enabled the transit secrets engine at: transit/

docker exec $(docker ps -q -f name=iklimco_vault-transit) \
  sh -c 'VAULT_ADDR=http://127.0.0.1:8200 vault write -f transit/keys/autounseal'
Key                       Value
---                       -----
allow_plaintext_backup    false
auto_rotate_period        0s
deletion_allowed          false
derived                   false
exportable                false
imported_key              false
keys                      map[1:1779831017]
latest_version            1
min_available_version     0
min_decryption_version    1
min_encryption_version    0
name                      autounseal
supports_decryption       true
supports_derivation       true
supports_encryption       true
supports_signing          false
type                      aes256-gcm96

docker exec $(docker ps -q -f name=iklimco_vault-transit) \
  sh -c 'VAULT_ADDR=http://127.0.0.1:8200 vault policy write autounseal /tmp/autounseal-policy.hcl'
# Success! Uploaded policy: autounseal

docker exec $(docker ps -q -f name=iklimco_vault-transit) \
  sh -c 'VAULT_ADDR=http://127.0.0.1:8200 vault token create -policy=autounseal -period=768h -orphan'

(token -> auto unseal token)

Key                  Value
---                  -----
token                hvs.<redacted>
token_accessor       mRgwI0az8UZguETf5iqJWXhb
token_duration       768h
token_renewable      true
token_policies       ["autounseal" "default"]
identity_policies    []
policies             ["autounseal" "default"]

STEP 6: Update the secrets with real values (back on the manager node)

# 6a. Transit unseal key. In order: detach from the service, delete, recreate with the real value, reattach
docker service update --secret-rm vault_transit_unseal_key iklimco_vault-transit
# iklimco_vault-transit
# overall progress: 1 out of 1 tasks 
# 1/1: running   [==================================================>] 
# verify: Service iklimco_vault-transit converged

docker secret rm vault_transit_unseal_key
# vault_transit_unseal_key

echo "UNSEAL_KEY_1" | docker secret create vault_transit_unseal_key -
docker service update --secret-add vault_transit_unseal_key iklimco_vault-transit
# iklimco_vault-transit
# overall progress: 1 out of 1 tasks 
# 1/1: running   [==================================================>] 
# verify: Service iklimco_vault-transit converged
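
Docker secrets are immutable, which is why 6a detaches, deletes, recreates, and reattaches instead of editing the secret in place. The four commands can be wrapped as a rough sketch like the one below; `rotate_secret` is a hypothetical helper name, not something defined in this repository.

```shell
# Hypothetical helper mirroring step 6a: replace the value of a Docker secret
# that is attached to a service. The new value is read from stdin.
# Usage: printf '%s' "$NEW_VALUE" | rotate_secret <secret-name> <service-name>
rotate_secret() {
  secret=$1
  service=$2
  # Capture stdin up front; $(...) also strips trailing newlines.
  value=$(cat)
  # Each step runs only if the previous one succeeded.
  docker service update --secret-rm "$secret" "$service" &&
    docker secret rm "$secret" &&
    printf '%s' "$value" | docker secret create "$secret" - &&
    docker service update --secret-add "$secret" "$service"
}
```

With the names from this step, the call would be `printf '%s' "UNSEAL_KEY_1" | rotate_secret vault_transit_unseal_key iklimco_vault-transit`. Between the first and last command the service runs without the secret, which is acceptable here but is the reason 6c uses a different approach.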

# 6b. Verify that the transit Vault is unsealed (on iklim-app-03)
docker exec $(docker ps -q -f name=iklimco_vault-transit) \
  sh -c 'VAULT_ADDR=http://127.0.0.1:8200 vault status'
# Sealed must be false. If it is still sealed, unseal it manually:
docker exec $(docker ps -q -f name=iklimco_vault-transit) \
  sh -c 'VAULT_ADDR=http://127.0.0.1:8200 vault operator unseal UNSEAL_KEY_1'
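
The "check, then unseal only if needed" logic of 6b can be expressed with the exit code of `vault status`, which is 0 when unsealed, 2 when sealed, and 1 on error. A minimal sketch, assuming it runs inside the transit container with `VAULT_ADDR` already exported; `unseal_if_sealed` is a hypothetical helper name.

```shell
# Hypothetical helper mirroring step 6b: unseal only when `vault status`
# reports the sealed state.
# Usage: unseal_if_sealed <unseal-key>
unseal_if_sealed() {
  rc=0
  vault status >/dev/null 2>&1 || rc=$?
  case $rc in
    0) echo "already unsealed" ;;
    2) vault operator unseal "$1" >/dev/null && echo "unsealed" ;;
    *) echo "vault status failed (exit $rc)" >&2; return 1 ;;
  esac
}
```

Treating exit code 1 as a hard failure keeps a network or address problem from being mistaken for a sealed Vault.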

# 6c. Autounseal token: ATOMIC SWAP (Vault never restarts without a token)
# WARNING: --secret-rm and --secret-add must be passed in the SAME command
echo "hvs.AUTOUNSEAL_TOKEN" | docker secret create transit_master_token_v2 -
docker service update \
  --secret-rm transit_master_token \
  --secret-add source=transit_master_token_v2,target=transit_master_token \
  iklimco_vault
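
The swap in 6c works because the new secret is mounted under the old target name within the same service update, so the tasks are redeployed once and always find a file at the path they expect. A small sketch of the same command as a reusable function; `swap_secret` is a hypothetical helper name.

```shell
# Hypothetical helper mirroring step 6c: remove the old secret and mount the
# new one under the old target name in a single `docker service update`.
# Usage: swap_secret <service> <old-secret> <new-secret>
swap_secret() {
  service=$1
  old=$2
  new=$3
  docker service update \
    --secret-rm "$old" \
    --secret-add "source=$new,target=$old" \
    "$service"
}
```

The call matching this step would be `swap_secret iklimco_vault transit_master_token transit_master_token_v2`.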

STEP 7: Initialize the main Vault cluster

# Once transit is unsealed and the Vault nodes are stable (~1-2 min):

docker service ps iklimco_vault   # find which node vault.1 is running on
# SSH into that node, then:
docker exec $(docker ps -q -f name=iklimco_vault.1) vault operator init

# Record the Recovery Keys and the Root Token. Done.
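
The commit note at the top describes two patterns that also fit the "wait until stable" part of this step: a bounded retry loop (90 attempts, 2 seconds apart) and a health check that only passes when several consecutive requests all succeed. The scripts that implement them are not shown here, so the following is only a sketch of the two ideas; `retry` and `all_succeed` are hypothetical helper names.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds after each failure; succeed as soon as the command succeeds.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # No need to sleep after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Hypothetical helper: succeed only if the command succeeds <n> times in a row.
# Behind a load-balanced alias, repeated requests are likely to reach
# different replicas, so a single sealed node makes the whole check fail.
# Usage: all_succeed <n> <command> [args...]
all_succeed() {
  n=$1
  shift
  j=1
  while [ "$j" -le "$n" ]; do
    "$@" || return 1
    j=$((j + 1))
  done
  return 0
}
```

The two compose as `retry 90 2 all_succeed 9 check_one`, where `check_one` stands for whatever single-request probe applies (for instance a `vault status` call against the shared alias). The helpers use distinct variable names because plain POSIX shell functions share one global scope.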