# 07 — Vault: 3-Node Raft Cluster (Prod)

## Context

Vault starts directly as a 3-node Raft cluster in prod.

The test environment ran a single Vault instance (file storage, 1 replica on the manager node). Prod skips that single-instance phase and goes straight to Raft HA.

## Vault service configuration

- **Replicas:** 3 (one per service node)
- **Storage:** Raft integrated storage
- **Placement:** `node.labels.type == service` (all 3 app nodes)
- **Cert distribution:** No SSH needed. All nodes mount StorageBox, cert-reloader writes to `SWAG_CERT_DIR=/mnt/storagebox/ssl`, and Vault reads from that path on every node.

### Prerequisites

- [ ] All 3 service nodes are running and labeled `type=service`
- [ ] `/mnt/storagebox/ssl/` directory is mounted and accessible on all 3 app nodes
- [ ] Vault data directory `/opt/iklimco/vault/data/` exists on all 3 nodes (host path volumes)

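
The two directory prerequisites above can be checked with a small script run on each service node. This is a minimal sketch, not part of the existing tooling: `check_vault_dirs` is a hypothetical helper name, and it only tests that the paths exist as directories, not that StorageBox is actually mounted behind them.

```bash
#!/usr/bin/env bash
# Preflight sketch: verify that the directories Vault needs exist on this node.
# `check_vault_dirs` is a hypothetical helper, not an existing project script.

check_vault_dirs() {
  # Usage: check_vault_dirs DIR...
  # Prints one status line per directory; returns non-zero if any is missing.
  local missing=0 dir
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      echo "ok       $dir"
    else
      echo "MISSING  $dir"
      missing=1
    fi
  done
  return "$missing"
}

# Example (run on each of the 3 service nodes):
#   check_vault_dirs /mnt/storagebox/ssl /opt/iklimco/vault/data
```

The node label itself can be inspected from a manager node, for example with `docker node ls --filter node.label=type=service`, which should list all 3 service nodes.
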
### Vault service YAML (docker-stack-infra.prod.yml overlay)

```yaml
vault:
  # ... (image, secrets, healthcheck unchanged from base)
  environment:
    VAULT_LOCAL_CONFIG: >-
      {"api_addr":"https://vault.iklim.co:8200",
      "cluster_addr":"https://{{ .Node.Hostname }}:8201",
      "storage":{"raft":{"path":"/vault/file","node_id":"{{ .Node.Hostname }}"}},
      "listener":[{"tcp":{"address":"0.0.0.0:8200",
      "tls_cert_file":"/vault/certs/STAR.iklim.co.full.crt",
      "tls_key_file":"/vault/certs/STAR.iklim.co_key.pem"}}],
      "default_lease_ttl":"168h","max_lease_ttl":"720h","ui":true}
  volumes:
    - /opt/iklimco/vault/data:/vault/file   # host path per node
    - ${SWAG_CERT_DIR}:/vault/certs:ro      # StorageBox: shared across all nodes, no SSH distribution needed
  deploy:
    mode: replicated
    replicas: 3
    placement:
      max_replicas_per_node: 1
      constraints:
        - node.labels.type == service
```

> `{{ .Node.Hostname }}` is a Docker Swarm Go template that expands to the hostname of the node
> running the task. It gives each Vault instance a unique `node_id` and `cluster_addr`.

## Raft initialization procedure (first deploy)

### Step 1 — Deploy the stack

```bash
docker stack deploy -c docker-stack-infra.yml -c docker-stack-infra.prod.yml iklimco
```

All 3 Vault containers start uninitialized and sealed. The instance on which `vault operator init` is run becomes the initial leader.

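
Before moving on to initialization, it can help to confirm that all 3 replicas are actually running. The sketch below parses the REPLICAS value that `docker service ls` reports (for example `3/3`, or `3/3 (max 1 per node)` when `max_replicas_per_node` is set). `replicas_ready` is a hypothetical helper name, and the service name `iklimco_vault` is assumed from the stack name used above.

```bash
# Sketch: succeed only when the running replica count equals the desired count.
# Input is the REPLICAS field from `docker service ls`, e.g. "3/3 (max 1 per node)".
# `replicas_ready` is a hypothetical helper, not an existing project script.

replicas_ready() {
  local field=${1%% *}        # drop any suffix such as " (max 1 per node)"
  local running=${field%%/*}  # part before the slash
  local desired=${field##*/}  # part after the slash
  [ -n "$running" ] && [ "$running" = "$desired" ] && [ "$desired" != "0" ]
}

# Example, run on a manager node (the name filter is a prefix match, so this
# assumes no other service name starts with "iklimco_vault"):
#   replicas_ready "$(docker service ls --filter name=iklimco_vault --format '{{.Replicas}}')" \
#     && echo "all Vault replicas are running"
```
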
### Step 2 — Initialize Vault on the leader (iklim-app-01)

```bash
# Run on iklim-app-01: `docker ps` only lists containers on the local node.
VAULT_CTR=$(docker ps -q -f name=iklimco_vault)
docker exec -it "$VAULT_CTR" vault operator init
```

Save the unseal keys and root token securely. Store the unseal key that the healthcheck uses as a Docker secret:

```bash
echo -n "<unseal-key>" | docker secret create vault_unseal_key -
```

### Step 3 — Unseal the leader

```bash
docker exec -it "$VAULT_CTR" vault operator unseal
```

The healthcheck auto-unseals on subsequent restarts via the `vault_unseal_key` secret.

### Step 4 — Join remaining nodes to the Raft cluster

Run the join inside the Vault containers on iklim-app-02 and iklim-app-03:

```bash
docker exec -it <vault-on-iklim-app-02> vault operator raft join \
  https://vault.iklim.co:8200

docker exec -it <vault-on-iklim-app-03> vault operator raft join \
  https://vault.iklim.co:8200
```

Unseal each node after joining:

```bash
docker exec -it <vault-on-iklim-app-02> vault operator unseal
docker exec -it <vault-on-iklim-app-03> vault operator unseal
```

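### Step 5 — Verify cluster

```bash
docker exec "$VAULT_CTR" vault operator raft list-peers
```

Expected: 3 peers, one in state `leader` and two in state `follower`. This is an authenticated call, so the Vault CLI in the container needs a valid token (for example from `vault login`).
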
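
The expected result can also be checked mechanically. The following is a sketch that reads the `list-peers` output from stdin and counts peer states. It assumes the default table layout, where the third column is the state, and `check_raft_peers` is a hypothetical helper name.

```bash
# Sketch: validate `vault operator raft list-peers` output read from stdin.
# Assumes the default table layout (Node, Address, State, Voter).
# `check_raft_peers` is a hypothetical helper, not an existing project script.

check_raft_peers() {
  awk '
    $3 == "leader"   { leaders++ }
    $3 == "follower" { followers++ }
    END {
      printf "leaders=%d followers=%d\n", leaders, followers
      # Exit 0 only for exactly one leader and two followers.
      exit !(leaders == 1 && followers == 2)
    }
  '
}

# Example (same shell session as the steps above, so $VAULT_CTR is set):
#   docker exec "$VAULT_CTR" vault operator raft list-peers | check_raft_peers
```

A non-zero exit status means the cluster does not yet have one leader and two followers, for example because a node has not joined or is still sealed.
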
## cert-reloader — no additional changes needed for Raft

cert-reloader writes the cert to `SWAG_CERT_DIR=/mnt/storagebox/ssl`.
Since StorageBox is mounted on all app nodes, every Vault instance already sees the same path.

The cert renewal flow works unchanged with Raft:

```
cert changed → copy to /mnt/storagebox/ssl/ → docker service update --force iklimco_vault
Vault (3 replicas) restart → each auto-unseals via healthcheck
```

## Reference

- Vault Raft storage docs: https://developer.hashicorp.com/vault/docs/configuration/storage/raft
- Vault Swarm setup: https://manjit28.medium.com/setting-up-a-secure-and-highly-available-hashicorp-vault-cluster-for-secrets-and-certificates-0ce01a370582