- Database nodes now join the Docker Swarm as workers with `role=db` labels, allowing Swarm to manage their dedicated services.
- The `docker-stack-infra.yml` has been updated for production to focus solely on application-level infrastructure components.
- Dedicated database services (PostgreSQL, MongoDB, Patroni-etcd) are now explicitly deployed in separate Swarm stacks on `iklim-db-XX` nodes.
- Standardizes node naming conventions (`iklim-app-XX`, `iklim-db-XX`) across the production roadmap documentation.
- Clarifies that the `etcd` service within `docker-stack-infra.yml` is exclusively for APISIX configuration, distinct from Patroni's etcd cluster.
# 03 — docker-stack-infra.yml Changes (Prod)

## Context

- **File:** `docker-stack-infra.yml` (repo root, shared between test and prod)
- All changes from `test-env/03-infra-stack-changes.md` apply here identically.
- **Additional prod-specific changes:**
  - Microservices have no placement constraint (Swarm distributes them across app nodes).
  - Replica counts for stateless services are increased.
- **Note:** PostgreSQL and MongoDB are **not** in `docker-stack-infra.yml` for prod. They run on
  dedicated DB nodes in separate stacks (`iklim-db` and `iklim-patroni`). See `08-prod-db-cluster-kurulum.md`.

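
To make the split in the note above concrete, a service in one of the separate DB stacks could be tied to the dedicated nodes through a node label. This is only a sketch, assuming the DB nodes carry a `role=db` label; the service name `postgres` is hypothetical, and the real definitions belong to `08-prod-db-cluster-kurulum.md`, not to this file.

```yaml
# Sketch of a DB-stack service. NOT part of docker-stack-infra.yml.
services:
  postgres:                          # hypothetical service name
    deploy:
      placement:
        constraints:
          # Assumes the label was set beforehand with:
          #   docker node update --label-add role=db <node>
          - node.labels.role == db
```
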
## Step 1 — Apply all test-env changes first

Follow every step in `test-env/03-infra-stack-changes.md`:

- Add `swag` service
- Add `cert-reloader` service
- Remove published ports for vault, apisix, rabbitmq, prometheus, grafana, apisix-dashboard
- Add `swag-vl` volume

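
As a rough illustration of the shape of these changes (the authoritative steps are in `test-env/03-infra-stack-changes.md`), removing a published port and declaring the new volume might look like the fragment below. The port mapping and the network name `infra-net` are hypothetical placeholders, not values taken from the real file.

```yaml
services:
  grafana:
    # Removed: the service is no longer published on the host.
    # ports:
    #   - "3000:3000"        # hypothetical mapping, shown only to mark what is deleted
    networks:
      - infra-net            # hypothetical overlay network shared with swag

volumes:
  swag-vl:                   # named volume added for the swag service

networks:
  infra-net:                 # hypothetical name
    driver: overlay
```

With no `ports:` entry, the service stays reachable only from other services on the same overlay network.
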
## Step 2 — Pin Vault to manager node (initial prod — single instance)

Vault starts as a single instance pinned to the manager node.
Raft cluster migration is handled separately in `07-vault-raft-plan.md`.

```yaml
# Vault placement stays as:
placement:
  constraints:
    - node.role == manager
```

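
For orientation, the placement block above sits under the service's `deploy` key. A minimal sketch of the surrounding structure, assuming the service is named `vault` and runs one replica (image, volumes, and networks are omitted, so this fragment is not deployable on its own):

```yaml
services:
  vault:
    # image, volumes, networks, etc. are unchanged and omitted here
    deploy:
      mode: replicated
      replicas: 1            # assumption: single instance until the Raft migration
      placement:
        constraints:
          - node.role == manager
```
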
## Step 3 — Increase APISIX replicas for prod

```yaml
# CHANGE in apisix service deploy block:
mode: replicated
replicas: 2   # was 1
```

APISIX is stateless (its configuration lives in etcd), so running multiple replicas is safe.
Swarm load-balances SWAG's requests across the APISIX replicas through the service's virtual IP (VIP).

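
In context, the replica change lives in the `deploy` block of the `apisix` service. The sketch below also shows a rolling-update policy and a commented-out placement constraint; both are assumptions added for illustration and are not part of the original file.

```yaml
services:
  apisix:
    deploy:
      mode: replicated
      replicas: 2
      update_config:         # assumption: not in the original file
        parallelism: 1       # update one replica at a time
        delay: 10s
        order: start-first   # start the new task before stopping the old one
      # Optional, assumption: keep replicas off nodes labeled role=db
      # placement:
      #   constraints:
      #     - node.labels.role != db
```

Updating one replica at a time with `start-first` keeps at least one APISIX task serving requests while the other is replaced.
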
## Step 4 — etcd: single instance in docker-stack-infra.yml (APISIX config store only)

The `etcd` service in `docker-stack-infra.yml` is used exclusively by APISIX as its configuration
store. It runs as a single instance on a manager node and is separate from the etcd cluster used by
Patroni for PostgreSQL HA.

```yaml
# etcd placement stays as:
placement:
  constraints:
    - node.role == manager
```

> The 3-node etcd cluster for Patroni/PostgreSQL HA is deployed separately via `08-prod-db-cluster-kurulum.md`
> on the dedicated DB nodes. These are two independent etcd deployments with different purposes.

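
On the APISIX side, the link to this etcd instance is a single endpoint in APISIX's own `config.yaml`. The fragment below is a sketch that assumes the APISIX 3.x configuration layout and that the Swarm service is reachable as `etcd` on the default client port 2379; older APISIX releases use a different key layout, so check it against the deployed version.

```yaml
# APISIX config.yaml fragment (assumed 3.x layout), not docker-stack-infra.yml
deployment:
  role: traditional
  role_traditional:
    config_provider: etcd
  etcd:
    host:
      - "http://etcd:2379"   # the single APISIX etcd service, not the Patroni cluster
    prefix: "/apisix"
```
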
## Step 5 — Verify the complete file
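
After all edits, validate the YAML:

```bash
# Render the merged stack config; discard the output and report success only if it parses.
docker stack config -c docker-stack-infra.yml > /dev/null && echo "YAML valid"
```

If the command prints `YAML valid` with no error messages, the file is valid.
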
## Placement summary for prod (docker-stack-infra.yml only)

| Service | Placement |
|---------|-----------|
| swag | `node.role == manager` |
| cert-reloader | `node.role == manager` |
| vault | `node.role == manager` |
| apisix (2 replicas) | no constraint (distributed across app nodes) |
| apisix-dashboard | no constraint |
| redis | `node.role == manager` |
| rabbitmq | `node.role == manager` |
| etcd (APISIX store) | `node.role == manager` |
| prometheus | `node.role == manager` |
| grafana | `node.role == manager` |

> PostgreSQL and MongoDB are deployed in separate DB stacks on `iklim-db-*` nodes.
> See `08-prod-db-cluster-kurulum.md` for those stacks.