Environment_Infrastructure/roadmap/prod-env/03-infra-stack-changes.md
Murat ÖZDEMİR 76f87aa2f9 Integrate DB nodes into Swarm and refine prod service deployment
- Database nodes now join the Docker Swarm as workers with `role=db` labels, allowing Swarm to manage their dedicated services.
- The `docker-stack-infra.yml` has been updated for production to focus solely on application-level infrastructure components.
- Dedicated database services (PostgreSQL, MongoDB, Patroni-etcd) are now explicitly deployed in separate Swarm stacks on `iklim-db-XX` nodes.
- Standardizes node naming conventions (`iklim-app-XX`, `iklim-db-XX`) across the production roadmap documentation.
- Clarifies that the `etcd` service within `docker-stack-infra.yml` is exclusively for APISIX configuration, distinct from Patroni's etcd cluster.
2026-05-11 14:53:21 +03:00

# 03 — docker-stack-infra.yml Changes (Prod)
## Context
- **File:** `docker-stack-infra.yml` (repo root — shared between test and prod)
- All changes from `test-env/03-infra-stack-changes.md` apply here identically.
- **Additional prod-specific changes:**
  - Microservices have no constraint (distributed across app nodes by Swarm).
  - Replica counts for stateless services are increased.
- **Note:** PostgreSQL and MongoDB are **not** in `docker-stack-infra.yml` for prod. They run on
  dedicated DB nodes in separate stacks (`iklim-db` and `iklim-patroni`). See `08-prod-db-cluster-kurulum.md`.
## Step 1 — Apply all test-env changes first
Follow every step in `test-env/03-infra-stack-changes.md`:
- Add `swag` service
- Add `cert-reloader` service
- Remove published ports for vault, apisix, rabbitmq, prometheus, grafana, apisix-dashboard
- Add `swag-vl` volume
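
One way to confirm the port removals is to render the stack file and list every service that still declares a published port. The helper below is a sketch and not part of the repo: the function name `services_with_published_ports` is hypothetical, and it assumes the rendered output indents service names by two spaces and writes ports in long form with a `published:` key.

```bash
# List services that still declare published ports in a rendered stack file.
# Reads the YAML produced by `docker stack config` on stdin.
# Assumption: service names are indented by two spaces under `services:`
# and ports appear in long form with a `published:` key.
services_with_published_ports() {
  awk '
    /^services:/ { in_services = 1; next }
    /^[^ #]/     { in_services = 0 }   # any other top-level key ends the block
    in_services && /^  [^ ]+:[ ]*$/ { name = $1; sub(/:$/, "", name); next }
    in_services && /^ +(- )?published:/ && !(name in seen) { seen[name] = 1; print name }
  '
}
```

Run it as `docker stack config -c docker-stack-infra.yml | services_with_published_ports`. After Step 1, none of vault, apisix, rabbitmq, prometheus, grafana, or apisix-dashboard should appear in the output.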
## Step 2 — Pin Vault to manager node (initial prod — single instance)
Vault starts as a single instance pinned to the manager node.
Raft cluster migration is handled separately in `07-vault-raft-plan.md`.
```yaml
# Vault placement stays as:
placement:
  constraints:
    - node.role == manager
```
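After deployment, the constraint can be checked against what Swarm actually recorded for the service. This is a minimal sketch under two assumptions: the stack is deployed under the hypothetical name `iklim-infra` (so the service is `iklim-infra_vault`), and the constraint is written exactly as `node.role == manager`. The helper name `is_manager_pinned` is illustrative only.

```bash
# Succeeds when the service spec carries the `node.role == manager` constraint.
# Usage: is_manager_pinned <service-name>
is_manager_pinned() {
  docker service inspect \
    --format '{{json .Spec.TaskTemplate.Placement}}' "$1" \
    | grep -q '"node.role == manager"'
}
```

For example, `is_manager_pinned iklim-infra_vault && echo "vault is pinned to a manager"`. Substitute the real stack name for `iklim-infra`.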
## Step 3 — Increase APISIX replicas for prod
```yaml
# CHANGE in apisix service deploy block:
mode: replicated
replicas: 2 # was 1
```
APISIX is stateless (its configuration lives in etcd), so running multiple replicas is safe.
Swarm load-balances requests from SWAG across the APISIX replicas via the service VIP.
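
Once the stack is redeployed, the replica count and the spread across nodes can be inspected. The sketch below assumes a hypothetical stack name `iklim-infra` (service `iklim-infra_apisix`). The helper names `desired_replicas` and `running_nodes` are illustrative, not existing tooling.

```bash
# Desired replica count recorded in the service spec.
# Usage: desired_replicas <service-name>
desired_replicas() {
  docker service inspect --format '{{.Spec.Mode.Replicated.Replicas}}' "$1"
}

# Distinct nodes currently running tasks of the service, one per line.
# Usage: running_nodes <service-name>
running_nodes() {
  docker service ps --filter desired-state=running --format '{{.Node}}' "$1" | sort -u
}
```

With the change from Step 3 applied, `desired_replicas iklim-infra_apisix` should report 2, and `running_nodes iklim-infra_apisix` shows which nodes Swarm chose for the tasks.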
## Step 4 — etcd: single instance in docker-stack-infra.yml (APISIX config store only)
The `etcd` service in `docker-stack-infra.yml` is used exclusively by APISIX as its configuration
store. It runs as a single instance on a manager node and is separate from the etcd cluster used by
Patroni for PostgreSQL HA.
```yaml
# etcd placement stays as:
placement:
  constraints:
    - node.role == manager
```
> The 3-node etcd cluster for Patroni/PostgreSQL HA is deployed separately via `08-prod-db-cluster-kurulum.md`
> on the dedicated DB nodes. These are two independent etcd deployments with different purposes.
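
To see the two etcd deployments side by side on a running cluster, the service list can be filtered by name. This is only a sketch: `etcd_services` is a hypothetical helper, and the service names it prints depend on the stack names chosen at deploy time, which this document does not fix.

```bash
# List every Swarm service whose name contains "etcd", with its replica status.
# Must be run on a manager node.
etcd_services() {
  docker service ls --format '{{.Name}} {{.Replicas}}' | grep -i 'etcd'
}
```

The APISIX configuration store should show up as a service from the infra stack, and the Patroni etcd members as services from the DB-side stack.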
## Step 5 — Verify the complete file
After all edits, validate the YAML:
```bash
docker stack config -c docker-stack-infra.yml > /dev/null && echo "YAML valid"
```
If the command prints `YAML valid` and reports no errors, the file parses as a valid stack definition.
## Placement summary for prod (docker-stack-infra.yml only)
| Service | Placement |
|---------|-----------|
| swag | `node.role == manager` |
| cert-reloader | `node.role == manager` |
| vault | `node.role == manager` |
| apisix (2 replicas) | no constraint (distributed across app nodes) |
| apisix-dashboard | no constraint |
| redis | `node.role == manager` |
| rabbitmq | `node.role == manager` |
| etcd (APISIX store) | `node.role == manager` |
| prometheus | `node.role == manager` |
| grafana | `node.role == manager` |
> PostgreSQL and MongoDB are deployed in separate DB stacks on `iklim-db-*` nodes.
> See `08-prod-db-cluster-kurulum.md` for those stacks.
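
The table above can be compared with the live cluster by printing the placement settings of every service in the stack. This is a sketch under stated assumptions: `stack_placement` is a hypothetical helper, and the stack name passed to it (for example `iklim-infra`) is whatever name was used with `docker stack deploy`.

```bash
# Print "<service><TAB><placement JSON>" for every service in a stack.
# Usage: stack_placement <stack-name>
stack_placement() {
  docker stack services --format '{{.Name}}' "$1" | while read -r svc; do
    printf '%s\t%s\n' "$svc" \
      "$(docker service inspect --format '{{json .Spec.TaskTemplate.Placement}}' "$svc")"
  done
}
```

Services listed with `node.role == manager` in the table should show that constraint in the JSON column. Services listed as "no constraint" should show no `Constraints` entry.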