07 - Prod Ansible Bootstrap
The purpose of this phase is to prepare the prod machines created by Terraform: base Linux setup, security hardening, Docker, and Swarm. This playbook does not install DB cluster software; DB nodes do, however, join Swarm as workers.
Ansible Installation
Ansible must be installed on the control machine, meaning your own computer. No agent is installed on target servers; SSH access is enough.
Installation by Operating System
- Ubuntu / Debian:
    sudo apt update
    sudo apt install -y pipx python3-venv
    pipx ensurepath
    export PATH="$HOME/.local/bin:$PATH"
    pipx install --include-deps ansible
    pipx install ansible-lint
- Fedora / Rocky Linux / RHEL:
    sudo dnf install -y pipx python3-virtualenv
    pipx ensurepath
    export PATH="$HOME/.local/bin:$PATH"
    pipx install --include-deps ansible
    pipx install ansible-lint
- macOS (Homebrew):
    brew install ansible
- With pipx (Python), on any platform:
    pipx install --include-deps ansible
    pipx install ansible-lint
Additional Python Dependencies
passlib is required on the control machine for the password_hash filter:
pipx inject ansible passlib
If you installed with pip:
pip install passlib
Verify the Installation
Whichever method you used to install it, use the following commands to verify that the installation succeeded:
# Check the Ansible version and configuration paths
ansible --version
# Check which location the Ansible binary is running from
which -a ansible
Running Ansible Commands
All commands must be run from the ansible/prod/ directory. ansible.cfg automatically defines the inventory and roles_path.
0. Install Required Collections Once During Initial Setup
ansible-galaxy collection install -r ../requirements.yml
1. Connection Test (Ping)
ansible all -m ping
2. Run the Bootstrap Playbook
ansible-playbook prod-bootstrap.yml --ask-vault-pass
Note: The --ask-vault-pass parameter asks for the Ansible Vault password; the StorageBox password is decrypted this way.
3. Run Only a Specific Role (Tags)
ansible-playbook prod-bootstrap.yml --tags "hardening" --ask-vault-pass
Target Machines
| Host | Role |
|---|---|
| iklim-app-01 | Swarm manager + app worker |
| iklim-app-02 | Swarm manager + app worker |
| iklim-app-03 | Swarm manager + app worker |
| iklim-db-01 | Manual DB cluster node |
| iklim-db-02 | Manual DB cluster node |
| iklim-db-03 | Manual DB cluster node |
Recommended File Structure
ansible/
prod/
ansible.cfg
inventory/
generated/
prod.yml
group_vars/
all/
vars.yml
vault.yml
prod-bootstrap.yml
roles/
base/
hardening/
docker/
swarm/
node_dirs/
storagebox/
storagebox_ssh_key/
act_runner/
db_stack/
Base Role
Applied to all prod nodes:
- Package cache update
- epel-release: installed first as a separate task; fail2ban, davfs2, htop, and btop depend on this repo
- base packages, after epel-release is active:
  - curl
  - wget
  - git
  - jq
  - tar
  - unzip
  - bash-completion
  - gettext (required for envsubst in CI/CD deploy pipelines)
  - tree
  - ca-certificates
  - fail2ban
  - chrony
  - python3
  - python3-pip
  - python3-passlib (for the password_hash filter; EPEL)
  - htop (interactive process monitoring; EPEL)
  - btop (resource monitor with graphical interface; EPEL)
- timezone: Europe/Istanbul
- hostname setup
- keyboard layout: trq (Turkish Q)
- chrony/NTP active
Security Hardening Role
Applied to all prod nodes:
- SSH password auth is disabled.
- Root SSH login via password is disabled (PermitRootLogin prohibit-password); key-based root login remains active so Ansible can connect throughout the bootstrap.
- Only SSH key auth remains.
- PermitEmptyPasswords no
- MaxAuthTries 3
- fail2ban is enabled.
- Automatic security updates are enabled with dnf-automatic.
- The iklim system user is created and added to the wheel group; the password is read from vault.
- firewalld default: incoming deny (drop zone), outgoing allow.
- The SSH rule is first written as a rich rule to the drop zone, then the default zone is set to drop.
- SSH is opened only from the admin CIDR.
- DB ports are not opened publicly.
The Hetzner Cloud Firewall is considered the actual perimeter. firewalld is the second defense layer on the host.
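The SSH settings listed above can be spot-checked against the effective sshd configuration. The following is a minimal sketch, not part of the playbook: check_sshd_hardening is a hypothetical helper that reads `sshd -T` output on stdin and prints only the settings that deviate from the values in this section.

```shell
# Read `sshd -T` output on stdin and print every hardening setting that deviates.
# `sshd -T` prints lowercase keys. OpenSSH may report prohibit-password under its
# older alias without-password, so both spellings are accepted here.
check_sshd_hardening() {
  awk '
    $1 == "passwordauthentication" && $2 != "no" { print "passwordauthentication=" $2 }
    $1 == "permitemptypasswords"   && $2 != "no" { print "permitemptypasswords=" $2 }
    $1 == "maxauthtries"           && $2 != "3"  { print "maxauthtries=" $2 }
    $1 == "permitrootlogin" && $2 != "prohibit-password" && $2 != "without-password" {
      print "permitrootlogin=" $2
    }
  '
}

# On a node, as root (assumes sshd is on PATH):
#   sshd -T | check_sshd_hardening
```

No output means all four settings match; any printed line names a setting to investigate.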
Docker Role
Required on all prod nodes, both app and db. Because DB nodes join the network as Swarm Workers, Docker Engine must be installed on every machine.
Packages to install:
- docker-ce
- docker-ce-cli
- containerd.io
- docker-buildx-plugin
- docker-compose-plugin
Installation will be done through the official Docker dnf repository (https://download.docker.com/linux/rhel/docker-ce.repo).
Swarm Role
Prod Swarm will be set up with 3 managers:
- docker swarm init on iklim-app-01 (Advertise/data path addr: 10.20.10.11)
- iklim-app-02 and iklim-app-03 join as managers.
- iklim-db-01/02/03 join as workers.
- Overlay network is created: iklimco-net
- Node labels:
  - iklim-app-* -> type=service
  - iklim-db-* -> role=db, db-index=01/02/03, for Patroni node coordination
- All nodes remain AVAILABILITY=Active.
The db-index labels are added through iklim-app-01 in a separate play inside prod-bootstrap.yml, not by the swarm role.
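If a DB label ever has to be re-applied by hand, the arguments can be derived from the host name. This is an illustrative sketch only: db_label_args is a hypothetical helper, not something the playbook ships, and it assumes the iklim-db-NN naming from the target machine table. `docker node update --label-add` is the standard Docker CLI call and must be run on a manager.

```shell
# Build the `docker node update` arguments for one DB node.
# The db-index value is taken from the numeric suffix of the host name.
db_label_args() (
  host=$1
  case "$host" in
    iklim-db-[0-9][0-9]) index=${host##*-} ;;
    *) echo "not a DB node: $host" >&2; exit 1 ;;
  esac
  echo "--label-add role=db --label-add db-index=$index $host"
)

# On iklim-app-01 (word splitting of the output is intentional here):
#   docker node update $(db_label_args iklim-db-01)
```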
Node Directory Role
On all iklim-app-* nodes:
/opt/iklimco/ssl
/opt/iklimco/init
/opt/iklimco/stacks
/opt/iklimco/vault/data
/opt/iklimco/vault/data is the host path volume of the Vault Raft node; it must be created separately on every app node. Swarm does not manage this directory as an overlay volume; if it is missing, the Vault container will not start.
On DB nodes:
/opt/iklimco/db
/opt/iklimco/backup
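A quick way to confirm the directories on a node is to list whichever ones are missing. The sketch below assumes only the paths named in this section; missing_node_dirs and its optional root-prefix argument are hypothetical conveniences for dry runs, not part of the node_dirs role.

```shell
# Print every expected directory that is missing for the given role ("app" or "db").
# The optional second argument is a root prefix, useful only against a scratch tree.
missing_node_dirs() (
  role=$1
  prefix=${2:-}
  case "$role" in
    app) dirs="/opt/iklimco/ssl /opt/iklimco/init /opt/iklimco/stacks /opt/iklimco/vault/data" ;;
    db)  dirs="/opt/iklimco/db /opt/iklimco/backup" ;;
    *)   echo "unknown role: $role" >&2; exit 2 ;;
  esac
  for dir in $dirs; do
    [ -d "${prefix}${dir}" ] || echo "$dir"
  done
)

# On an app node:  missing_node_dirs app
# On a DB node:    missing_node_dirs db
```

Empty output means every expected directory is present.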
StorageBox DAVFS Mount Role
Applied to every node, all iklim-app-* and iklim-db-*.
Prod Sub-Account
| Parameter | Variable | Value |
|---|---|---|
| Main account | storagebox_account | u469968 |
| Sub-account | storagebox_user | u469968-sub5 |
| WebDAV URL | storagebox_url | https://u469968-sub5.your-storagebox.de/ |
| Mount point | storagebox_mount_point | /mnt/storagebox |
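Whether the mount point from the table is actually mounted can be checked against the kernel mounts table. This is a minimal sketch for Linux hosts; is_mounted is a hypothetical helper, and its second argument exists only so the function can be exercised against a sample file instead of /proc/mounts.

```shell
# Succeed if the given mount point appears as the second field of a mounts table.
# The table defaults to /proc/mounts (Linux).
is_mounted() (
  mount_point=$1
  table=${2:-/proc/mounts}
  awk -v mp="$mount_point" '$2 == mp { found = 1 } END { exit !found }' "$table"
)

# On a node:
#   is_mounted /mnt/storagebox && echo "StorageBox is mounted"
```

The check deliberately ignores the filesystem type column, since how a davfs2 mount is reported there is not stated in this document.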
StorageBox SSH Key Role
Applied to every node. The /root/.ssh/id_ed25519_storagebox ed25519 key pair is generated on the server. Uploading the generated public key to the StorageBox main account (SSH authorized_keys) is a separate manual step:
# For each node:
cat /root/.ssh/id_ed25519_storagebox.pub | \
ssh -p 23 STORAGEBOX_USER@STORAGEBOX_USER.your-storagebox.de \
"cat >> .ssh/authorized_keys"
Act Runner Role
Applied to iklim-app-* nodes. Gitea Act Runner is installed on each app node and started as a systemd service. In prod, the runner runs on 3 app nodes; the deploy pipeline can be triggered on any of these runners.
DB Stack Role
Applied to iklim-db-* nodes. On each DB node, it creates /opt/iklimco/db and /opt/iklimco/backup directories, as well as a local reference directory for MongoDB. The actual production configuration, including node-specific mongod.conf, replica set auth key, and Patroni configurations, is set up on StorageBox at /mnt/storagebox/db/mongodb-0X/config/ and /mnt/storagebox/db/postgresql-0X/config/ in the 08-prod-db-cluster-kurulum.md step. etcd data is stored on local Docker named volumes (not StorageBox).
/opt/iklimco/stacks/.env
Password variables required by the DB cluster stacks are stored in the /opt/iklimco/stacks/.env file. This file is stored on StorageBox as prod/secrets/iklim.co/.env.stacks. Before the first deploy, it is fetched on iklim-app-01 with the following command:
scp -P 23 STORAGEBOX_USER@STORAGEBOX_USER.your-storagebox.de:prod/secrets/iklim.co/.env.stacks \
/opt/iklimco/stacks/.env
chmod 600 /opt/iklimco/stacks/.env
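After fetching the file, it may be worth confirming that the permission bits really are 600 before any stack reads it. The sketch below is an assumption-laden example: env_file_is_private is a hypothetical helper, and it relies on GNU coreutils `stat -c`, which is what Rocky Linux and RHEL provide.

```shell
# Succeed only if the file exists and its permission bits are exactly 600.
env_file_is_private() (
  file=$1
  [ -f "$file" ] || { echo "missing: $file" >&2; exit 1; }
  mode=$(stat -c '%a' "$file") || exit 1
  [ "$mode" = "600" ] || { echo "unexpected mode $mode on $file" >&2; exit 1; }
)

# On iklim-app-01:
#   env_file_is_private /opt/iklimco/stacks/.env && echo "ok"
```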
StorageBox Directory Structure
The storagebox Ansible role creates the following directories automatically during bootstrap, driven by storagebox_managed_directories (group_vars/all/vars.yml). No manual step is required:
- /mnt/storagebox/ssl → SWAG_CERT_DIR
- /mnt/storagebox/swag/config → SWAG_CONFIG_DIR
- /mnt/storagebox/swag/site-confs → SWAG_SITE_CONFS_DIR
- /mnt/storagebox/grafana/data → GRAFANA_DATA_DIR
- /mnt/storagebox/precipitation/images
Because StorageBox is mounted at /mnt/storagebox on all app nodes, the directories are created only once and all nodes share access to them. Prometheus uses a local Docker named volume, not StorageBox.
Swarm Setup Verification
After bootstrap, check the Swarm status with the following commands:
# 6 nodes: 3 managers (Leader/Reachable), 3 workers (Ready)
docker node ls
# App node label
docker node inspect iklim-app-01 --format '{{.Spec.Labels}}'
# Expected: map[type:service]
# DB node label
docker node inspect iklim-db-01 --format '{{.Spec.Labels}}'
# Expected: map[db-index:01 role:db]
# swarm-init.sh idempotency — do not attempt init again in an already active Swarm
grep -n "swarm init\|swarm join" init/swarm-init.sh
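The node count from the first check can also be summarized mechanically. The following sketch assumes the `docker node ls --format` placeholders `.Hostname` and `.ManagerStatus`; summarize_swarm_nodes is a hypothetical helper that counts healthy managers (Leader or Reachable) and workers (empty manager status) from that output.

```shell
# Summarize `docker node ls` output read from stdin.
# Expected input: one "<hostname> <manager-status>" pair per line;
# the manager status is empty for worker nodes.
summarize_swarm_nodes() {
  awk '
    $2 == "Leader" || $2 == "Reachable" { managers++; next }
    NF == 1                              { workers++ }
    END { printf "managers=%d workers=%d\n", managers, workers }
  '
}

# On a manager node:
#   docker node ls --format '{{.Hostname}} {{.ManagerStatus}}' | summarize_swarm_nodes
```

For the layout described in this document the expected summary is managers=3 workers=3; a manager in any other state (for example Unreachable) is counted in neither group, so the totals fall short.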
Acceptance Criteria
- ansible all -m ping succeeds.
- 3 Swarm manager nodes appear as Leader/Reachable in docker node ls.
- 3 DB nodes appear as Workers in docker node ls.
- Manager quorum is provided: 3 managers, 1 loss tolerated.
- The iklimco-net overlay network exists.
- Node labels (type=service, role=db, db-index=01/02/03) are verified with inspect.
- swarm-init.sh does not attempt init again in an active Swarm; it is idempotent.
- /mnt/storagebox is mounted on every node.
- The /opt/iklimco/vault/data directory exists on every app node.
- The ssl, swag/config, swag/site-confs, grafana/data, and precipitation/images directories exist on StorageBox.
- The Gitea Act Runner service is running on every app node.
- /opt/iklimco/db and /opt/iklimco/backup directories exist on DB nodes. Node-specific mongod.conf and other DB configurations are created on StorageBox (/mnt/storagebox/db/...) in the 08-prod-db-cluster-kurulum.md step.
- Public firewall allows only 22, 80, and 443 ingress.
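The quorum criterion in the list above follows from Raft majority arithmetic: a group of N managers needs floor(N/2) + 1 reachable members and therefore tolerates floor((N-1)/2) losses. The sketch below only restates that standard formula; the function names are hypothetical and nothing here is taken from the playbook.

```shell
# Managers that must stay reachable to keep quorum: floor(N/2) + 1
swarm_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Managers that can be lost without losing quorum: floor((N-1)/2)
swarm_fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

echo "managers=3 quorum=$(swarm_quorum 3) tolerated_loss=$(swarm_fault_tolerance 3)"
# prints: managers=3 quorum=2 tolerated_loss=1
```

The same arithmetic shows why a fourth manager would not help: quorum rises to 3 while the tolerated loss stays at 1.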