Environment_Infrastructure/setup/03-test-ansible-bootstrap.md
Murat ÖZDEMİR 8780c7c05e docs(db): implement direct cluster access strategy for production
- Updated roadmap (03-infra-stack-changes.md) to deprecate database proxies in prod.
- Detailed direct subnet access via WireGuard for production developers.
- Provided multi-host connection parameters for Patroni and MongoDB Replica Sets in setup guide (08-prod-db-cluster-kurulum.md).
- Added environment comparison table to developer access guide.
2026-05-18 14:25:26 +03:00

# 03 - Test Ansible Bootstrap
The purpose of this phase is to prepare the test machines created by Terraform for Linux, hardening, Docker, and Swarm. DB software installation is outside this phase.
## Ansible Installation
Ansible must be installed on the control machine, meaning your own computer. No agent is installed on target servers; SSH access is enough.
### Installation by Operating System
- **Ubuntu / Debian:**
```bash
sudo apt update
sudo apt install -y pipx python3-venv
pipx ensurepath
export PATH="$HOME/.local/bin:$PATH"
pipx install --include-deps ansible
```
> Note: The `sudo apt install ansible` command may install old Ansible packages on some Ubuntu/Debian versions. Therefore, the `pipx` method should be preferred for using an up-to-date Ansible version.
- **Fedora / Rocky Linux / RHEL:**
```bash
sudo dnf install -y pipx python3-virtualenv
pipx ensurepath
export PATH="$HOME/.local/bin:$PATH"
pipx install --include-deps ansible
```
- **macOS (Homebrew):**
```bash
brew install ansible
```
- **With pipx, on any platform (pipx must already be installed):**
```bash
pipx install --include-deps ansible
```
### Additional Python Dependencies
`passlib` is required on the control machine for the `password_hash` filter:
```bash
pipx inject ansible passlib
```
> If you installed with `pip`: `pip install passlib`
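To illustrate where `passlib` comes into play, the sketch below hashes the vault password for the `iklim` user. The task layout is an assumption for illustration and not necessarily what the hardening role contains; only `vault_iklim_password` and the `wheel` group come from this guide.

```yaml
# Hypothetical task, shown only to illustrate the password_hash filter.
- name: Create the iklim user with a hashed password
  ansible.builtin.user:
    name: iklim
    groups: wheel
    append: true
    # The filter runs on the control machine, which is why passlib
    # is needed there and not on the target servers.
    password: "{{ vault_iklim_password | password_hash('sha512') }}"
    # Without a fixed salt the hash differs on every run; on_create
    # avoids reporting a change each time.
    update_password: on_create
```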
### Verify the Installation
Regardless of the installation method, verify the installation with the following commands:
```bash
# Check the Ansible version and configuration paths
ansible --version
# Check which location the Ansible binary is running from
which -a ansible
```
## Running Ansible Commands
All commands must be run from the `ansible/test/` directory. `ansible.cfg` automatically defines the inventory and `roles_path`.
### 0. Install Required Collections Once During Initial Setup
```bash
ansible-galaxy collection install -r ../requirements.yml
```
### 1. Connection Test (Ping)
```bash
ansible all -m ping
```
### 2. Run the Bootstrap Playbook
```bash
ansible-playbook test-bootstrap.yml --ask-vault-pass
```
*Note: The `--ask-vault-pass` parameter asks for the Ansible Vault password; the StorageBox password is decrypted this way.*
### 3. Run Only a Specific Role (Tags)
```bash
ansible-playbook test-bootstrap.yml --tags "hardening" --ask-vault-pass
```
## Target Machines
| Host | Role |
| --- | --- |
| `iklim-app-01` | Swarm manager + app worker |
| `iklim-db-01` | OS-hardened DB node for manual DB installation |
## Recommended File Structure
```text
ansible/
  test/
    ansible.cfg
    inventory/
      generated/
        test.yml
    group_vars/
      all/
        vars.yml
        vault.yml
    host_vars/
      iklim-app-01/
        vars.yml                # Host-specific variables such as floating IP
        vault.yml
      iklim-db-01/
        vault.yml
    test-bootstrap.yml
    test-app-post-stack.yml     # act_runner installation
    test-db-post-stack.yml      # db_stack + wireguard installation
  roles/
    base/
    hardening/
    docker/
    swarm/
    node_dirs/
    storagebox/
    storagebox_ssh_key/
    db_stack/                   # DB directory and configuration preparation
    wireguard/                  # WireGuard VPN service (DB node)
    act_runner/                 # Gitea act_runner installation (app node)
```
## Base Role
Applied to all test nodes:
- `dnf update`
- `epel-release`: installed first as a separate task; `fail2ban`, `davfs2`, `htop`, and `btop` depend on this repo
- base packages, after `epel-release` is active:
  - `curl`
  - `wget`
  - `git`
  - `jq`
  - `tar`
  - `unzip`
  - `bash-completion`
  - `gettext`: required for `envsubst` in CI/CD deploy pipelines
  - `tree`
  - `ca-certificates`
  - `fail2ban`
  - `chrony`
  - `python3`
  - `python3-pip`
  - `python3-passlib`: for the `password_hash` filter (EPEL)
  - `htop`: interactive process monitoring (EPEL)
  - `btop`: terminal resource monitor with graphs (EPEL)
- timezone: `Europe/Istanbul`
- hostname setup
- keyboard layout: `trq` (Turkish Q)
- controlled reboot if the system requires a reboot
- **Hetzner Floating IP systemd service** (`hetzner-floating-ip`): if `hetzner_floating_ip` is defined in `host_vars`, the IP address is added to `eth0` with `ip addr replace` and restored automatically on reboot
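The ordering described above (EPEL first, then packages, then a controlled reboot) can be sketched as follows. This is a minimal illustration, not the role's actual task file: the task names and the shortened package list are assumptions, and the reboot check assumes `needs-restarting` (from `dnf-utils`/`yum-utils`) is available on the target.

```yaml
# Sketch only: a reduced version of the base role ordering.
- name: Install the EPEL repository first
  ansible.builtin.dnf:
    name: epel-release
    state: present

- name: Install base packages (htop, btop, fail2ban come from EPEL)
  ansible.builtin.dnf:
    name:
      - curl
      - jq
      - gettext
      - chrony
      - fail2ban
      - htop
      - btop
    state: present

- name: Set the timezone
  community.general.timezone:
    name: Europe/Istanbul

# needs-restarting -r exits with 1 when a reboot is required.
- name: Check whether a reboot is required
  ansible.builtin.command: needs-restarting -r
  register: reboot_check
  changed_when: false
  failed_when: reboot_check.rc not in [0, 1]

- name: Reboot only when required
  ansible.builtin.reboot:
  when: reboot_check.rc == 1
```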
## Security Hardening Role
Applied to all test nodes:
- SSH password login is disabled.
- Root SSH login is disabled.
- Only SSH key login remains.
- `PermitEmptyPasswords no`
- `MaxAuthTries 3`
- The `fail2ban` SSH jail is enabled.
- Automatic security updates are enabled with `dnf-automatic`.
- The `iklim` system user is created and added to the `wheel` group; the password is read from vault.
- `firewalld` default:
  - incoming: deny (drop zone)
  - outgoing: allow
- The SSH rule is first written as a rich rule to the `drop` zone, and only then is the default zone set to `drop`; this ordering avoids locking out SSH access.
- Public SSH is opened only from the admin CIDR.
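The firewalld ordering above could be expressed roughly as in the sketch below. It assumes `admin_allowed_cidrs` is a list of IPv4 CIDRs and that the `ansible.posix` collection is installed; the task names are illustrative. The default zone is switched with `firewall-cmd` directly, since that is the plain CLI route for this setting.

```yaml
# Sketch: add the SSH allow rule before the default zone becomes drop.
- name: Allow SSH from admin CIDRs in the drop zone
  ansible.posix.firewalld:
    zone: drop
    rich_rule: 'rule family="ipv4" source address="{{ item }}" service name="ssh" accept'
    permanent: true
    immediate: true
    state: enabled
  loop: "{{ admin_allowed_cidrs }}"

- name: Read the current default zone
  ansible.builtin.command: firewall-cmd --get-default-zone
  register: fw_default_zone
  changed_when: false

# Runs only after the rule above exists, so SSH stays reachable.
- name: Switch the default zone to drop
  ansible.builtin.command: firewall-cmd --set-default-zone=drop
  when: (fw_default_zone.stdout | trim) != "drop"
```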
### SELinux Decision
Rocky Linux 10 ships with SELinux in enforcing mode. Decision: **disabled**.
Rationale:
- Hetzner Cloud firewall (external perimeter) + firewalld (host) provide two layers of network security.
- The Docker + davfs2 + firewalld combination requires additional policy and volume label management in SELinux enforcing mode.
- It was also disabled on the Utils VPS, so consistency is preserved.
```bash
# Inside /etc/selinux/config:
SELINUX=disabled
# The change becomes active after reboot
reboot
```
In Ansible:
```yaml
- name: Disable SELinux
  ansible.posix.selinux:
    state: disabled
  register: selinux_change

- name: Reboot if SELinux state changed
  ansible.builtin.reboot:
  when: selinux_change.changed
```
### fail2ban Configuration
Content of `/etc/fail2ban/jail.local`:
```ini
[DEFAULT]
ignoreip = 127.0.0.1/8 {{ admin_allowed_cidrs }}
bantime = 21600
findtime = 300
maxretry = 5
banaction = iptables-multiport
backend = systemd
[sshd]
enabled = true
```
- `bantime`: 21600 seconds, a 6-hour ban
- `findtime`: 300 seconds; failures are counted within a 5-minute window
- `maxretry`: 5 failed logins within `findtime` trigger a ban
- `ignoreip`: keeps admin CIDRs exempt from bans
In Ansible, the `admin_allowed_cidrs` list is converted to a space-separated string and written to the template.
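The list-to-string conversion can be sketched with the `join` filter. The variable name `fail2ban_ignoreip` and the sample CIDRs (documentation address ranges) are assumptions used only for this example.

```yaml
# Sketch: render a CIDR list as the space-separated ignoreip value.
- name: Build the ignoreip value from the CIDR list
  ansible.builtin.set_fact:
    fail2ban_ignoreip: "127.0.0.1/8 {{ admin_allowed_cidrs | join(' ') }}"
  vars:
    # Sample values for illustration; the real list lives in group_vars.
    admin_allowed_cidrs:
      - 203.0.113.10/32
      - 198.51.100.0/24

# Prints: 127.0.0.1/8 203.0.113.10/32 198.51.100.0/24
- name: Show the rendered value
  ansible.builtin.debug:
    var: fail2ban_ignoreip
```

Inside a Jinja2 template the same expression, `{{ admin_allowed_cidrs | join(' ') }}`, avoids writing the list in its Python-style representation.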
Note: Docker iptables rules may interact with firewalld. The Hetzner Cloud firewall is considered the actual external perimeter; firewalld is used as a second layer inside the host.
## Docker Role
Required on both nodes (`iklim-app-01` and `iklim-db-01`). Because the DB node will join the network as a Swarm Worker, Docker Engine must be installed on both machines.
Docker is installed through the official Docker dnf repository:
- Docker GPG key + dnf repository (`https://download.docker.com/linux/rhel/docker-ce.repo`)
- packages:
- `docker-ce`
- `docker-ce-cli`
- `containerd.io`
- `docker-buildx-plugin`
- `docker-compose-plugin`
- Docker service enabled + started
The Docker convenience script will not be used. The package repository path is preferred for a production-like test environment.
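A repository-based installation along these lines might look like the sketch below. Task names are illustrative, and the GPG key URL is assumed to be the one referenced by the `docker-ce.repo` file for the `rhel` path.

```yaml
# Sketch: install Docker Engine from the official dnf repository.
- name: Import the Docker GPG key
  ansible.builtin.rpm_key:
    key: https://download.docker.com/linux/rhel/gpg
    state: present

- name: Add the Docker CE repository
  ansible.builtin.get_url:
    url: https://download.docker.com/linux/rhel/docker-ce.repo
    dest: /etc/yum.repos.d/docker-ce.repo
    mode: "0644"

- name: Install Docker packages
  ansible.builtin.dnf:
    name:
      - docker-ce
      - docker-ce-cli
      - containerd.io
      - docker-buildx-plugin
      - docker-compose-plugin
    state: present

- name: Enable and start Docker
  ansible.builtin.systemd:
    name: docker
    enabled: true
    state: started
```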
## Swarm Role
- Initialized as Swarm Manager on `iklim-app-01`.
- Joined as Swarm Worker on `iklim-db-01`, for overlay network access.
- advertise addr: `10.10.10.11`, for the manager
- overlay network:
  - `iklimco-net`
  - driver: `overlay`
  - attachable: `true`
- Node labels:
  - `iklim-app-01`: `type=service` (all infra and application services are deployed to this node)
  - `iklim-db-01`: `role=db` (PostgreSQL and MongoDB services are deployed to this node)
- `iklim-app-01` stays both manager and worker (availability `Active`).
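The manager-side part of this role could be sketched with the `community.docker` collection as shown below. This assumes the collection is listed in `requirements.yml` and that the Docker SDK for Python is present on the target; the worker join step (which needs the manager's join token) is omitted, and the label task only succeeds once `iklim-db-01` has joined.

```yaml
# Sketch: manager-side Swarm setup; all tasks run on iklim-app-01.
- name: Initialize Swarm on the manager
  community.docker.docker_swarm:
    state: present
    advertise_addr: 10.10.10.11
  when: inventory_hostname == "iklim-app-01"

- name: Create the attachable overlay network
  community.docker.docker_network:
    name: iklimco-net
    driver: overlay
    attachable: true
  when: inventory_hostname == "iklim-app-01"

- name: Apply node labels
  community.docker.docker_node:
    hostname: "{{ item.host }}"
    labels: "{{ item.labels }}"
  loop:
    - { host: iklim-app-01, labels: { type: service } }
    - { host: iklim-db-01, labels: { role: db } }
  when: inventory_hostname == "iklim-app-01"
```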
## Node Directory Role
Deploy prerequisites on `iklim-app-01`:
```text
/opt/iklimco
/opt/iklimco/ssl
/opt/iklimco/init
/opt/iklimco/init/postgresql
/opt/iklimco/init/mongodb
/opt/iklimco/stacks
```
Minimum for manual DB installation on the DB node:
```text
/opt/iklimco
/opt/iklimco/db
/opt/iklimco/backup
```
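Both directory sets can be created with a looped `file` task per node. In this sketch the `0755` mode and the host-name conditions are assumptions; the paths are the ones listed above.

```yaml
# Sketch: create the per-node directory trees.
- name: Create deploy directories on the app node
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    mode: "0755"
  loop:
    - /opt/iklimco
    - /opt/iklimco/ssl
    - /opt/iklimco/init
    - /opt/iklimco/init/postgresql
    - /opt/iklimco/init/mongodb
    - /opt/iklimco/stacks
  when: inventory_hostname == "iklim-app-01"

- name: Create minimum directories on the DB node
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    mode: "0755"
  loop:
    - /opt/iklimco
    - /opt/iklimco/db
    - /opt/iklimco/backup
  when: inventory_hostname == "iklim-db-01"
```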
## StorageBox DAVFS Mount Role
Applied to both nodes (`iklim-app-01` and `iklim-db-01`).
### Purpose
Mounts Hetzner StorageBox as `/mnt/storagebox` through the WebDAV (DAVFS) protocol. Docker volumes are connected to this directory to provide data persistence and backups.
### Test Environment Sub-Account
| Parameter | Variable | Value |
| --- | --- | --- |
| Main account | `storagebox_account` | `u469968` |
| Sub-account | `storagebox_user` | `u469968-sub4` |
| WebDAV URL | `storagebox_url` | `https://u469968-sub4.your-storagebox.de/` |
| Mount point | `storagebox_mount_point` | `/mnt/storagebox` |
### Role Variables
All variables are defined in `group_vars/all/vars.yml`:
```yaml
storagebox_account: "u469968"
storagebox_user: "{{ storagebox_account }}-sub4"
storagebox_url: "https://{{ storagebox_user }}.your-storagebox.de/"
storagebox_password: "{{ vault_storagebox_password }}"
storagebox_mount_point: "/mnt/storagebox"
storagebox_managed_directories:
  - path: "{{ storagebox_mount_point }}/precipitation/images"
    mode: "0755"
```
In prod, the suffix changes from `sub4` to `sub5`.
Passwords are stored encrypted with Ansible Vault inside `group_vars/all/vault.yml`:
```bash
ansible-vault edit group_vars/all/vault.yml
```
Content of `vault.yml`:
```yaml
vault_storagebox_password: "SUB_ACCOUNT_PASSWORD"
vault_iklim_password: "IKLIM_USER_PASSWORD"
```
### Steps
1. **Install davfs2**
```yaml
- name: Install davfs2
  ansible.builtin.dnf:
    name: davfs2
    state: present
```
2. **Credentials file** (`/etc/davfs2/secrets`)
```yaml
- name: Configure davfs2 secrets
  ansible.builtin.lineinfile:
    path: /etc/davfs2/secrets
    line: "{{ storagebox_url }} {{ storagebox_user }} {{ storagebox_password }}"
    create: yes
    mode: "0600"
    owner: root
    group: root
```
3. **Create mount point**
```yaml
- name: Create mount point
  ansible.builtin.file:
    path: "{{ storagebox_mount_point }}"
    state: directory
    mode: "0755"
```
4. **fstab entry**
```yaml
- name: Add fstab entry
  ansible.builtin.lineinfile:
    path: /etc/fstab
    line: >-
      {{ storagebox_url }} {{ storagebox_mount_point }} davfs
      _netdev,auto,user,rw,uid=root,gid=root 0 0
    state: present
```
5. **Mount**
```yaml
- name: Mount StorageBox
  ansible.builtin.command: mount {{ storagebox_mount_point }}
  args:
    creates: "{{ storagebox_mount_point }}/.mounted_marker"
```
A marker file can be written to the directory to confirm mount success:
```yaml
- name: Write mount marker
  ansible.builtin.copy:
    content: "mounted by ansible"
    dest: "{{ storagebox_mount_point }}/.mounted_marker"
```
6. **Create service bind mount directories**
In the test environment, the precipitation service's `image-data` volume is bind mounted on the host to `/mnt/storagebox/precipitation/images`. The directory is created by Ansible after StorageBox is mounted and left with `0755` permissions.
```yaml
- name: Create managed StorageBox directories
  ansible.builtin.file:
    path: "{{ item.path }}"
    state: directory
    owner: "{{ item.owner | default(omit) }}"
    group: "{{ item.group | default(omit) }}"
    mode: "{{ item.mode | default('0755') }}"
  loop: "{{ storagebox_managed_directories | default([]) }}"
```
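As an alternative sketch, not the approach this role takes, the fstab entry and the mount (steps 4 and 5) could be handled by a single `ansible.posix.mount` task. It assumes the `ansible.posix` collection is installed and reuses the same mount options as the fstab line above.

```yaml
# Alternative sketch: state "mounted" writes the fstab entry and
# mounts the filesystem in one idempotent task.
- name: Ensure StorageBox is in fstab and mounted
  ansible.posix.mount:
    path: "{{ storagebox_mount_point }}"
    src: "{{ storagebox_url }}"
    fstype: davfs
    opts: _netdev,auto,user,rw,uid=root,gid=root
    state: mounted
```

With this variant the marker file is no longer needed to keep the mount step idempotent, although it can still serve as a quick visual check that the share is mounted.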
### Notes
- The `davfs2` package is in the EPEL repository; the base role already installs `epel-release`.
- StorageBox passwords are never added to the repository as plaintext; Ansible Vault is mandatory.
- The mount point is automatically mounted after the network is ready on reboot, thanks to the `_netdev` flag.
- Docker Swarm services use service directories under StorageBox as bind mounts.
- The precipitation service's test environment image directory must be `/mnt/storagebox/precipitation/images`; this path must exactly match the `device` value in `BE-Precipitation/docker-stack-service.yml`.
## StorageBox SSH Key Role
Applied to both nodes (`iklim-app-01` and `iklim-db-01`).
### Purpose
An ed25519 SSH key pair is generated on the server and uploaded to the StorageBox main account. This allows CI/CD pipelines to use the `STORAGEBOX_SSH_PRIV` Gitea secret for passwordless access.
### Steps
1. **SSH key generation**
```yaml
- name: Generate SSH key for StorageBox
  ansible.builtin.user:
    name: root
    generate_ssh_key: yes
    ssh_key_type: ed25519
    ssh_key_file: .ssh/id_ed25519_storagebox
    ssh_key_comment: "{{ inventory_hostname }}-storagebox"
```
2. **Upload the public key to StorageBox**
This step is done manually and requires the password the first time:
```bash
cat /root/.ssh/id_ed25519_storagebox.pub | ssh -p23 u469968-sub4@u469968-sub4.your-storagebox.de install-ssh-key
```
Later access works passwordlessly:
```bash
sftp -P23 u469968-sub4@u469968-sub4.your-storagebox.de
```
3. **Add private and public keys to Gitea**
Gitea -> Organization Settings -> Actions -> Secrets:
| Secret Name | Value |
| --- | --- |
| `STORAGEBOX_SSH_PRIV` | Contents of `/root/.ssh/id_ed25519_storagebox` |
| `STORAGEBOX_SSH_PUB` | Contents of `/root/.ssh/id_ed25519_storagebox.pub` |
To get the key contents:
```bash
cat /root/.ssh/id_ed25519_storagebox
cat /root/.ssh/id_ed25519_storagebox.pub
```
### Notes
- A separate key is generated for each server; all public keys are uploaded to the StorageBox main account.
- The private key is never committed to the repo; it is stored only as a Gitea secret.
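To collect each server's public key for the manual upload step without logging in to every node, the key could be read back with `slurp`. This is a convenience sketch and not part of the role as described; the registered variable name is an assumption. Only the public key is printed, so nothing secret reaches the Ansible output.

```yaml
# Sketch: print each node's StorageBox public key.
- name: Read the generated public key
  ansible.builtin.slurp:
    src: /root/.ssh/id_ed25519_storagebox.pub
  register: storagebox_pubkey

# slurp returns base64-encoded content, so decode before printing.
- name: Print the public key for manual upload
  ansible.builtin.debug:
    msg: "{{ storagebox_pubkey.content | b64decode | trim }}"
```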
## Acceptance Criteria