Mafyuh/iac

iac (wip)

This is my homelab infrastructure, defined in code.


| Hypervisor | OS | Tools | Networking | Misc. Automations |
| --- | --- | --- | --- | --- |
| Proxmox | Talos, Ubuntu, Arch, NixOS | Docker, Kubernetes, Renovate, OpenTofu, Packer, Ansible, Flux | Unifi | n8n, Actions |

📖 Overview

This repository contains the IaC (Infrastructure as Code) configuration for my homelab.

My homelab runs two infrastructure stacks: Kubernetes nodes provisioned with Talos Linux and defined entirely as code, and Proxmox VMs running Docker. All VMs are cloned from templates I created with Packer. I have been migrating my Ubuntu VMs to NixOS (see Nix config here), and going forward all VMs will run NixOS.

Everything is containerized, either managed with Docker Compose or orchestrated through Kubernetes. My long-term goal is to move it all to Kubernetes using GitOps practices, and the migration is ongoing. Docker Compose sticks around mainly due to hardware limitations; scaling a homelab Kubernetes cluster means buying a lot of hardware.

To automate infrastructure updates, I use GitHub Actions, which triggers workflows whenever this repo changes. This keeps deployment and maintenance automated across my homelab:

  • Flux manages Continuous Deployment (CD) for Kubernetes, deployed via Flux Operator.
  • Docker CD Workflow handles Continuous Deployment for Docker services.
  • Renovate keeps services updated by opening PRs for new versions.
  • Ansible executes playbooks on all of my VMs, automating management and configuration.

🔒 Security & Networking

For secret management, I use Bitwarden Secrets (BWS) and its integrations with the tools used here.

Kubernetes uses the External Secrets implementation of BWS rather than the official one. The BWS access key is SOPS-encrypted.

GitLeaks checks every commit for exposed secrets before it is made, and GitGuardian alerts me if something slips past GitLeaks.

Each container image is automatically scanned by Trivy, with detected vulnerabilities published to GitHub Security.

I use RackNerd for its very reasonably priced VPSes and deploy Docker services that require high uptime there. Tailscale securely connects my home network to the various VPSes using a Zero Trust architecture.

I use Cloudflare as my DNS provider, with Cloudflare Tunnels exposing some of the services to the world. Cloudflare Access provides Zero Trust access control for public websites. It is paired with Fail2Ban, which scans all my reverse proxy logs for malicious actors who made it through Access and bans them via Cloudflare WAF.
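
As a rough illustration of the log-scanning idea behind Fail2Ban (not the actual configuration in this repo), the sketch below counts 4xx responses per client IP and returns the IPs that cross a threshold. The log format, the `find_offenders` helper, and the threshold are all assumptions made for the example. Real Fail2Ban filters also use a time window, and the step that pushes a ban to Cloudflare WAF is not shown.

```python
import re
from collections import Counter

# Hypothetical log format "<ip> <status> <path>"; real reverse proxy logs differ.
LOG_LINE = re.compile(r"^(?P<ip>\S+) (?P<status>\d{3}) (?P<path>\S+)$")


def find_offenders(lines, max_failures=3):
    """Return the sorted IPs that produced at least max_failures 4xx responses."""
    failures = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("status").startswith("4"):
            failures[match.group("ip")] += 1
    return sorted(ip for ip, count in failures.items() if count >= max_failures)


# Made-up sample lines using documentation-only IP ranges.
sample_log = [
    "203.0.113.7 404 /wp-login.php",
    "203.0.113.7 403 /.env",
    "198.51.100.2 200 /",
    "203.0.113.7 404 /admin",
    "198.51.100.2 404 /favicon.ico",
]

offenders = find_offenders(sample_log)
print(offenders)
```

With the sample lines, only `203.0.113.7` reaches three failures, so it is the only candidate for a ban.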

I also utilize Unifi's IDS/IPS for intrusion detection on my home network, and use Wazuh as a SIEM to monitor and generate security alerts across all my hosts.

📊 Monitoring & Observability

I use a combination of Grafana, fluent-bit, VictoriaLogs and Prometheus with various exporters to collect and visualize system metrics, logs, and alerts. This helps maintain visibility into my infrastructure and detect issues proactively.

  • Prometheus – Metrics collection and alerting
  • VictoriaLogs – Centralized logging
  • Grafana – Dashboarding and visualization
  • Exporters – Blackbox Exporter, Speedtest Exporter, etc.

☁️ Cloud Dependencies

Although I try to self-host everything I can, my infra still relies on the cloud for certain services.

| Service | Use | Cost |
| --- | --- | --- |
| Proton | IMAP, SMTP, VPN (Pass once there is Autofill Hotkey) | ~$120/yr |
| Bitwarden | Secrets for all tools | Free |
| OneDrive | Takes backups of Proxmox VM's, Kubernetes PV's (will migrate to Proton Drive once there's proper Linux support) | Free (e5 dev) |
| Cloudflare | Domain, DNS, WAF | Free |
| GitHub | Hosting this repo and continuous integration/deployments | Free |
| RackNerd | RackNerd VPS, services such as Gotify, Vaultwarden | ~$60/yr |
| | | Total: ~$15/mo |
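
The monthly total follows from the two paid rows in the table. A minimal check of that arithmetic, using the approximate yearly figures from the table and treating free services as 0:

```python
# Approximate yearly costs in USD, copied from the table; free services are 0.
yearly_costs = {
    "Proton": 120,
    "Bitwarden": 0,
    "OneDrive": 0,
    "Cloudflare": 0,
    "GitHub": 0,
    "RackNerd": 60,
}

total_per_year = sum(yearly_costs.values())
total_per_month = total_per_year / 12

print(f"~${total_per_year}/yr, ~${total_per_month:.0f}/mo")  # ~$180/yr, ~$15/mo
```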

🧑‍💻 Getting Started

This repo is not structured like a project you can easily replicate. If you are new to any of the tools used, though, I encourage you to read through each tool's directory to see how I am using it.

Over time I will try to add more detailed instructions to each directory's README.

Some good references for how I learned this stuff (other than RTFM)

Special thank you to @chkpwd for helping me get this started. His repo was the inspiration for this.

🖥️ Hardware

Proof that you don't need expensive new equipment to run infra like mine. Almost everything here is secondhand, bought over time, for less than about $3k in total.

Servers

| Name | Device | CPU | RAM | Storage | GPU | Purpose |
| --- | --- | --- | --- | --- | --- | --- |
| Talos-1 | Optiplex 7040 Micro | Intel i5-6700t | 32GB DDR4 | 1x1TB SATA SSD 128GB NVME | Integrated | k8s control-plane |
| Talos-2 | Optiplex 7040 Micro | Intel i5-6700t | 32GB DDR4 | 1x1TB SATA SSD 128GB NVME | Integrated | k8s control-plane |
| Talos-3 | Optiplex 7040 Micro | Intel i5-6700t | 32GB DDR4 | 1x1TB SATA SSD 128GB NVME | Integrated | k8s control-plane |
| TrueNAS | Custom | AMD Ryzen 5 5500 | 32 GB DDR4 | 1TB NVMe, 4x4TB RAIDZ1 (Media), 2x4TB Mirrored (Backups) | Arc A310 | NAS + Jellyfin Server |
| PVE | Custom | AMD Ryzen 9 5950X | 64 GB DDR4 | NVMe for boot and VMs | Nvidia 1660 6GB | Main proxmox node |
| Pi | Raspberry Pi 4 | | 8GB | 1TB m.2 SATA SSD w/ USB HAT | n/a | Home Assistant Server |
| Proxmox Backup Server | Mini-PC | Intel N150 | 8GB | 2TB SATA | n/a | Backup Proxmox VM's |

Personal

| Name | Device | CPU | RAM | Storage | GPU | Purpose |
| --- | --- | --- | --- | --- | --- | --- |
| Gaming PC | Custom | Intel i7-13700k | 64GB DDR5 | 10TB NVMe | Nvidia RTX 5070 | Main Machine |
| Laptop | HP 15-eh1097nr | AMD Ryzen 7 5700U | 32GB DDR4 | 1TB NVMe | Integrated | On the go/bed machine |

Networking

| Name | Device | Purpose |
| --- | --- | --- |
| Switch | Unifi Flex 2.5Gb PoE | Switch with PoE |
| Router | Unifi Dream Router 7 | Router/Firewall |
| AP | U7 Pro XG | AP |
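
For a sense of what the TrueNAS pools provide, here is a back-of-the-envelope sketch of usable capacity for the 4x4TB RAIDZ1 and 2x4TB mirror layouts listed above. The helper names are hypothetical, and the figures are an upper bound: they ignore ZFS metadata overhead and the TB versus TiB difference.

```python
def raidz1_usable_tb(disks, disk_tb):
    """Rough usable capacity of a RAIDZ1 vdev: one disk's worth goes to parity."""
    if disks < 2:
        raise ValueError("RAIDZ1 needs at least two disks")
    return (disks - 1) * disk_tb


def mirror_usable_tb(disk_tb):
    """A mirror of equal-size disks stores one disk's worth of data."""
    return disk_tb


media_tb = raidz1_usable_tb(disks=4, disk_tb=4)  # 4x4TB RAIDZ1 (Media)
backups_tb = mirror_usable_tb(disk_tb=4)         # 2x4TB Mirrored (Backups)

print(f"Media: ~{media_tb} TB usable, Backups: ~{backups_tb} TB usable")
```

Under these assumptions the media pool gives roughly 12 TB and the backup pool roughly 4 TB, and each pool survives the loss of one disk.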

📌 To-Do

See Project Board

Contributors