Infrastructure as Code files for homelab cluster. Mirror of https://gitlab.wuhoo.xyz/jerry/homelab-iac
Starting with a minimal working Proxmox cluster, Ceph cluster, and openSUSE MicroOS cloud-init image, this IaC repo configures the following:
k3s
${HOME}/.kube/config to allow local control of the remote cluster via kubectl
kured
k3s via system-upgrade-controller
flux gitops (see FluxCD gitops section)

terraform
sops
age (optional, but please use age instead of PGP)
flux2
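Before running anything, it can help to confirm that the tools listed above are on your PATH. This is a minimal sketch: the function name `missing_tools` is hypothetical, and the binary names in the usage comment are assumed to match how each tool is installed on your system.

```shell
# Print the name of every tool passed as an argument that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Usage:
#   missing_tools terraform sops age flux kubectl
```

An empty result with a zero exit status means every listed tool was found.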
The following already-existing infrastructure is expected for the IaC to work:
Replace the relevant variables in terraform/k3s/main.tf and terraform/config.sops.yaml as applicable. Self-explanatory.
The Terraform config stores its state in a remote PostgreSQL cluster. To use your own PostgreSQL cluster, initialize Terraform (or Terragrunt) with:
terraform init
This prompts for the PostgreSQL connection string, which is stored for future use. If you do not have a PostgreSQL cluster, you can store state locally instead, but secure it: the state contains sensitive data and should never be pushed unencrypted to a git repo. To store the state locally, delete the backend "pg" {} line in the terraform block in terraform/main.tf.
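As a sketch of a non-interactive alternative to the prompt: the `pg` backend takes its connection string from the `conn_str` setting, which can be passed with `-backend-config`. The helper names below are hypothetical, and the user, host, and database in the usage comment are placeholders, not values from this repo.

```shell
# Build a postgres:// URL from its parts (all values are placeholders).
# $1=user  $2=host  $3=database
pg_conn_str() {
    printf 'postgres://%s@%s/%s\n' "$1" "$2" "$3"
}

# Initialise Terraform against your own PostgreSQL instance without the prompt.
init_with_pg() {
    terraform init -backend-config="conn_str=$(pg_conn_str "$@")"
}

# Usage (from the directory containing the Terraform config):
#   init_with_pg terraform db.example.com terraform_backend
```

The sketch deliberately leaves the password out of the URL so that it does not end up in shell history.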
The terraform/k3s/config.sops.yaml file has sensitive variables encrypted with sops. The sops encryption expects an age key at ${HOME}/.config/sops/age/keys.txt with corresponding public key age145q8qdg9ljfsl88dl3d5j9qqcq62nhev49eyqj30ssl5ryqc5vgssrmuau. If you do not have this key, you can delete the sops metadata section in that file and replace the encrypted sensitive data with unencrypted secrets. Then, re-encrypt in place with:
sops --encrypted-regex "(api.*|macaddr|username|password|connection_url)" --encrypt --in-place terraform/k3s/config.sops.yaml
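To see which keys the `--encrypted-regex` above selects, the following sketch approximates the match with `grep -E`. sops itself uses Go regular expressions matched against key names; for a simple pattern like this one the two should agree, but treat that as an assumption. The helper `would_encrypt` and the sample key names are hypothetical.

```shell
SOPS_KEY_REGEX='(api.*|macaddr|username|password|connection_url)'

# Succeeds if the pattern matches anywhere in the key name (unanchored).
would_encrypt() {
    printf '%s\n' "$1" | grep -Eq "$SOPS_KEY_REGEX"
}

# Classify a few sample key names.
for key in api_token macaddr hostname db_password; do
    if would_encrypt "$key"; then
        echo "$key: encrypted"
    else
        echo "$key: plaintext"
    fi
done
```

Because the pattern is unanchored, a name such as `db_password` matches as well, so check that no key you want left readable contains one of these substrings.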
If you're running a Ceph cluster hyperconverged on a Proxmox cluster, it's convenient to wait for Ceph initialization before attempting VM autostart:
pvenode config set --startall-onboot-delay 180
k3s cluster
$ cd terraform/k3s
$ terraform init
$ terraform apply
That's it!
Let's say you want to redeploy a node for any reason (perhaps it's unhealthy). Then:
1. Log in to the PVE UI and delete the VM.
2. Recreate and rejoin the node to the Kubernetes infrastructure:
terraform apply
3. Repeat steps 1-2 for each node in the cluster as necessary.
IT IS IMPORTANT THAT YOU RUN EACH STEP SEQUENTIALLY. DO NOT RUN IN PARALLEL.
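The one-node-at-a-time rule can be sketched as a loop that refuses to move on until the current node is back. This assumes you know the Kubernetes node name for each VM and that you delete the VM by hand in the PVE UI when prompted; `rejoin_node`, `redeploy_sequentially`, and the node names in the usage comment are hypothetical.

```shell
# Recreate one node, then block until Kubernetes reports it Ready.
rejoin_node() {
    terraform apply &&
        kubectl wait --for=condition=Ready "node/$1" --timeout=600s
}

# Process nodes strictly in order and stop at the first failure.
redeploy_sequentially() {
    for node in "$@"; do
        printf 'Delete the VM for %s in the PVE UI, then press Enter: ' "$node" >&2
        read -r reply || return 1
        rejoin_node "$node" || { echo "stopped at $node" >&2; return 1; }
    done
}

# Usage (from terraform/k3s):
#   redeploy_sequentially node-1 node-2 node-3
```

Depending on timing, the Node object may not exist yet or may still carry stale status right after `terraform apply`, so a real script would likely need a retry around the `kubectl wait` call.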
Deployments and services are managed by FluxCD, in the fluxcd directory. To use the CD managed by FluxCD, the cluster expects the correct SOPS key to be present as a secret, created as follows:
kubectl create namespace flux-system && kubectl -n flux-system create secret generic sops-age --from-file=keys.agekey=${HOME}/.config/sops/age/keys.txt
Then, the cluster can be bootstrapped with FluxCD with:
export GITLAB_TOKEN=<GITLAB_TOKEN>
flux bootstrap gitlab --owner=geraldwuhoo --repository=homelab-iac --branch=master --path=fluxcd/clusters/production --token-auth --personal
FluxCD will now automatically monitor changes to the repo and deploy them to the cluster.
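After bootstrapping, a quick sanity check can confirm that the decryption secret exists and that Flux is reconciling. This is a sketch built from standard `kubectl` and `flux` subcommands; the wrapper name `verify_flux` is hypothetical.

```shell
verify_flux() {
    # This setup expects the sops-age secret in the flux-system namespace.
    kubectl -n flux-system get secret sops-age >/dev/null || return 1
    # Check that the Flux components are installed and healthy.
    flux check || return 1
    # List Kustomizations together with their reconciliation status.
    flux get kustomizations
}

# Usage:
#   verify_flux
```

A non-zero exit status points at the first check that failed, which is usually the quickest place to start debugging.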
This FluxCD infrastructure deploys the following to on-prem production:
ceph-csi storageclass
cert-manager
Descheduler
external-dns
keel.sh
kube-vip cloud controller manager
system-upgrade-controller
velero backup software
cert-manager
system-upgrade-controller
velero
kubernetes-dashboard
send