Terraform RKE2 OpenStack

Easily deploy a high-availability RKE2 Kubernetes cluster on OpenStack providers (e.g. Infomaniak, OVH). This project aims to offer a simple and stable distribution rather than to support every possible configuration.

Inspired by and reworked from remche/terraform-openstack-rke2 to add an easier interface, high availability, load balancing and sensible defaults for running production workloads.

Features

RKE2 Kubernetes distribution: lightweight, stable, simple and secure
persisted /var/lib/rancher/rke2 when there is a single server
automated etcd snapshots with OpenStack Swift support or another S3-like backend
smooth updates & automatic removal of agent nodes with pod draining
integrated OpenStack Cloud Controller (load-balancer, etc.) and Cinder CSI
Cilium networking (network policy support and no kube-proxy)
highly available via kube-vip and dynamic peering (no load-balancer required)
out-of-the-box support for volume snapshots and Velero

Versioning

Component	Version
OpenStack	2023.1 Antelope (verified); older versions may also work
RKE2	v1.29.0+rke2r1
OpenStack Cloud Controller	v1.28.1
OpenStack Cinder	v1.28.1
Velero	v6.0.0
Kube-vip	v0.7.2

Getting started

git clone git@github.com:zifeo/terraform-openstack-rke2.git && cd terraform-openstack-rke2/examples/single-server
cat <<EOF > terraform.tfvars
project  = "PCP-XXXXXXXX"
username = "PCU-XXXXXXXX"
password = "XXXXXXXX"
EOF

terraform init
terraform apply # approx 2-3 mins
kubectl --kubeconfig single-server.rke2.yaml get nodes
# NAME           STATUS   ROLES                       AGE     VERSION
# k8s-pool-a-1   Ready    <none>                      119s    v1.21.5+rke2r2
# k8s-server-1   Ready    control-plane,etcd,master   2m22s   v1.21.5+rke2r2

# get SSH and restore helpers
terraform output -json

# on upgrade, process node pool by node pool
terraform apply -target='module.rke2.module.servers["server-a"]'
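The pool-by-pool upgrade above can be scripted. The following is only a sketch: the pool names server-a, server-b and server-c are hypothetical (use the keys of your own module configuration), and the DRY_RUN switch is an assumption of this example, not a feature of the module. With the default DRY_RUN=1 the commands are printed rather than run.

```shell
set -eu

# Build the -target address of one server pool of the rke2 module.
pool_target() {
  printf 'module.rke2.module.servers["%s"]' "$1"
}

# Run one targeted apply per pool, in the order given.
# DRY_RUN defaults to 1, so commands are only printed unless DRY_RUN=0 is set.
upgrade_pools() {
  for pool in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf "terraform apply -target='%s'\n" "$(pool_target "$pool")"
    else
      terraform apply -target="$(pool_target "$pool")"
    fi
  done
}

# Hypothetical pool names, printed only.
upgrade_pools server-a server-b server-c
```

When run with DRY_RUN=0, each terraform apply still asks for confirmation, which gives a natural pause to check that the previous pool is healthy before moving on.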

See examples for more options or this article for a step-by-step tutorial.

Note: generating the kubeconfig file locally requires rsync and yq. You can disable this behavior by setting ff_write_kubeconfig=false and fetching /etc/rancher/rke2/rke2.yaml from a server node yourself.
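If you fetch the kubeconfig yourself, the file written by RKE2 points at https://127.0.0.1:6443 and has to be redirected to an address you can reach. A minimal sketch using ssh and sed instead of the rsync and yq approach of the module; the helper name rewrite_server and the address 203.0.113.10 are placeholders, and it assumes the API is reachable on that address and that the address is among the certificate SANs:

```shell
set -eu

# Rewrite the API server address of a kubeconfig read from stdin.
# RKE2 writes https://127.0.0.1:6443 into /etc/rancher/rke2/rke2.yaml.
rewrite_server() {
  sed "s#https://127\.0\.0\.1:6443#https://$1:6443#"
}

# Demonstration on a single kubeconfig line (203.0.113.10 is a placeholder address).
printf '    server: https://127.0.0.1:6443\n' | rewrite_server 203.0.113.10

# Hypothetical usage against a real node, assuming the ubuntu user can sudo:
# ssh ubuntu@203.0.113.10 sudo cat /etc/rancher/rke2/rke2.yaml \
#   | rewrite_server 203.0.113.10 > single-server.rke2.yaml
```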

Restoring a backup

# ssh into one of the server nodes (see terraform output -json)
# remove server url from rke2 config
sudo vim /etc/rancher/rke2/config.yaml
# restore s3 snapshot (see restore_cmd output of the terraform module):
sudo systemctl stop rke2-server
sudo rke2 server --cluster-reset --etcd-s3 --etcd-s3-bucket=BUCKET_NAME --etcd-s3-access-key=ACCESS_KEY --etcd-s3-secret-key=SECRET_KEY --cluster-reset-restore-path=SNAPSHOT_PATH
sudo systemctl start rke2-server
# exit and ssh on the other server nodes to remove the etcd db
# (recall that you may need to ssh into one node as a bastion then to the others):
sudo systemctl stop rke2-server
sudo rm -rf /var/lib/rancher/rke2/server
sudo systemctl start rke2-server
# reboot all nodes one by one to make sure all is stable
sudo reboot

Infomaniak OpenStack

A stable, performant and fully equipped Kubernetes cluster in Switzerland for as little as CHF 18.—/month (at the time of writing):

1 server 2cpu/4GB (= master)
1 agent 1cpu/2GB (= worker)
1 floating IP for admin access (ssh and kubernetes api)
1 floating IP for private network gateway

Flavour	CHF/month
5.88 + 2.93 (instances) + 0.09×2×(6+8) (block storage) + 2×3.34 (IP)	18.—
1x2cpu/4go server with 1x4cpu/16Go worker	~28.—
3x2cpu/4go HA servers with 1x4cpu/16Go worker	~41.—
3x2cpu/4go HA servers with 3x4cpu/16Go workers	~76.—

You may also want to add a load-balancer and bind an additional floating IP for public access (e.g. for an ingress controller like ingress-nginx), which adds 10.00 (load-balancer) + 3.34 (IP) = CHF 13.34/month. Note that a physical load-balancer can be shared by several Kubernetes load-balancers as long as their ports do not collide.
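The arithmetic behind these figures can be checked with a short script. This sketch only reuses the unit prices quoted above (which may have changed since); the function name monthly_cost and its optional argument, the number of public load-balancers, are assumptions of this example:

```shell
set -eu

# Monthly cost in CHF for the smallest cluster, from the unit prices quoted above.
# $1 (optional): number of public load-balancers, each with its own floating IP.
monthly_cost() {
  awk -v lb="${1:-0}" 'BEGIN {
    instances = 5.88 + 2.93          # server + agent
    storage   = 0.09 * 2 * (6 + 8)   # block storage term as given in the table
    ips       = 2 * 3.34             # two floating IPs
    extra     = lb * (10.00 + 3.34)  # optional load-balancer + floating IP
    printf "%.2f\n", instances + storage + ips + extra
  }'
}

monthly_cost     # base cluster: 18.01
monthly_cost 1   # with one public load-balancer: 31.35
```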

See their technical documentation and pricing.

More on RKE2 & OpenStack

RKE2 cheat sheet

# alias already set on the nodes
crictl
kubectl (server only)

# logs
sudo systemctl status rke2-server.service
journalctl -f -u rke2-server

sudo systemctl status rke2-agent.service
journalctl -f -u rke2-agent

less /var/lib/rancher/rke2/agent/logs/kubelet.log
less /var/lib/rancher/rke2/agent/containerd/containerd.log
less /var/log/cloud-init-output.log

# check the certificate SANs
openssl s_client -connect 192.168.42.3:10250 </dev/null 2>/dev/null | openssl x509 -inform pem -text

# defrag etcd
kubectl -n kube-system exec $(kubectl -n kube-system get pod -l component=etcd --no-headers -o custom-columns=NAME:.metadata.name | head -1) -- sh -c "ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/rke2/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/rke2/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl defrag --cluster"

# increase volume size
# shutdown instance
# detach volume
# expand volume
# recreate node
terraform apply -target='module.rke2.module.servers["server"]' -replace='module.rke2.module.servers["server"].openstack_compute_instance_v2.instance[0]'

Migration guide

From v2 to v3

# 1. use the previous patch version (2.0.7) to set up an additional SAN for 192.168.42.4
# this will become the new VIP inside the cluster and replace the load-balancer:
source  = "zifeo/rke2/openstack"
version = "2.0.7"
# ...
additional_san = ["192.168.42.4"]
# 2. run a full upgrade with it, node by node:
terraform apply -target='module.rke2.module.servers["your-server-pool"]'
# 3. you can now switch to the new major and remove the additional_san:
source  = "zifeo/rke2/openstack"
version = "3.0.0"
# 4. create the new external IP for admin access (that will be different from the load-balancer) with:
terraform apply -target='module.rke2.openstack_networking_floatingip_associate_v2.fip'
# 5. pick a server different from the initial one (used to bootstrap):
terraform apply -target='module.rke2.module.servers["server-c"].openstack_networking_port_v2.port'
# 6. give that server control of the VIP
ssh ubuntu@server-c
sudo su
modprobe ip_vs
modprobe ip_vs_rr
cat <<EOF > /var/lib/rancher/rke2/agent/pod-manifests/kube-vip.yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-vip
  namespace: kube-system
spec:
  containers:
    - name: kube-vip
      image: ghcr.io/kube-vip/kube-vip:v0.7.2
      imagePullPolicy: IfNotPresent
      args:
        - manager
      env:
        - name: vip_arp
          value: "true"
        - name: port
          value: "6443"
        - name: vip_interface
          value: ens3
        - name: vip_cidr
          value: "32"
        - name: cp_enable
          value: "true"
        - name: cp_namespace
          value: kube-system
        - name: vip_ddns
          value: "false"
        - name: svc_enable
          value: "false"
        - name: vip_leaderelection
          value: "true"
        - name: vip_leasename
          value: plndr-cp-lock
        - name: vip_leaseduration
          value: "15"
        - name: vip_renewdeadline
          value: "10"
        - name: vip_retryperiod
          value: "2"
        - name: enable_node_labeling
          value: "true"
        - name: lb_enable
          value: "true"
        - name: lb_port
          value: "6443"
        - name: lb_fwdmethod
          value: local
        - name: address
          value: 192.168.42.4
        - name: prometheus_server
          value: ":2112"
      resources:
        requests:
          cpu: 100m
          memory: 64Mi
        limits:
          memory: 64Mi
      securityContext:
        capabilities:
          add:
            - NET_ADMIN
            - NET_RAW
      volumeMounts:
        - mountPath: /etc/kubernetes/admin.conf
          name: kubeconfig
  restartPolicy: Always
  hostAliases:
    - hostnames:
        - kubernetes
      ip: 127.0.0.1
  hostNetwork: true
  volumes:
    - name: kubeconfig
      hostPath:
        path: /etc/rancher/rke2/rke2.yaml
EOF
# 7. you should see a pod in kube-system starting with kube-vip (investigate if failing)
# then apply the migration to the initial/bootstrapping server:
terraform apply -target='module.rke2.module.servers["server-a"]'
terraform apply -target='module.rke2.openstack_networking_secgroup_rule_v2.outside_servers'
# 8. the cluster IP has now changed; update your kubeconfig with the new IP (look in Horizon)
# 9. import the load-balancer and its ip elsewhere if used (otherwise they will be destroyed)
cat <<EOF > lb.tf
resource "openstack_lb_loadbalancer_v2" "lb" {
  name                  = "lb"
  vip_network_id        = module.rke2.network_id
  vip_subnet_id         = module.rke2.lb_subnet_id
  lifecycle {
    ignore_changes = [
      tags
    ]
  }
}
resource "openstack_networking_floatingip_v2" "external" {
  pool    = "ext-floating1"
  port_id = openstack_lb_loadbalancer_v2.lb.vip_port_id
}
EOF
terraform state show module.rke2.openstack_lb_loadbalancer_v2.lb
terraform import openstack_lb_loadbalancer_v2.lb ID
terraform state rm module.rke2.openstack_lb_loadbalancer_v2.lb
terraform state show module.rke2.openstack_networking_floatingip_v2.external
terraform import openstack_networking_floatingip_v2.external ID
terraform state rm module.rke2.openstack_networking_floatingip_v2.external
# 10. continue by upgrading the other nodes step by step, as you normally would:
terraform apply -target='module.rke2.module.POOL["NODE"]'
# 11. once all the nodes are upgraded, make sure that everything is applied correctly:
terraform apply
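In step 9, the ID placeholder is the id attribute printed by terraform state show. A small helper can extract it instead of copying it by hand. This is a sketch, not part of the module: the name state_id is hypothetical, and it assumes uncolored output in which the resource's own id line appears before any nested block.

```shell
set -eu

# Print the value of the first `id = "..."` attribute found on stdin.
state_id() {
  awk '$1 == "id" && $2 == "=" { gsub(/"/, "", $3); print $3; exit }'
}

# Demonstration on output shaped like `terraform state show` (made-up values).
printf 'resource "openstack_lb_loadbalancer_v2" "lb" {\n    id   = "abc-123"\n    name = "lb"\n}\n' | state_id

# Hypothetical usage for the load-balancer of step 9:
# ID="$(terraform state show module.rke2.openstack_lb_loadbalancer_v2.lb | state_id)"
# terraform import openstack_lb_loadbalancer_v2.lb "$ID"
```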
