- Goals
- Components
- Prerequisites
- Warning
- The Guide
- Usage
  - install.sh
- Rebuilding a failed node
- How to build the Debian FAI CD
- Limitations
- Known Issues
- Contributing
- License
- Contact
- Acknowledgements
kaymeg is a set of scripts that combines k3s (kay), MetalLB (m), etcd (e) and GlusterFS (g) to form a simple, lightweight, cheap to build & run, bare metal, high availability Kubernetes cluster.
Most of the installation and configuration is automated, allowing for fast & repeatable provisioning in cloud and bare metal environments.
- Run on minimum specification hardware / cloud resources
- Run on a minimum of three nodes
- High availability (without the need for a dedicated load balancer)
- ✅ Debian 10.4
- ✅ k3s (compiled with GlusterFS support, leveraging k3s-glusterfs)
- ✅ External etcd (should not be needed once k3s is certified with embedded etcd support)
- ✅ GlusterFS
- ✅ NFS access to GlusterFS volumes (via Ganesha)
- ✅ Kubernetes Dashboard
- ✅ MetalLB
- Three available nodes (VMs or bare metal)
- Each node to have two disks: `/dev/sda` for the OS and programs, `/dev/sdb` for data (a quick check appears after this list)
- Debian netinst installation media
- An internet connection
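Once a node is up and reachable, a quick way to confirm it exposes the expected disk layout (a small sketch; device names can differ on some hardware) is:

```sh
# Expect two disks: sda for the OS and programs, sdb for data
lsblk -d -o NAME,SIZE,TYPE
```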
There are three steps to getting up and running:
- Base Operating System (Debian) install (this is the hardest bit!)
- Run the `install.sh` script
- Use k8s!
That's it!
Each node is based on a minimal install of Debian. If you are in a virtual or cloud environment then you may wish to consider snapshotting your node at the end of this step, in order to facilitate rapid testing cycles.
For networking, there are several options available to you, for example DHCP or static addressing (the details of which are outside the scope of this document).
Whichever way you choose to go, you must end up with:
- An IP address for each node that does not change (not necessarily static, it can still be issued dynamically)
- A DNS resolvable name
It might make sense to select some names and addresses that are easy to remember, such as `10.8.8.1`, `10.8.8.2`, `10.8.8.3`, and `k8s-server1`, `k8s-server2`, `k8s-server3`.
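If you are not managing these names in DNS or via your DHCP server, one pragmatic option (a sketch using the example names and addresses above) is to add entries to `/etc/hosts` on each node and on the machine you will run `install.sh` from:

```
# /etc/hosts entries for the example cluster
10.8.8.1  k8s-server1
10.8.8.2  k8s-server2
10.8.8.3  k8s-server3
```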
Follow these steps to set up a base installation of Debian.
Note: this procedure assumes that you will set hostname and IP address via DHCP
- Download fai-kaymeg.iso, and burn this to a USB drive or CD.
  - This is built using a fai config customised for kaymeg
- Boot the machine from the ISO, and select `Client standalone installation`
- Select Kaymeg from the FAI profile menu
- Confirm you are ok with disks being erased
- Wait for installation to complete, and confirm to reboot
The default root password is `k8s` -- make sure this is changed as soon as possible.
After Debian has booted for the first time, run the following command to export your public key to the instance:
ssh root@<server-name> "mkdir -p ~/.ssh && echo $(cat ~/.ssh/id_rsa.pub) > ~/.ssh/authorized_keys"
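If you are setting up all three nodes in one go, a small loop saves some typing, and is also a convenient point to change the default root password (a sketch, assuming the example hostnames above and an existing `~/.ssh/id_rsa.pub`):

```sh
# Push your public key to each node, then change the default root password interactively
for node in k8s-server1 k8s-server2 k8s-server3; do
  ssh root@"$node" "mkdir -p ~/.ssh && echo $(cat ~/.ssh/id_rsa.pub) > ~/.ssh/authorized_keys"
  ssh -t root@"$node" passwd
done
```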
At this stage, if you are using virtualised infrastructure, you probably want to shut down your instance and take a snapshot, as from here things are more automated.
- Clone this repo: `git clone https://github.com/cjrpriest/k3s-etcd-glusterfs-metallb`
- Execute `install.sh` (see below for usage)
That's it, you're done!
install.sh SERVER1_DETAILS SERVER2_DETAILS SERVER3_DETAILS LB_RANGE_START LB_RANGE_END
| Argument | Description | Example |
|---|---|---|
| `SERVER1_DETAILS` | DNS name and IP address of 1st server, in the format `dns_name:ip_address` | `k8s-server1:10.8.8.1` |
| `SERVER2_DETAILS` | DNS name and IP address of 2nd server, in the format `dns_name:ip_address` | `k8s-server2:10.8.8.2` |
| `SERVER3_DETAILS` | DNS name and IP address of 3rd server, in the format `dns_name:ip_address` | `k8s-server3:10.8.8.3` |
| `LB_RANGE_START` | The first IP address in the range available for the load balancer to use | `10.8.8.10` |
| `LB_RANGE_END` | The last IP address in the range available for the load balancer to use | `10.8.8.20` |
Example Usage:
./install.sh k8s-server1:10.8.8.1 k8s-server2:10.8.8.2 k8s-server3:10.8.8.3 10.8.8.10 10.8.8.20
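Once `install.sh` has finished, a quick sanity check (a sketch, assuming the example hostnames above; use `k3s kubectl` if `kubectl` is not on the path) is to confirm all three nodes have joined and the system pods are running:

```sh
# All three nodes should report Ready, and system pods should be Running
ssh root@k8s-server1 "kubectl get nodes -o wide"
ssh root@k8s-server1 "kubectl get pods --all-namespaces"
```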
What if one of the nodes has failed, and it is unrecoverable?
This procedure assumes that you wish to (re)build a node with the same name & IP address etc. There is no requirement that the node is rebuilt on the same hardware.
- Remove the failed node from the gluster cluster. From a working node, execute:
  - `gluster volume info` to verify that there are three recognised bricks in the cluster
  - `gluster volume remove-brick gv0 replica 2 <failed-node-name>:/data/brick1/gv0 force` to remove the failed node's brick from the cluster
  - `gluster volume info` to verify that there are now two recognised bricks in the cluster
  - `gluster peer status` to verify that the failed node is recognised as `Disconnected`
  - `gluster peer detach <failed-node-name>` to detach the failed node from the gluster cluster
  - `gluster peer status` to verify that the failed node is no longer present
- Install base operating system (see Install Debian & Post-install setup) on the failed node
- Run the install script with an additional parameter, which instructs it to only set up and configure that node:
./install.sh k8s-server1:10.8.8.1 k8s-server2:10.8.8.2 k8s-server3:10.8.8.3 10.8.8.10 10.8.8.20 <failed-node-name>
- Re-join the (now rebuilt) node to the gluster cluster. From a (previously) working node, execute:
  - `gluster peer probe <failed-node-name>` to re-attach the node to the cluster
  - `gluster peer status` to verify that the failed node is recognised as `Connected`
  - `gluster volume add-brick gv0 replica 3 <failed-node-name>:/data/brick1/gv0` to re-add the rebuilt node's brick to the cluster
  - `gluster volume info` to verify that there are now three recognised bricks in the cluster
- Remove the failed node from etcd. From a (previously) working node, execute:
  - `etcdctl -C http://<this-nodes-ip-address>:2379 cluster-health` to determine the member id of the failed node
  - `etcdctl -C http://<this-nodes-ip-address>:2379 member remove <id-of-failed-node>` to remove the failed node from the etcd cluster
  - `etcdctl -C http://<this-nodes-ip-address>:2379 member add <failed-node-name> http://<failed-node-ip-address>:2380` to add the (about to be rebuilt) node back to the etcd cluster
- On the failed node, execute:
  - Replace `# ETCD_INITIAL_CLUSTER_STATE="new"` with `ETCD_INITIAL_CLUSTER_STATE="existing"` in `/etc/default/etcd`, such that the node knows that it should rejoin the existing cluster (a non-interactive sketch of this edit appears after this procedure)
  - `systemctl restart etcd` to restart etcd
- From a (previously) working node, execute `etcdctl -C http://<this-nodes-ip-address>:2379 cluster-health` to verify that the cluster is in good health
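The `/etc/default/etcd` edit above is easy to get wrong by hand; here is a non-interactive sketch of that step, to be run on the rebuilt node (assuming the file still contains the commented-out default shown above):

```sh
# Switch etcd from bootstrapping a new cluster to joining the existing one, then restart it
sed -i 's/^# *ETCD_INITIAL_CLUSTER_STATE="new"/ETCD_INITIAL_CLUSTER_STATE="existing"/' /etc/default/etcd
systemctl restart etcd
```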
In a Debian-based Linux distro:
- `wget -O - https://fai-project.org/download/2BF8D9FE074BCDE4.asc | apt-key add -` to add the FAI project public key to your instance of apt
- `echo "deb http://fai-project.org/download buster koeln" > /etc/apt/sources.list.d/fai.list` to add the FAI project source
- `apt-get update` to update apt
- `aptitude install fai-quickstart` to install the minimum FAI project binaries
- `fai-mk-configspace` to create the default configuration
- `fai-make-nfsroot` to create the default NFS root data
- `rm -Rf /srv/fai/config` to delete the example config files
- `git clone https://github.com/cjrpriest/kaymeg-fai-config /srv/fai/config/` to add the Kaymeg config
- `fai-cd -C /etc/fai -M fai.iso` to build the ISO
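To get the resulting `fai.iso` onto a USB stick, something like `dd` works (a sketch; replace `/dev/sdX` with your USB device and double-check it first, as this overwrites the target):

```sh
# Write the installer image to the USB device (destructive to the target)
dd if=fai.iso of=/dev/sdX bs=4M status=progress && sync
```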
As we cannot replicate a true external load balancer, there are some limitations. Notably, they are:
- Slow / broken failover: MetalLB relies on clients to change the MAC address that they are sending traffic to once a failure occurs. This isn't completely bug-free, but should be fine in modern OSes and devices.
- Single node bottlenecking: Clients will always send all traffic for a service to one node, and MetalLB distributes this internally within the cluster. This could (theoretically) result in a network bottleneck. However, unless your use case involves each node processing data that is more than a third of the networking capacity of a single node (unlikely), you are probably going to be ok.
These limitations are described in more detail in the MetalLB documentation.
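To see MetalLB hand out an address from the configured range (and the single-IP-per-service behaviour described above), a quick hedged example using an arbitrary test image:

```sh
# Expose a test deployment through MetalLB and check which external IP it is given
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=LoadBalancer
kubectl get svc nginx   # EXTERNAL-IP should come from the LB range, e.g. 10.8.8.10-10.8.8.20
```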
- When a node is rebuilt (according to the above procedure) NFS doesn't appear to work. A reboot of the node fixes this.
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Distributed under the MIT License. See `LICENSE` for more information.
Chris Priest - @cjrpriest
Project Link: https://github.com/cjrpriest/kaymeg