A Helm chart to delete pods stuck in a pending state without image pull secrets
As noted in the OpenShift documentation, a service account must be fully provisioned with an image pull secret before pods use it to pull images from the internal cluster registry. If pods are created before the image pull secret exists, they remain stuck in an ImagePullBackOff state until they are deleted. It is recommended to verify that the service account is provisioned with secrets before creating deployments; this chart is intended for cases where that is not feasible.
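Where the recommended pre-flight check is feasible, it could look roughly like the sketch below. The function name, its arguments, and the retry defaults are hypothetical and not part of this chart; the sketch only assumes that a provisioned service account lists its secrets under imagePullSecrets.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of this chart.

# Succeeds once the service account lists at least one image pull secret.
# $1 = namespace, $2 = service account name,
# $3 = max attempts (default 30), $4 = seconds between attempts (default 2).
wait_for_pull_secret() {
  local ns="$1" sa="$2" attempts="${3:-30}" delay="${4:-2}" secrets i
  for ((i = 0; i < attempts; i++)); do
    # Prints nothing while the imagePullSecrets field is still empty or missing.
    secrets="$(kubectl get serviceaccount "$sa" --namespace "$ns" \
      -o jsonpath='{.imagePullSecrets[*].name}' 2>/dev/null || true)"
    if [ -n "$secrets" ]; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

A deployment step could then be gated on it, for example `wait_for_pull_secret my-namespace default && helm install ...`, where the namespace and service account names are placeholders.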
This chart is intended to be run during the installation of other application charts on the cluster. The chart will create RBAC resources and a single-pod deployment using the openshift4/ose-cli image. The pod runs a Bash script that uses kubectl to identify pods stuck pending without image pull secrets, and deletes them. This script repeats at an interval configurable by the waitIntervalSeconds value.
reconciler.sh is a Bash script that deletes all pods in a Pending state with no image pull secrets and repeats at a configurable interval until killed. The script has one variable, WAIT_INTERVAL, which is read from the environment with a default of 30 seconds. The pod auto-discovers its Kubeconfig from the pod environment. This script is stored as a ConfigMap in the cluster.
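The behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the chart's actual reconciler.sh: the function names, the jsonpath listing format, and the "run" argument guard are all hypothetical. Only WAIT_INTERVAL and its 30-second default come from the description.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the chart's real reconciler.sh may differ.

# Seconds between passes, read from the environment with a 30-second default.
WAIT_INTERVAL="${WAIT_INTERVAL:-30}"

# Print "namespace name [secret...]" for every Pending pod in the cluster.
list_pending_pods() {
  kubectl get pods --all-namespaces --field-selector=status.phase=Pending \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.spec.imagePullSecrets[*].name}{"\n"}{end}'
}

# Read that listing on stdin and delete pods whose secret column is empty.
delete_pods_without_pull_secrets() {
  local ns name secrets
  while read -r ns name secrets; do
    if [ -n "$name" ] && [ -z "$secrets" ]; then
      kubectl delete pod "$name" --namespace "$ns" --wait=false < /dev/null
    fi
  done
}

# One pass, then sleep, repeated until the process is killed.
reconcile_forever() {
  while true; do
    list_pending_pods | delete_pods_without_pull_secrets
    sleep "$WAIT_INTERVAL"
  done
}

# The loop starts only when "run" is passed, so the functions can be sourced.
if [ "${1:-}" = "run" ]; then
  reconcile_forever
fi
```

Splitting the listing from the deletion keeps the filter in plain shell, so a single pass can be exercised on its own before the loop is started.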
Bash was chosen so the chart would need minimal, if any, modifications for future versions of OpenShift. The oc binary is provided by the registry.redhat.io/openshift4/ose-cli container image and is pinned to a minor release of OpenShift. Future upgrades are intended to be as simple as bumping the image.tag chart value to the new minor release (e.g. v4.8 to v4.10).
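As a rough sketch, bumping the tag on an existing release might look like the helper below. The function name is hypothetical; the release name, namespace, and archive path are assumed to match the install command given in this README.

```shell
# Hypothetical helper: upgrade the release to a new ose-cli image tag.
# Release name, namespace, and archive path are assumed from the install
# command in this README; adjust them if your install differed.
bump_reconciler_image() {
  local tag="$1"
  helm upgrade pod-reconciler ./pod-pullsecret-reconciler.tgz \
    --namespace pod-reconciler \
    --reuse-values \
    --set image.tag="$tag"
}
```

For example, `bump_reconciler_image v4.10` would keep the other values of the release and change only image.tag.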
This chart creates a cluster role with permission to get and delete all pods across the cluster and binds it to the service account created by the chart.
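One way to confirm the binding after installation is to ask the API server on the service account's behalf. The sketch below is an assumption-laden check, not part of the chart: the function name is hypothetical, and the service account name must be supplied by the caller because it is not stated here.

```shell
# Hypothetical check: can the given service account get and delete pods
# in every namespace?
# $1 = namespace of the service account, $2 = service account name.
check_reconciler_rbac() {
  local subject="system:serviceaccount:$1:$2" verb
  for verb in get delete; do
    # "kubectl auth can-i" exits non-zero when the answer is "no".
    if ! kubectl auth can-i "$verb" pods --all-namespaces --as="$subject" > /dev/null; then
      echo "missing permission: $verb pods" >&2
      return 1
    fi
  done
  echo "ok: $subject can get and delete pods in all namespaces"
}
```

Impersonating the service account with --as requires that the user running the check is itself allowed to impersonate.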
Optionally, mirrorRBAC can be enabled via chart values to allow the ose-cli image to be mirrored locally and pulled without credentials. This prevents the chart from facing the same race condition it was created to mitigate.
This chart includes a single-replica deployment with a pod perpetually running the reconciler.sh script. When application charts are finished installing, the reconciler script can be stopped by uninstalling this chart from the cluster.
This chart can be installed directly from a release archive or by cloning this repo locally.
$ helm install -n pod-reconciler --create-namespace pod-reconciler ./pod-pullsecret-reconciler.tgz