This project is a catalog of configurations used to provision infrastructure, on OpenShift, that supports machine learning (ML) and artificial intelligence (AI) workloads.
The intention of this repository is to help support practical use of OpenShift for AI / ML workloads and provide a catalog of configurations / demos / workshops.
Please look at the GitOps Catalog if you only need to automate an operator install.
In this repo, look at various kustomized configs and argo apps for ideas.
For issues with oc apply -k, see the known issues section below.
- OpenShift 4.14+
Red Hat Demo Platform Options (Tested)
NOTE: The node sizes below are the recommended minimum to select for provisioning
- AWS with OpenShift Open Environment
  - 1 x Control Plane - m6a.2xlarge
  - 0 x Workers - m6a.2xlarge
- One Node OpenShift
  - 1 x Control Plane - m6a.2xlarge
- MLOps Demo: Data Science & Edge Practice
Install the OpenShift Web Terminal
The web terminal icon should appear in the top right of the OpenShift web console after you have installed the operator. Clicking this icon launches the web terminal.
NOTE: Reload the page in your browser if you do not see the icon after installing the operator.
# bootstrap the enhanced web terminal
YOLO_URL=https://raw.githubusercontent.com/redhat-na-ssa/demo-ai-gitops-catalog/main/scripts/library/term.sh
. <(curl -s "${YOLO_URL}")
term_init
NOTE: Open a new terminal to fully activate the new configuration
- Verify you are logged into your cluster using oc
- Clone this repository
NOTE: See the tools section below for more info
# verify oc login
oc whoami
# git clone this repo
git clone https://github.com/redhat-na-ssa/demo-ai-gitops-catalog
cd demo-ai-gitops-catalog
# load functions into a bash shell
. scripts/functions.sh
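The login check above can be wrapped in a small guard so later steps stop early when there is no active session. This is a minimal sketch; `require_login` is a hypothetical name and is not one of the functions provided by this repo.

```shell
# Hypothetical helper (not part of scripts/functions.sh):
# stop early when there is no active oc session.
require_login() {
  local user
  # `oc whoami` exits non-zero when the current session is not authenticated
  if user="$(oc whoami 2>/dev/null)"; then
    echo "logged in as ${user}"
  else
    echo "not logged in; run 'oc login' first" >&2
    return 1
  fi
}
```

Example use: `require_login && . scripts/functions.sh`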
Setup basic cluster config
# load functions
. scripts/functions.sh
# setup a persistent enhanced web terminal on a default cluster
apply_firmly bootstrap/install-web-terminal
# setup a default cluster w/o argocd managing it
apply_firmly clusters/default
Setup a demo
# setup a dev spaces demo w/ gpu
apply_firmly demos/devspaces-nvidia-gpu-autoscale
# setup a rhoai demo w/ gpu
apply_firmly demos/rhoai-nvidia-gpu-autoscale
Running scripts/bootstrap.sh will allow you to select common options. This is a work in progress.
This script handles configurations that are not fully declarative, require imperative steps, or require user interaction.
Various kustomized app configs and cluster configs can be applied individually.
Operator installs can be done quickly via oc, similar to the GitOps Catalog.
oc apply -k and apply_firmly can be used interchangeably in the examples below:
# setup htpasswd based login
oc apply -k components/cluster-configs/login/overlays/htpasswd
# disable self provisioner in cluster
oc apply -k components/cluster-configs/rbac/overlays/no-self-provisioner
# install minio w/ minio namespace
oc apply -k components/app-configs/minio/overlays/with-namespace
# install the nfs provisioner
oc apply -k components/app-configs/nfs-provisioner/overlays/default
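Before applying one of the overlays above, it can help to see which kinds of resources it would create. The sketch below renders a kustomization with `oc kustomize` and counts the top-level `kind:` entries; nothing is sent to the cluster. `summarize_kustomization` is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: count the resource kinds a kustomization would render.
summarize_kustomization() {
  local path="$1"
  # `oc kustomize` prints rendered manifests to stdout without applying them;
  # only top-level (unindented) `kind:` lines are counted
  oc kustomize "${path}" | grep '^kind:' | sort | uniq -c
}
```

Example use: `summarize_kustomization components/app-configs/minio/overlays/with-namespace`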
Examples with operators that require CRDs
# setup serverless w/ instance
apply_firmly components/operators/serverless-operator/aggregate/default
# setup acs with a minimal configuration
apply_firmly components/operators/rhacs-operator/aggregate/minimal
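For operators like the ones above, an alternative to retrying is to wait explicitly until a CRD is established before creating resources that depend on it. The following is a rough sketch built on `oc wait`; `wait_for_crd` is a hypothetical helper, and the CRD name in the usage line is an assumption that should be checked against your cluster with `oc get crd`.

```shell
# Hypothetical helper: block until a CRD reports the Established condition.
wait_for_crd() {
  local crd_name="$1"
  local timeout="${2:-120s}"
  # fails if the CRD does not exist yet or the timeout elapses
  oc wait --for=condition=Established --timeout="${timeout}" "crd/${crd_name}"
}
```

Example use (CRD name assumed): `wait_for_crd knativeservings.operator.knative.dev 300s`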
Common operational tasks are provided in the scripts library. You can run individual functions in a bash shell:
NOTE: These functions are available in an enhanced web terminal - see install above
# load functions
. scripts/functions.sh
get_functions
This is currently under development
# load functions
. scripts/wip/workshop_functions.sh
# setup workshop with 25 users
workshop_setup 25
oc apply -k commands may fail on the first try.
This is inherent to how Kubernetes handles custom resources (CRs): a CR can only be created after its custom resource definition (CRD) has been registered with the cluster.
The solution: re-run the command until it succeeds.
The function apply_firmly is interchangeable with oc apply -k and is similar to the following shell command:
until oc apply -k < path to kustomization.yaml >; do : ; done
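The loop above retries forever, which can hide a genuinely broken kustomization. A bounded variant might look like the sketch below. `apply_with_retry` is a hypothetical name, and this is not how apply_firmly is implemented in this repo; the attempt limit and delay are illustrative defaults.

```shell
# Hypothetical helper (not the repo's apply_firmly):
# retry `oc apply -k` up to a fixed number of attempts.
apply_with_retry() {
  local path="$1"
  local max_attempts="${2:-10}"
  local delay="${3:-5}"
  local attempt=1

  until oc apply -k "${path}"; do
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "giving up on ${path} after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # give the operator time to register its CRDs before trying again
    sleep "${delay}"
  done
}
```

Example use: `apply_with_retry components/operators/serverless-operator/aggregate/default 20 10`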
This repo is currently subject to frequent, breaking changes!
Always reference it with a commit hash or tag:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- https://github.com/redhat-na-ssa/demo-ai-gitops-catalog/components/app-configs/nvidia-gpu-verification/overlays/toleration-replicas-6?ref=v0.09
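To catch remote resources that were left unpinned, a quick text check of a kustomization file may be enough. The sketch below lists `https://` resource entries that carry no `ref=` query parameter. `find_unpinned_resources` is a hypothetical helper, and it matches lines textually rather than parsing YAML, so treat its output as a hint.

```shell
# Hypothetical helper: print remote resource entries that lack a ?ref= pin.
# Returns non-zero when no unpinned entries are found (grep semantics).
find_unpinned_resources() {
  local file="$1"
  # keep list items that point at an https URL, then drop those with ref=
  grep -E '^[[:space:]]*-[[:space:]]*https://' "${file}" | grep -v '[?&]ref='
}
```

Example use: `find_unpinned_resources kustomization.yaml && echo "pin the entries listed above"`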
The following cli tools are required:
- bash, git
- oc - Download mac, linux, windows
- kubectl (optional) - Included in oc bundle
- kustomize (optional) - Download mac, linux
NOTE: bash, git, and oc are available in the OpenShift Web Terminal
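A quick way to confirm the required tools are on your PATH is sketched below. `check_tools` is a hypothetical helper, not something this repo provides.

```shell
# Hypothetical helper: report which of the given CLI tools are missing from PATH.
check_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if ! command -v "${tool}" >/dev/null 2>&1; then
      echo "missing: ${tool}" >&2
      missing=1
    fi
  done
  # non-zero when at least one tool was not found
  return "${missing}"
}
```

Example use: `check_tools bash git oc`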
The following are used to encrypt secrets and are optional:
Please run the following before submitting a PR / commit
scripts/lint.sh