MLOps and compute acceleration with OpenShift AI

Leverage OpenShift AI (RHOAI) to create a fedora detection model and deploy it anywhere. Use MLOps to automate model training and deployment. Accelerate compute and inference with a GPU. This demo was recorded during a Red Hat EMEA Open Demo and is available here.

Table of contents

  1. Highlights
  2. Architecture
  3. Deployment
  4. Sources

Highlights

  • JupyterLab environment as a service
  • Train a fedora detection model using YOLOv5
  • Create a data science pipeline
  • Serve and consume the model
  • Accelerate compute and inference using a shared GPU
  • MLOps using git as a single source of truth to trigger training and deploy the right model version in production
  • Deploy at the edge using a Raspberry Pi and Microshift

Architecture

(Architecture diagram: full_arch)

Deployment

Three clusters are used for this demo:

  • An OpenShift Container Platform cluster with a node containing an NVIDIA GPU. The GPU is optional.
  • A Single Node OpenShift (SNO) cluster on an ARM architecture to build container images with the embedded model
  • MicroShift as the edge container orchestration platform

The OpenShift Container Platform (OCP) cluster

Prerequisites

  • An OpenShift Container Platform cluster, version 4.12 or greater
  • (Optional) A node with a GPU. A machineset example is available if your cluster is running on AWS; a sketch of its general shape follows this list. You can run this demo without a GPU.
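
If you want to add a GPU node on AWS, the snippet below shows the general shape of such a MachineSet. This is a minimal, abbreviated sketch, not the repo's actual example: every CHANGEME value, the machine set name, and the g4dn.xlarge instance type are assumptions to adapt to your cluster.

```sh
# Abbreviated MachineSet sketch for an AWS GPU worker; the repo's machineset
# example is the reference. Find your infrastructure ID with:
#   oc get infrastructure cluster -o jsonpath='{.status.infrastructureName}'
oc apply -f - <<'EOF'
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: CHANGEME-gpu-worker
  namespace: openshift-machine-api
  labels:
    machine.openshift.io/cluster-api-cluster: CHANGEME-infra-id
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: CHANGEME-infra-id
      machine.openshift.io/cluster-api-machineset: CHANGEME-gpu-worker
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: CHANGEME-infra-id
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: CHANGEME-gpu-worker
    spec:
      providerSpec:
        value:
          apiVersion: machine.openshift.io/v1beta1
          kind: AWSMachineProviderConfig
          instanceType: g4dn.xlarge   # carries a single NVIDIA T4 GPU
          ami:
            id: CHANGEME              # RHCOS AMI for your region
          placement:
            region: CHANGEME
            availabilityZone: CHANGEME
          credentialsSecret:
            name: aws-cloud-credentials
          userDataSecret:
            name: worker-user-data
          # subnet, securityGroups, iamInstanceProfile, etc. omitted for brevity
EOF
```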

Customization

Adapt this pipeline manifest to fit your environment. Search for the keyword CHANGEME.
Adapt the git webhook so that the container build on your ARM cluster is automatically triggered when the pull request is approved. Search for the keyword CHANGEME. You will need the domain of the SNO cluster, but you can change it later.
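
Before applying anything, it can help to list every value that still needs adapting. A simple way to do that, assuming the repo layout referenced throughout this README (cluster/, sno/, edge/):

```sh
# List every placeholder left to adapt across the manifests
grep -rn "CHANGEME" ./cluster ./sno ./edge
```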

Run

GPU

Wait for each installation to succeed before running the next command:

oc apply -k ./cluster/operators
# Wait for operators installation
oc apply -k ./cluster/operators-instances
# Wait for operators instances installation
oc kustomize ./cluster/instances/ --enable-helm | oc apply -f -
# Wait for instances
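
One way to check that the operators have settled before moving on is to look at the phase of their ClusterServiceVersions. This is a generic OLM check, not something specific to this repo:

```sh
# Any CSV not yet in the Succeeded phase still needs time
oc get csv -A | grep -v Succeeded
```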

Check this document to test the GPU integration. A quick smoke test is also sketched below.
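
As a sanity check, you can schedule a one-shot pod that requests a GPU and runs nvidia-smi. This is a sketch: the CUDA image tag is an assumption, and the linked document remains the reference for validating the integration.

```sh
# One-shot GPU smoke test: the pod should print the NVIDIA driver/GPU table
oc apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvcr.io/nvidia/cuda:12.2.0-base-ubi8   # assumed tag; any CUDA base image works
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
oc logs -f pod/gpu-smoke-test
```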

CPU only

Wait for each installation to succeed before running the next command:

oc apply -k ./cluster/operators/base
# Wait for operators installation
oc apply -k ./cluster/operators-instances/base
# Wait for operators instances installation
oc kustomize ./cluster/instances/ --enable-helm | oc apply -f -
# Wait for instances

Jupyter Notebooks

You have to manually provision and configure the JupyterLab environment. Follow these steps.

The Single Node Openshift (SNO)

Prerequisites

  • An OpenShift Container Platform cluster, version 4.13 or greater

Customization

Adapt this pipeline manifest to fit your environment. Search for the keyword CHANGEME.

Run

oc apply -k ./sno/operators
# Wait for the operators to complete installation
oc apply -k ./sno/instances
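
To follow the operator installation before applying the instances, one option is to watch the ClusterServiceVersions until they all report Succeeded:

```sh
# Watch operator CSVs settle (Ctrl+C to stop once everything is Succeeded)
oc get csv -A -w
```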

At the edge with MicroShift on a Raspberry Pi

We will assume that the edge device is connected to the internet and is able to pull from a specific container registry. In a disconnected / offline environment, you may want to embed your application within the OS. Deployment options are described in the documentation.

Disclaimer: Neither the operating system nor the Raspberry Pi is supported by Red Hat. For a supported setup, you need RHEL 9.2 or greater as the OS, running on supported bare-metal hardware or a supported hypervisor.

  1. OS installation: Fedora IoT is used as the OS. An installation guide is available here.
  2. MicroShift deployment: This guide can be used to deploy MicroShift once the OS is properly installed.
  3. Application deployment: Deploy the application that streams from a wired camera and embeds the model. Adapt this manifest, replacing the deployment image with the one built on your SNO cluster. Search for the keyword CHANGEME. Run oc apply -f ./edge/app.yaml on your edge device to deploy the application.

The deployment relies on the image pull policy to roll out new versions of the app: a new image is pulled every time the digest behind the tag changes. Further work could be done here to automatically update the pods on each new push to the container registry.
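
As an illustration of that pattern (a sketch, not the repo's actual edge/app.yaml; names and labels are assumptions): a mutable tag combined with imagePullPolicy: Always means a recreated pod picks up whatever digest the tag currently points to.

```sh
# Sketch of the pull-policy pattern described above; values are illustrative
oc apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: edge-detection
spec:
  replicas: 1
  selector:
    matchLabels:
      app: edge-detection
  template:
    metadata:
      labels:
        app: edge-detection
    spec:
      containers:
      - name: edge-detection
        image: quay.io/alegros/remote-camera-detection:aarch64  # mutable tag
        imagePullPolicy: Always   # re-pull the tag on every pod (re)start
EOF
# Restarting the deployment forces a fresh pull of the tag's current digest:
oc rollout restart deployment/edge-detection
```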

Sources

| Location | Description | Pull address |
| --- | --- | --- |
| browser-fedora-detection | Frontend that streams your webcam and runs inference over the gRPC protocol | quay.io/alegros/browser-fedora-detection:v1 |
| edge-detection | Flask app that connects to a device camera and uses the embedded model to run predictions on the stream locally | quay.io/alegros/remote-camera-detection:aarch64 |
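
Both images can be pulled directly from the registry, for example:

```sh
# Pull the edge app image (aarch64) using the address from the table above
podman pull quay.io/alegros/remote-camera-detection:aarch64
```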