Brent Salisbury nerdalert


Validated Backend Control Plane Prototype #31

Output from deploying: kubernetes-sigs/wg-ai-gateway#31

$> curl -s http://172.18.255.240/v1/models | jq
{
  "object": "list",
  "data": [
    {

TokenRateLimitPolicy demo output

Applies a TokenRateLimitPolicy (TRLP) to the MaaS gateway for the vSR route https://${MAAS_HOST}/v1/chat/completions.

$> export MAAS_HOST="maas.$(oc get ingresses.config.openshift.io cluster -o jsonpath='{.spec.domain}')"

  export ACCESS_TOKEN=$(curl -sSk --oauth2-bearer "$(oc whoami -t)" \
    --json '{"expiration": "10m"}' \
    "https://${MAAS_HOST}/maas-api/v1/tokens" | jq -r .token)
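
Once MAAS_HOST and ACCESS_TOKEN are exported as above, the rate-limited route can be exercised with a small loop. This is only a sketch, not part of the original demo: `probe_route`, the `CURL` override, the `MODEL` variable, and the request body (OpenAI-style chat payload) are assumptions made for illustration.

```shell
# Hypothetical helper (not from the original gist): send N chat requests
# to the route and print one HTTP status code per line.
# MODEL is a placeholder; substitute a model id served by your gateway.
# CURL can be overridden to point at a stub for dry runs.
probe_route() (
  n="${1:-5}"
  model="${MODEL:-example-model}"
  i=0
  while [ "$i" -lt "$n" ]; do
    "${CURL:-curl}" -sk -o /dev/null -w '%{http_code}\n' \
      -H "Authorization: Bearer ${ACCESS_TOKEN}" \
      -H 'Content-Type: application/json' \
      -d "{\"model\": \"${model}\", \"messages\": [{\"role\": \"user\", \"content\": \"ping\"}], \"max_tokens\": 16}" \
      "https://${MAAS_HOST}/v1/chat/completions"
    i=$((i + 1))
  done
)

# Example: probe_route 10
```

If the policy is enforced as intended, the status codes would presumably switch from 200 to a rejection code (commonly 429) once the token budget is spent; the exact code and threshold depend on the policy that was applied.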

Classifier on GPU deploy support stdout

$ ./deploy/openshift/deploy-to-openshift.sh --kserve --simulator --classifier-gpu
[SUCCESS] Logged in as cluster-admin
[INFO] Creating namespace: vllm-semantic-router-system
namespace/vllm-semantic-router-system configured
[SUCCESS] Namespace ready
[INFO] Installing KServe and LLMInferenceService CRDs...
[INFO] InferenceService CRD already installed.

vSR LLMInferenceService KServe GPU Demo

$ ./deploy/openshift/deploy-to-openshift.sh --kserve --no-observability
[SUCCESS] Logged in as cluster-admin
[INFO] Creating namespace: vllm-semantic-router-system
namespace/vllm-semantic-router-system configured
[SUCCESS] Namespace ready
[INFO] Installing KServe and LLMInferenceService CRDs...
[INFO] InferenceService CRD already installed.

vSR LLMInferenceService KServe Simulator Demo

$ ./deploy/openshift/deploy-to-openshift.sh --kserve --simulator --no-observability
[SUCCESS] Logged in as cluster-admin
[INFO] Creating namespace: vllm-semantic-router-system
namespace/vllm-semantic-router-system configured
[SUCCESS] Namespace ready
[INFO] Installing KServe and LLMInferenceService CRDs...

KIND vSR Deploy/Validation

# Create a Cluster #

$ kind create cluster --name semantic-router
Creating cluster "semantic-router" ...
 ✓ Ensuring node image (kindest/node:v1.35.0) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜

Deploy and validation stdout of vSR/KServe/multi-model sim.

$ ./deploy/openshift/deploy-to-openshift.sh --kserve --simulator
[SUCCESS] Logged in as kube:admin
[INFO] Creating namespace: vllm-semantic-router-system
namespace/vllm-semantic-router-system configured
[SUCCESS] Namespace ready
[INFO] KServe CRD missing; installing KServe dependencies...
[INFO] cert-manager CRDs already present.

stdout testing for opendatahub-io/models-as-a-service#227

brent@ip-172-31-33-128:~/tls/opendatahub-operator$ make install deploy -e VERSION=tls -e IMG='quay.io/bmajsak/opendatahub-operator:tls'
go: downloading go1.25.0 (linux/amd64)
mkdir -p /home/brent/tls/opendatahub-operator/bin
Downloading sigs.k8s.io/kustomize/kustomize/v5@v5.7.0
Downloading sigs.k8s.io/controller-tools/cmd/controller-gen@v0.17.3
/home/brent/tls/opendatahub-operator/bin/controller-gen --load-build-tags=odh rbac:roleName=controller-manager-role crd:ignoreUnexportedFields=true webhook paths="./..." output:crd:artifacts:config=config/crd/bases output:rbac:artifacts:config=config/rbac output:webhook:artifacts:config=config/webhook
/home/brent/tls/opendatahub-operator

Deploy

  • Had to run the script twice in order to chmod a new install script that breaks out the IDP install. The logic is unchanged; it just calls a different backend script.
$ export ENABLE_KEYCLOAK_IDP=true
  ./scripts/deploy-rhoai-stable.sh
## Installing prerequisites

$ ./deploy/openshift/deploy-to-openshift.sh
[SUCCESS] Logged in as kube:admin
[INFO] Creating namespace: vllm-semantic-router-system
namespace/vllm-semantic-router-system configured
[SUCCESS] Namespace ready
[INFO] Checking for llm-katan image...
[INFO] Building llm-katan image...
--> Found container image ce19342 (6 days old) from Docker Hub for "python:3.10-slim"