kind-ver.deploy.md

kind vSR (vLLM Semantic Router) Deploy/Validation
# Create a Cluster #

$ kind create cluster --name semantic-router
Creating cluster "semantic-router" ...
 ✓ Ensuring node image (kindest/node:v1.35.0) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-semantic-router"
You can now use your cluster with:

kubectl cluster-info --context kind-semantic-router

Have a nice day! 👋

# Deploy #

$ ./deploy/openshift/deploy-to-openshift.sh --kind --no-observability
[SUCCESS] Connected to cluster: kind-semantic-router
[INFO] Creating namespace: vllm-semantic-router-system
namespace/vllm-semantic-router-system created
[SUCCESS] Namespace ready
[INFO] KServe CRD not found - using standalone deployment mode
[INFO] Deploying standalone simulator pods...
deployment.apps/vllm-model-a created
deployment.apps/vllm-model-b created
service/vllm-model-a created
service/vllm-model-b created
[INFO] Waiting for simulator services to get ClusterIPs...
[SUCCESS] Got ClusterIPs: model-a=10.96.101.161, model-b=10.96.143.105
[INFO] Creating PersistentVolumeClaims...
persistentvolumeclaim/semantic-router-models created
persistentvolumeclaim/semantic-router-cache created
[SUCCESS] PVCs created
[INFO] Generating configuration...
[SUCCESS] Configuration generated
[INFO] Creating ConfigMaps...
configmap/semantic-router-config created
configmap/envoy-config created
[SUCCESS] ConfigMaps created
[INFO] Deploying semantic-router...
deployment.apps/semantic-router created
[SUCCESS] Semantic-router deployment applied
[INFO] Creating services...
service/semantic-router created
service/semantic-router-metrics created
[SUCCESS] Services created
[INFO] Waiting for deployments to be ready...
[INFO] This may take several minutes as models are downloaded...
Waiting for deployment "vllm-model-a" rollout to finish: 0 of 1 updated replicas are available...
deployment "vllm-model-a" successfully rolled out
deployment "vllm-model-b" successfully rolled out
Waiting for deployment "semantic-router" rollout to finish: 0 of 1 updated replicas are available...
deployment "semantic-router" successfully rolled out
[SUCCESS] Deployment complete!

==================================================
  Kind Deployment Summary
==================================================

Namespace: vllm-semantic-router-system

Access the services (run in a separate terminal):

  kubectl port-forward -n vllm-semantic-router-system svc/semantic-router 8080:8080 8801:8801

Then test:

  # Auto-routing (classifier picks the model)
  curl http://localhost:8801/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "auto", "messages": [{"role": "user", "content": "What is 2+2?"}]}'

  # STEM query -> routes to Model-A
  curl http://localhost:8801/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "auto", "messages": [{"role": "user", "content": "Explain quantum physics"}]}'

  # Humanities query -> routes to Model-B
  curl http://localhost:8801/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "auto", "messages": [{"role": "user", "content": "Explain the elements of a contract under common law and give a simple example."}]}'

View logs:
  kubectl logs -f deployment/semantic-router -c semantic-router -n vllm-semantic-router-system
  kubectl logs -f deployment/semantic-router -c envoy-proxy -n vllm-semantic-router-system

View status:
  kubectl get pods -n vllm-semantic-router-system
  kubectl get svc -n vllm-semantic-router-system
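
For a scripted check, the port-forward from the summary above can run in the background while a small retry loop waits for the listener on port 8801. The following is a minimal sketch, not part of the deploy script: `retry` is a hypothetical helper, and the commented usage assumes that any HTTP response from Envoy means the forward is up (`curl` without `-f` exits 0 on any HTTP status and non-zero on connection refused).

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Hypothetical helper: run COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the final failure.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage against the cluster (left commented out because it needs a live deployment):
#   kubectl port-forward -n vllm-semantic-router-system svc/semantic-router 8080:8080 8801:8801 &
#   retry 30 1 curl -s -o /dev/null http://localhost:8801/
```

The attempt count and delay (30 and 1 second in the usage comment) are arbitrary starting values, not values taken from the deploy script.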

# Validation #

$ curl http://localhost:8801/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "auto", "messages": [{"role": "user", "content": "What is 2+2?"}]}'
{"id":"chatcmpl-431952b9-f369-4cd7-b398-c09f1425c774","created":1769578653,"model":"Model-A","usage":{"prompt_tokens":6,"completion_tokens":50,"total_tokens":56},"object":"chat.completion","do_remote_decode":false,"do_remote_prefill":false,"remote_block_ids":null,"remote_engine_id":"","remote_host":"","remote_port":0,"choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Testing@, #testing 1$ ,2%,3^, [4\u0026*5], 6~, 7-_ + (8 : 9) / \\ \u003c \u003e . Today it is partially cloudy and raining. The temperature here is "}}]}

$ curl http://localhost:8801/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "auto", "messages": [{"role": "user", "content": "Explain quantum physics"}]}'
{"id":"chatcmpl-2306494d-3681-481e-82b0-9e160b36d16c","created":1769578669,"model":"Model-A","usage":{"prompt_tokens":3,"completion_tokens":45,"total_tokens":48},"object":"chat.completion","do_remote_decode":false,"do_remote_prefill":false,"remote_block_ids":null,"remote_engine_id":"","remote_host":"","remote_port":0,"choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Alas, poor Yorick! I knew him, Horatio: A fellow of infinite jest The rest is silence.  Today it is partially cloudy and raining. Testing@, #testing 1$ ,2%,3^, [4"}}]}

$ curl http://localhost:8801/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "auto", "messages": [{"role": "user", "content": "Explain the elements of a contract under common law and give a simple example."}]}'
{"id":"chatcmpl-3f759710-4064-4143-b5b2-402398fbda6b","created":1769578677,"model":"Model-B","usage":{"prompt_tokens":15,"completion_tokens":25,"total_tokens":40},"object":"chat.completion","do_remote_decode":false,"do_remote_prefill":false,"remote_block_ids":null,"remote_engine_id":"","remote_host":"","remote_port":0,"choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Today it is partially cloudy and raining. The temperature here is twenty-five degrees centigrade. Today it is partially cloudy and raining"}}]}
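
The routing decision in each response above is carried by the top-level `"model"` field (Model-A for the two STEM prompts, Model-B for the contract-law prompt). A minimal sketch for pulling that field out without `jq` follows. `routed_model` is a hypothetical helper name, and the sample is an abbreviated copy of the third response with most fields trimmed. It assumes the first `"model"` key in the body is the one of interest.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first "model" field
# in a chat-completion response read from stdin.
routed_model() {
  grep -o '"model":[[:space:]]*"[^"]*"' | head -n 1 | cut -d '"' -f 4
}

# Abbreviated copy of the third response in the transcript (fields trimmed).
sample='{"id":"chatcmpl-3f759710-4064-4143-b5b2-402398fbda6b","created":1769578677,"model":"Model-B","object":"chat.completion"}'

printf '%s\n' "$sample" | routed_model
```

Against the live service, the same function can be placed at the end of any of the `curl` commands above, for example `curl -s ... | routed_model`.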
nerdalert/kind-ver.deploy.md