Emulates a vLLM-served LLM, providing mock /v1/completions and /v1/chat/completions endpoints.
To run the emulator locally:

pip3 install -r requirements.txt
fastapi dev vllm_emulator

Then you can curl the "model" as if it were a real LLM, e.g.:
curl -skv localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "{
          \"model\": \"$LITERALLY_ANY_STRING_WORKS\",
          \"messages\": [
            {\"role\": \"user\", \"content\": \"Hi how are you?\"}],
          \"temperature\": 0,
          \"logprobs\": true,
          \"max_tokens\": 50
        }"
For the model argument, any string is accepted by the endpoint.
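Conceptually, the emulator is a small FastAPI app that ignores the model name and message content and returns a fixed, well-formed completion. The following is only a minimal sketch of such an endpoint, not the repo's actual implementation; all names and the canned reply are illustrative:

from fastapi import FastAPI
from pydantic import BaseModel
from typing import Optional
import time

app = FastAPI()

class ChatRequest(BaseModel):
    model: str                       # any string is accepted
    messages: list[dict]
    temperature: float = 1.0
    logprobs: bool = False
    max_tokens: Optional[int] = None

@app.post("/v1/chat/completions")
def chat_completions(req: ChatRequest):
    # The model name and messages are ignored; a canned reply is returned.
    # A /v1/completions handler would work analogously.
    return {
        "id": "chatcmpl-emulated",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": req.model,
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": "This is an emulated response."},
            "finish_reason": "stop",
        }],
        "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
    }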
To deploy the emulator onto an OpenShift cluster:

oc apply -f deployment.yaml
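For orientation, a manifest like deployment.yaml is assumed to look roughly like the sketch below; the service name vllm-emulator-service and port 8000 match what the LMEvalJob further down expects, while the image reference and labels are purely illustrative (the actual manifest also defines an OpenShift Route to expose the service outside the cluster):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-emulator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-emulator
  template:
    metadata:
      labels:
        app: vllm-emulator
    spec:
      containers:
        - name: vllm-emulator
          image: quay.io/example/vllm-emulator:latest   # illustrative image reference
          ports:
            - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: vllm-emulator-service
spec:
  selector:
    app: vllm-emulator
  ports:
    - port: 8000
      targetPort: 8000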
Applying the manifest creates a service and route that lm-eval can then point at, e.g. via an LMEvalJob:
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: evaljob
spec:
  model: local-completions
  taskList:
    taskNames:
      - arc_easy
  logSamples: true
  batchSize: '1'
  allowOnline: true
  allowCodeExecution: false
  outputs:
    pvcManaged:
      size: 5Gi
  modelArgs:
    - name: model
      value: emulatedModel
    - name: base_url
      value: http://vllm-emulator-service:8000/v1/completions
    - name: num_concurrent
      value: "1"
    - name: max_retries
      value: "3"
    - name: tokenized_requests
      value: "False"
    - name: tokenizer
      value: ibm-granite/granite-guardian-3.1-8b # this isn't used, but we need some valid value here
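To run the evaluation, save the LMEvalJob manifest under some filename (evaljob.yaml below is an arbitrary choice), apply it, and watch the job's status; the operator reports progress and, once complete, the evaluation results under the resource's status:

oc apply -f evaljob.yaml
oc get lmevaljob evaljob -o yaml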