0.15.0
Resources
It is now possible to configure resources in the YAML configuration file:

```yaml
type: dev-environment
python: 3.11
ide: vscode
# (Optional) Configure `gpu`, `memory`, `disk`, etc.
resources:
  gpu: 24GB
```
Supported properties include `gpu`, `cpu`, `memory`, `disk`, and `shm_size`.
If you specify memory size, you can set either an explicit size (e.g. `24GB`) or a range (e.g. `24GB..`, `24GB..80GB`, or `..80GB`).
The `gpu` property allows specifying not only memory size but also GPU names and their quantity. Examples: `A100` (one A100), `A10G,A100` (either an A10G or an A100), `A100:80GB` (one A100 with 80GB), `A100:2` (two A100s), `24GB..40GB:2` (two GPUs with between 24GB and 40GB of memory), etc.
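Putting these properties together, a `resources` section might look like the following. The `gpu` value comes from the examples above; applying range syntax to `memory` and `disk` is an illustrative assumption:

```yaml
resources:
  gpu: 24GB..40GB:2  # two GPUs with 24-40GB of memory each
  memory: 64GB..     # illustrative: at least 64GB of RAM
  disk: 100GB..      # illustrative: at least 100GB of disk
  shm_size: 16GB
```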
Authorization in services
Service endpoints now require the `Authorization` header with `"Bearer <dstack token>"`. This also applies to the OpenAI-compatible endpoints.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com",
    api_key="<dstack token>",
)

completion = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    messages=[
        {
            "role": "user",
            "content": "Compose a poem that explains the concept of recursion in programming.",
        }
    ],
)

print(completion.choices[0].message)
```
Authentication can be disabled by setting `auth` to `false` in the service configuration file.
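For example, a service configuration with authentication disabled might look like this minimal sketch (the `commands` entry is a placeholder):

```yaml
type: service
port: 8000
auth: false  # disable the Authorization header requirement
commands:
  - python app.py  # placeholder command
```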
OpenAI format in model mapping
Model mapping (required to enable the OpenAI-compatible interface) now supports `format: openai`.
For example, if you run vLLM in OpenAI mode, you can configure model mapping for it.
```yaml
type: service
python: "3.11"
env:
  - MODEL=NousResearch/Llama-2-7b-chat-hf
commands:
  - pip install vllm
  - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000
port: 8000
resources:
  gpu: 24GB
model:
  format: openai
  type: chat
  name: NousResearch/Llama-2-7b-chat-hf
```
What's changed
- Configuration resources & ranges by @Egor-S in #844
- Range.str always returns a string by @Egor-S in #845
- Add infinity example by @deep-diver in #847
- error in documentation: use --url instead of --server by @promsoft in #852
- Support authorization on the gateway by @Egor-S in #851
- Implement Kubernetes backend by @r4victor in #853
- Add gpu support for kubernetes by @r4victor in #856
- Resources parse and store by @Egor-S in #857
- Use python3.11 in generate-json-schema by @r4victor in #859
- Implement OpenAI to OpenAI adapter for gateway by @Egor-S in #860
New contributors
- @deep-diver made their first contribution in #847
- @promsoft made their first contribution in #852
Full Changelog: 0.14.0...0.15.0