Releases: dstackai/dstack
0.15.0
Resources
It is now possible to configure resources in the YAML configuration file:
type: dev-environment
python: 3.11
ide: vscode
# (Optional) Configure `gpu`, `memory`, `disk`, etc
resources:
  gpu: 24GB
Supported properties include `gpu`, `cpu`, `memory`, `disk`, and `shm_size`.
If you specify memory size, you can either specify an explicit size (e.g. `24GB`) or a range (e.g. `24GB..`, `24GB..80GB`, or `..80GB`).
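To make the range syntax concrete, here is a minimal Python sketch of how such strings could be interpreted. It is not dstack's parser: `MemoryRange` and `parse_memory` are hypothetical names, only the `GB` suffix is handled, and an explicit size is assumed to mean both bounds are equal.

```python
from typing import NamedTuple, Optional


class MemoryRange(NamedTuple):
    """Bounds in GB; None means the bound is open (hypothetical type)."""
    min_gb: Optional[float]
    max_gb: Optional[float]


def _parse_size(text: str) -> float:
    # Assumption: this sketch only understands sizes written with a "GB" suffix.
    if not text.endswith("GB"):
        raise ValueError(f"unsupported size: {text!r}")
    return float(text[:-2])


def parse_memory(spec: str) -> MemoryRange:
    """Parse '24GB', '24GB..', '24GB..80GB', or '..80GB'."""
    if ".." not in spec:
        # Assumption: an explicit size pins both bounds to the same value.
        size = _parse_size(spec)
        return MemoryRange(size, size)
    low, high = spec.split("..", 1)
    if not low and not high:
        raise ValueError("a range needs at least one bound")
    return MemoryRange(
        _parse_size(low) if low else None,
        _parse_size(high) if high else None,
    )
```

With this sketch, `parse_memory("24GB..")` yields a lower bound of 24 and no upper bound, while `parse_memory("..80GB")` yields only an upper bound.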
The `gpu` property allows specifying not only memory size but also GPU names and their quantity. Examples: `A100` (one A100), `A10G,A100` (either A10G or A100), `A100:80GB` (one A100 of 80GB), `A100:2` (two A100), `24GB..40GB:2` (two GPUs between 24GB and 40GB), etc.
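The GPU examples above can be read as colon-separated parts: names, memory, and quantity. The following Python sketch illustrates that reading. It is an assumption-laden illustration rather than dstack's implementation: `GpuSpec` and `parse_gpu` are hypothetical names, a part is treated as memory if it contains `GB`, and the quantity is assumed to default to one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GpuSpec:
    """Hypothetical container for a parsed gpu value."""
    names: List[str] = field(default_factory=list)  # empty means "any GPU model"
    memory: Optional[str] = None  # raw size or range string, e.g. "24GB..40GB"
    count: int = 1  # assumption: one GPU when no quantity is given


def parse_gpu(spec: str) -> GpuSpec:
    """Split a spec such as 'A100:80GB' or '24GB..40GB:2' into its parts."""
    result = GpuSpec()
    for token in spec.split(":"):
        if token.isdigit():
            # A purely numeric part is the quantity.
            result.count = int(token)
        elif "GB" in token:
            # Assumption: any part mentioning GB is a memory size or range.
            result.memory = token
        else:
            # Everything else is a comma-separated list of acceptable GPU names.
            result.names = token.split(",")
    return result
```

For instance, `parse_gpu("A10G,A100")` yields two acceptable names and a count of one, and `parse_gpu("24GB..40GB:2")` yields no name constraint, the memory range, and a count of two.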
Authorization in services
Service endpoints now require the `Authorization` header with `"Bearer <dstack token>"`. This also includes the OpenAI-compatible endpoints.
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com",
    api_key="<dstack token>",
)

completion = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    messages=[
        {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming."}
    ],
)

print(completion.choices[0].message)
Authentication can be disabled by setting `auth` to `false` in the service configuration file.
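For clients that do not use the OpenAI library, the token can be attached by hand. The sketch below uses only the Python standard library and assumes the token travels in the standard HTTP `Authorization` header with the `Bearer` scheme. The helper name `build_service_request` and the `/chat/completions` path are illustrative assumptions, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_service_request(url: str, token: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request carrying the bearer token."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_service_request(
    "https://gateway.example.com/chat/completions",  # hypothetical endpoint path
    "<dstack token>",
    {
        "model": "mistralai/Mistral-7B-Instruct-v0.1",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
# Passing `request` to urllib.request.urlopen() would actually send it.
```

Building the request separately from sending it keeps the header logic easy to inspect before any network traffic happens.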
OpenAI format in model mapping
Model mapping (required to enable the OpenAI-compatible interface) now supports `format: openai`.
For example, if you run vLLM using the OpenAI mode, it's possible to configure model mapping for it.
type: service
python: "3.11"
env:
  - MODEL=NousResearch/Llama-2-7b-chat-hf
commands:
  - pip install vllm
  - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000
port: 8000
resources:
  gpu: 24GB
model:
  format: openai
  type: chat
  name: NousResearch/Llama-2-7b-chat-hf
What's changed
- Configuration resources & ranges by @Egor-S in #844
- Range.__str__ always returns a string by @Egor-S in #845
- Add infinity example by @deep-diver in #847
- error in documentation: use --url instead of --server by @promsoft in #852
- Support authorization on the gateway by @Egor-S in #851
- Implement Kubernetes backend by @r4victor in #853
- Add gpu support for kubernetes by @r4victor in #856
- Resources parse and store by @Egor-S in #857
- Use python3.11 in generate-json-schema by @r4victor in #859
- Implement OpenAI to OpenAI adapter for gateway by @Egor-S in #860
New contributors
- @deep-diver made their first contribution in #847
- @promsoft made their first contribution in #852
Full Changelog: 0.14.0...0.15.0
0.14.0
OpenAI-compatible endpoints
With the latest update, we are extending the service configuration in dstack to enable you to optionally map your custom LLM to an OpenAI-compatible endpoint.
To learn more about the new feature, read our blog post on it.
What's changed
- Make gateway active by @Egor-S in #829
- Implement OpenAI streaming for TGI by @Egor-S in #833
- Make get_latest_runner_build robuster for editable installs by @Egor-S in #834
- Fix descending logs by @r4victor in #839
- Reraise Jinja2 TemplateError by @Egor-S in #840
Full Changelog: 0.13.1...0.14.0
0.13.1
Mounting repos via Python API
If you submit a task or a service via the Python API, you can now specify the `repo` argument of the `Client.runs.submit` method.
This argument accepts an instance of `dstack.api.LocalRepo` (which allows you to mount additional files to the run from a local folder), `dstack.api.RemoteRepo` (which allows you to mount additional files to the run from a remote Git repo), or `dstack.api.VirtualRepo` (which allows you to mount additional files to the run programmatically).
Here's an example:
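from dstack.api import RemoteRepo

# `client` is an existing dstack.api.Client instance
repo = RemoteRepo.from_url(
    repo_url="https://github.com/dstackai/dstack-examples",
    repo_branch="main",
)
# Initialize the repo before submitting the run
client.repos.init(repo)

run = client.runs.submit(
    configuration=...,
    repo=repo,
)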
This allows you to access the additional files in your run from the mounted repo.
More examples are now available in the API documentation.
Note that the Python API is just one possible way to manage runs. Another one is the CLI, which automatically mounts the repo in the current folder.
Bug-fixes
Among other improvements, the update fixes the issue that prevented passing custom arguments to the run using `${{ run.args }}` in the YAML configuration.
Here's an example:
type: task
python: "3.11" # (Optional) If not specified, your local version is used
commands:
- pip install -r requirements.txt
- python train.py ${{ run.args }}
Now, you can pass custom arguments to the run via `dstack run`:
```shell
dstack run . -f train.dstack.yml --gpu A100 --train_batch_size=1 --num_train_epochs=100
```
In this case, `--train_batch_size=1 --num_train_epochs=100` will be passed to `python train.py`.
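On the receiving side, a script such as train.py could read these flags with `argparse`. The sketch below is hypothetical: the release notes do not show train.py, so the function name `parse_train_args` and the default values are assumptions made for illustration.

```python
import argparse
from typing import List, Optional


def parse_train_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the custom flags forwarded to the script (hypothetical train.py)."""
    parser = argparse.ArgumentParser(description="Hypothetical train.py entry point")
    # The defaults below are placeholders, not values taken from dstack.
    parser.add_argument("--train_batch_size", type=int, default=8)
    parser.add_argument("--num_train_epochs", type=int, default=1)
    return parser.parse_args(argv)


# Simulate the arguments forwarded by the command shown above.
args = parse_train_args(["--train_batch_size=1", "--num_train_epochs=100"])
print(args.train_batch_size, args.num_train_epochs)
```

In a real script, calling `parse_train_args()` without arguments would read the flags from the command line instead of the hard-coded list.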
Contribution guide
Last but not least, we've extended our contribution guide with a new wiki page that guides you through the steps of adding a custom backend. This can be helpful if you decide to extend dstack with support for a custom backend (cloud provider).
Feel free to check out this new wiki page and share your feedback. As always, if you need help with adding custom backend support, you can ask our team for assistance.
0.13.0
Disk size
Previously, `dstack` set the disk size to `100GB` regardless of the cloud provider. Now, to accommodate larger language models and datasets, `dstack` enables setting a custom disk size using `--disk` in `dstack run` or via the `disk` property in `.dstack/profiles.yml`.
CUDA 12.1
We've upgraded the default Docker image's CUDA drivers to 12.1 (for better compatibility with modern libraries).
Mixtral 8x7B
Lastly, and most importantly, we've added an example showing how to deploy Mixtral 8x7B as a service.
0.12.4
Bug-fixes
- Resolves issues related to TensorDock.
- Enhances error handling. Previously, server errors were only visible when the debug log level was set. Now, errors appear regardless of the log level.
- Fixes `dstack.FineTuningTask`, which failed because of a missing file.
- Lastly, if you're using dstack Cloud, ensure you update to this version for compatibility.
0.12.3
Vast.ai
With `dstack 0.12.3`, you can now use `dstack` with Vast.ai, a marketplace providing GPUs from independent hosts at notably lower prices.
Configuring Vast.ai is very easy. Log into your Vast.ai account, click Account in the sidebar, and copy your API Key.
Then, go ahead and configure the backend via `~/.dstack/server/config.yml`:
projects:
- name: main
  backends:
  - type: vastai
    creds:
      type: api_key
      api_key: d75789f22f1908e0527c78a283b523dd73051c8c7d05456516fc91e9d4efd8c5
Now you can restart the server and proceed to using the CLI or API for running development environments, tasks, and services.
$ dstack run --gpu 24GB --backend vastai --max-price 0.4
 #  REGION            INSTANCE  RESOURCES                       PRICE
 1  pl-greaterpoland  6244171   16xCPU, 32GB, 1xRTX3090 (24GB)  $0.18478
 2  ee-harjumaa       6648481   16xCPU, 64GB, 1xA5000 (24GB)    $0.29583
 3  pl-greaterpoland  6244172   32xCPU, 64GB, 2xRTX3090 (24GB)  $0.36678
Continue? [y/n]:
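The effect of combining `--gpu` with `--max-price` can be pictured as filtering a list of offers and sorting what remains by price. The Python sketch below illustrates that idea with the three offers listed above plus one made-up offer that exceeds the cap. It is not dstack's selection logic: the `Offer` type, the `select_offers` function, and the reading of `--gpu 24GB` as a minimum are assumptions.

```python
from typing import List, NamedTuple


class Offer(NamedTuple):
    """Hypothetical record for one row of the offers table."""
    region: str
    instance: str
    gpu_memory_gb: int
    price: float  # in dollars, as printed in the PRICE column


def select_offers(offers: List[Offer], min_gpu_memory_gb: int, max_price: float) -> List[Offer]:
    """Keep offers that satisfy both limits, cheapest first."""
    matching = [
        offer
        for offer in offers
        if offer.gpu_memory_gb >= min_gpu_memory_gb and offer.price <= max_price
    ]
    return sorted(matching, key=lambda offer: offer.price)


offers = [
    Offer("pl-greaterpoland", "6244172", 24, 0.36678),
    Offer("ee-harjumaa", "6648481", 24, 0.29583),
    Offer("pl-greaterpoland", "6244171", 24, 0.18478),
    Offer("example-region", "0000000", 24, 0.55),  # made up; exceeds the price cap
]
selected = select_offers(offers, min_gpu_memory_gb=24, max_price=0.4)
```

Here `selected` contains the three real offers ordered by ascending price, and the made-up offer is dropped.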
0.12.2
TensorDock
With `dstack 0.12.2`, you can now access TensorDock's cloud GPUs, leveraging their highly competitive pricing.
Configuring your TensorDock account with `dstack` is very easy. Simply generate an authorization key in your TensorDock API settings and set it up in `~/.dstack/server/config.yml`:
projects:
- name: main
  backends:
  - type: tensordock
    creds:
      type: api_key
      api_key: 248e621d-9317-7494-dc1557fa5825b-98b
      api_token: FyBI3YbnFEYXdth2xqYRnQI7hiusssBC
Now you can restart the server and proceed to using the CLI or API for running development environments, tasks, and services.
dstack run . -f .dstack.yml --gpu 40GB
 Min resources  1xGPU (40GB)
 Max price      -
 Max duration   6h
 Retry policy   no
 #  REGION        INSTANCE  RESOURCES                     SPOT  PRICE
 1  unitedstates  ef483076  10xCPU, 80GB, 1xA6000 (48GB)  no    $0.6235
 2  canada        0ca177e7  10xCPU, 80GB, 1xA6000 (48GB)  no    $0.6435
 3  canada        45d0cabd  10xCPU, 80GB, 1xA6000 (48GB)  no    $0.6435
 ...
Continue? [y/n]:
0.12.0
Server configuration
Previously, the only way to configure clouds for a project was through the UI. Additionally, you had to specify not only the credentials but also set up a storage bucket for each cloud to store metadata.
Now, you can configure clouds for a project via `~/.dstack/server/config.yml`. Example:
projects:
- name: main
  backends:
  - type: aws
    creds:
      type: access_key
      access_key: AIZKISCVKUKO5AAKLAEH
      secret_key: QSbmpqJIUBn1V5U3pyM9S6lwwiu8/fOJ2dgfwFdW
Enhanced Python API
The earlier introduced Python API is now greatly refined.
Creating a `dstack` client is as easy as this:
from dstack.api import Client, ClientError

try:
    # Load the server address and token from the local dstack configuration
    client = Client.from_config()
except ClientError:
    print("Can't connect to the server")
Now, you can submit a task or a service:
from dstack.api import Task, Resources, GPU

task = Task(
    image="ghcr.io/huggingface/text-generation-inference:latest",
    env={"MODEL_ID": "TheBloke/Llama-2-13B-chat-GPTQ"},
    commands=[
        "text-generation-launcher --trust-remote-code --quantize gptq",
    ],
    ports=["80"],
)

run = client.runs.submit(
    run_name="my-awesome-run",
    configuration=task,
    resources=Resources(gpu=GPU(memory="24GB")),
)
The `dstack.api.Run` instance provides methods for various operations including attaching to the run, forwarding ports to `localhost`, retrieving status, stopping, and accessing logs. For more details, refer to the example and reference.
Other changes
- Because we've prioritized CLI and API UX over the UI, the UI is no longer bundled. Please inform us if you experience any significant inconvenience related to this.
- Gateways should now be configured using the `dstack gateway` command, and their usage requires you to specify a domain. Learn more about how to set up a gateway.
- The `dstack start` command is now `dstack server`.
- The Python API classes were moved from the `dstack` package to `dstack.api`.
Migration
Unfortunately, when upgrading to 0.12.0, there is no automatic migration for data. This means you'll need to delete `~/.dstack` and configure `dstack` from scratch.
- `pip install "dstack[all]==0.12.0"`
- Delete `~/.dstack`
- Configure clouds via `~/.dstack/server/config.yml` (see the new guide)
- Run `dstack server`
0.11.3
Full Changelog: 0.11.2...0.11.3
0.11.1
Default gateway
Previously, to run a service, you had to create a gateway using the `dstack gateway create` command and pass its address via the `gateway` property in the service configuration file.
Now, you don't need to use the `gateway` property anymore, as you can create a gateway via the UI and mark it as default.
Gateway domain
Once the gateway is created (and assigned an external IP), you can set up an A record with your DNS provider to map `*.<your domain name>` to the gateway's IP and specify this wildcard domain in the gateway's settings.
If a wildcard domain is configured, dstack automatically enables HTTPS and runs services at `https://<run name>.<your domain name>`.
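As a small illustration of the naming scheme, the Python sketch below derives the service address from a run name and a wildcard domain. The helper `service_url` is a hypothetical name used only for this example and is not part of dstack.

```python
def service_url(run_name: str, wildcard_domain: str) -> str:
    """Return https://<run name>.<domain> for a domain such as '*.example.com'."""
    # Accept the domain with or without the leading wildcard label.
    if wildcard_domain.startswith("*."):
        domain = wildcard_domain[2:]
    else:
        domain = wildcard_domain
    return f"https://{run_name}.{domain}"
```

For example, a run named `my-service` under the wildcard domain `*.example.com` maps to `https://my-service.example.com`.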
Retry policy
In other news, the update fixes a few bugs with the `--retry-limit` argument in `dstack run`. Now, it works again, allowing you to schedule tasks even if there is no required capacity at the moment.
Last but not least, we've updated the entire documentation and examples.