@@ -142,8 +134,6 @@ $ dstack init
A service allows you to deploy a model or any web app as an endpoint.
- #### Define a configuration
-
Create the following configuration file inside the repo:
- #### Run the configuration
-
Run the configuration via [`dstack apply`](reference/cli/dstack/apply.md):
@@ -216,9 +204,7 @@ and runs the configuration.
Something not working? See the [troubleshooting](guides/troubleshooting.md) guide.
-## What's next?
-
-1. Read about [dev environments](dev-environments.md), [tasks](tasks.md), [services](services.md),
- and [repos](concepts/repos.md)
-2. Join [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd)
-3. Browse [examples](https://dstack.ai/examples)
+!!! info "What's next?"
+ 1. Read about [backends](concepts/backends.md), [dev environments](concepts/dev-environments.md), [tasks](concepts/tasks.md), and [services](concepts/services.md)
+ 2. Join [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd)
+ 3. Browse [examples](https://dstack.ai/examples)
diff --git a/docs/docs/reference/dstack.yml/dev-environment.md b/docs/docs/reference/dstack.yml/dev-environment.md
index 00c8a6bc4..2bcd9794c 100644
--- a/docs/docs/reference/dstack.yml/dev-environment.md
+++ b/docs/docs/reference/dstack.yml/dev-environment.md
@@ -1,272 +1,6 @@
-# dev-environment
+# `dev-environment`
-The `dev-environment` configuration type allows running [dev environments](../../dev-environments.md).
-
-> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
-> (e.g. `.dstack.yml` or `dev.dstack.yml` are both acceptable).
-> Any configuration can be run via [`dstack apply`](../cli/dstack/apply.md).
-
-## Examples
-
-### Python version
-
-If you don't specify `image`, `dstack` uses its base Docker image pre-configured with
-`python`, `pip`, `conda` (Miniforge), and essential CUDA drivers.
-The `python` property determines which default Docker image is used.
-
-
-
-```yaml
-type: dev-environment
-# The name is optional, if not specified, generated randomly
-name: vscode
-
-# If `image` is not specified, dstack uses its base image
-python: "3.10"
-
-ide: vscode
-```
-
-
-
-??? info "nvcc"
- By default, the base Docker image doesn’t include `nvcc`, which is required for building custom CUDA kernels.
-    If you need `nvcc`, set the `nvcc` property to `true`.
-
- ```yaml
- type: dev-environment
- # The name is optional, if not specified, generated randomly
- name: vscode
-
- # If `image` is not specified, dstack uses its base image
- python: "3.10"
- # Ensure nvcc is installed (req. for Flash Attention)
- nvcc: true
-
- ide: vscode
- ```
-
-### Docker
-
-If you want, you can specify your own Docker image via `image`.
-
-
-
-```yaml
-type: dev-environment
-# The name is optional, if not specified, generated randomly
-name: vscode
-
-# Any custom Docker image
-image: ghcr.io/huggingface/text-generation-inference:latest
-
-ide: vscode
-```
-
-
-
-??? info "Private registry"
-
- Use the `registry_auth` property to provide credentials for a private Docker registry.
-
- ```yaml
- type: dev-environment
- # The name is optional, if not specified, generated randomly
- name: vscode
-
- # Any private Docker image
- image: ghcr.io/huggingface/text-generation-inference:latest
- # Credentials of the private Docker registry
- registry_auth:
- username: peterschmidt85
- password: ghp_e49HcZ9oYwBzUbcSk2080gXZOU2hiT9AeSR5
-
- ide: vscode
- ```
-
-!!! info "Docker and Docker Compose"
- All backends except `runpod`, `vastai`, and `kubernetes` also allow using [Docker and Docker Compose](../../guides/protips.md#docker-and-docker-compose) inside `dstack` runs.
-
-### Resources { #_resources }
-
-When you specify a resource value like `cpu` or `memory`,
-you can either use an exact value (e.g. `24GB`) or a
-range (e.g. `24GB..`, or `24GB..80GB`, or `..80GB`).
-
-
-
-```yaml
-type: dev-environment
-# The name is optional, if not specified, generated randomly
-name: vscode
-
-ide: vscode
-
-resources:
- # 200GB or more RAM
- memory: 200GB..
- # 4 GPUs from 40GB to 80GB
- gpu: 40GB..80GB:4
- # Shared memory (required by multi-gpu)
- shm_size: 16GB
- # Disk size
- disk: 500GB
-```
-
-
-
-The `gpu` property allows specifying not only memory size but also GPU vendor, names
-and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10G,A100` (either A10G or A100),
-`A100:80GB` (one A100 of 80GB), `A100:2` (two A100), `24GB..40GB:2` (two GPUs between 24GB and 40GB),
-`A100:40GB:2` (two A100 GPUs of 40GB).
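-
-For instance, a sketch requesting two 40GB A100s, combining name, memory, and count in one spec:
-
-```yaml
-resources:
-  # Two A100 GPUs with 40GB of vRAM each
-  gpu: A100:40GB:2
-```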
-
-??? info "Google Cloud TPU"
-    To use TPUs, specify the TPU architecture via the `gpu` property.
-
- ```yaml
- type: dev-environment
- # The name is optional, if not specified, generated randomly
- name: vscode
-
- ide: vscode
-
- resources:
- gpu: v2-8
- ```
-
- Currently, only 8 TPU cores can be specified, supporting single TPU device workloads. Multi-TPU support is coming soon.
-
-??? info "Shared memory"
- If you are using parallel communicating processes (e.g., dataloaders in PyTorch), you may need to configure
- `shm_size`, e.g. set it to `16GB`.
-
-### Environment variables
-
-
-
-```yaml
-type: dev-environment
-# The name is optional, if not specified, generated randomly
-name: vscode
-
-# Environment variables
-env:
- - HF_TOKEN
- - HF_HUB_ENABLE_HF_TRANSFER=1
-
-ide: vscode
-```
-
-
-
-If you don't assign a value to an environment variable (see `HF_TOKEN` above),
-`dstack` will require the value to be passed via the CLI or set in the current process.
-For instance, you can define environment variables in a `.envrc` file and utilize tools like `direnv`.
-
-#### System environment variables
-
-The following environment variables are available in any run by default:
-
-| Name | Description |
-|-------------------------|-----------------------------------------|
-| `DSTACK_RUN_NAME` | The name of the run |
-| `DSTACK_REPO_ID` | The ID of the repo |
-| `DSTACK_GPUS_NUM` | The total number of GPUs in the run |
-
-### Spot policy
-
-You can choose whether to use spot instances, on-demand instances, or any available type.
-
-
-
-```yaml
-type: dev-environment
-# The name is optional, if not specified, generated randomly
-name: vscode
-
-ide: vscode
-
-# Uncomment to leverage spot instances
-#spot_policy: auto
-```
-
-
-
-The `spot_policy` accepts `spot`, `on-demand`, and `auto`. The default for dev environments is `on-demand`.
-
-### Backends
-
-By default, `dstack` provisions instances in all configured backends. However, you can specify the list of backends:
-
-
-
-```yaml
-type: dev-environment
-# The name is optional, if not specified, generated randomly
-name: vscode
-
-ide: vscode
-
-# Use only listed backends
-backends: [aws, gcp]
-```
-
-
-
-### Regions
-
-By default, `dstack` uses all configured regions. However, you can specify the list of regions:
-
-
-
-```yaml
-type: dev-environment
-# The name is optional, if not specified, generated randomly
-name: vscode
-
-ide: vscode
-
-# Use only listed regions
-regions: [eu-west-1, eu-west-2]
-```
-
-
-
-### Volumes
-
-Volumes allow you to persist data between runs.
-To attach a volume, specify its name in the `volumes` property along with where to mount its contents:
-
-
-
-```yaml
-type: dev-environment
-# The name is optional, if not specified, generated randomly
-name: vscode
-
-ide: vscode
-
-# Map the name of the volume to any path
-volumes:
- - name: my-new-volume
- path: /volume_data
-```
-
-
-
-Once you run this configuration, the volume will be mounted at `/volume_data` inside the dev
-environment, and its contents will persist across runs.
-
-??? Info "Instance volumes"
-    If data persistence is not a strict requirement, you can also use
- ephemeral [instance volumes](../../concepts/volumes.md#instance-volumes).
-
-!!! info "Limitations"
- When you're running a dev environment, task, or service with `dstack`, it automatically mounts the project folder contents
- to `/workflow` (and sets that as the current working directory). Right now, `dstack` doesn't allow you to
- attach volumes to `/workflow` or any of its subdirectories.
-
-The `dev-environment` configuration type supports many other options. See below.
+The `dev-environment` configuration type allows running [dev environments](../../concepts/dev-environments.md).
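+
+For instance, a minimal configuration (name and Python version illustrative):
+
+```yaml
+type: dev-environment
+# The name is optional, if not specified, generated randomly
+name: vscode
+
+python: "3.10"
+
+ide: vscode
+```
+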
## Root reference
@@ -276,7 +10,7 @@ The `dev-environment` configuration type supports many other options. See below.
type:
required: true
-## `retry`
+### `retry`
#SCHEMA# dstack._internal.core.models.profiles.ProfileRetry
overrides:
@@ -284,7 +18,7 @@ The `dev-environment` configuration type supports many other options. See below.
type:
required: true
-## `resources`
+### `resources`
#SCHEMA# dstack._internal.core.models.resources.ResourcesSpecSchema
overrides:
@@ -293,7 +27,7 @@ The `dev-environment` configuration type supports many other options. See below.
required: true
item_id_prefix: resources-
-## `resources.gpu` { #resources-gpu data-toc-label="resources.gpu" }
+#### `resources.gpu` { #resources-gpu data-toc-label="gpu" }
#SCHEMA# dstack._internal.core.models.resources.GPUSpecSchema
overrides:
@@ -301,7 +35,7 @@ The `dev-environment` configuration type supports many other options. See below.
type:
required: true
-## `resources.disk` { #resources-disk data-toc-label="resources.disk" }
+#### `resources.disk` { #resources-disk data-toc-label="disk" }
#SCHEMA# dstack._internal.core.models.resources.DiskSpecSchema
overrides:
@@ -309,7 +43,7 @@ The `dev-environment` configuration type supports many other options. See below.
type:
required: true
-## `registry_auth`
+### `registry_auth`
#SCHEMA# dstack._internal.core.models.configurations.RegistryAuth
overrides:
@@ -317,7 +51,7 @@ The `dev-environment` configuration type supports many other options. See below.
type:
required: true
-## `volumes[n]` { #_volumes data-toc-label="volumes" }
+### `volumes[n]` { #_volumes data-toc-label="volumes" }
=== "Network volumes"
@@ -340,4 +74,4 @@ The `dev-environment` configuration type supports many other options. See below.
The short syntax for volumes is a colon-separated string in the form of `source:destination`
* `volume-name:/container/path` for network volumes
- * `/instance/path:/container/path` for instance volumes
+ * `/instance/path:/container/path` for instance volumes
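+
+For example, hypothetical mounts using the short syntax (volume name and paths illustrative):
+
+```yaml
+volumes:
+  # A network volume via short syntax
+  - my-volume:/volume_data
+  # An instance volume via short syntax
+  - /mnt/data:/data
+```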
diff --git a/docs/docs/reference/dstack.yml/fleet.md b/docs/docs/reference/dstack.yml/fleet.md
index ffcaad705..537ddb109 100644
--- a/docs/docs/reference/dstack.yml/fleet.md
+++ b/docs/docs/reference/dstack.yml/fleet.md
@@ -1,67 +1,7 @@
-# fleet
+# `fleet`
The `fleet` configuration type allows creating and updating fleets.
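+
+For instance, a minimal cloud fleet configuration (values illustrative):
+
+```yaml
+type: fleet
+# The name is optional, if not specified, generated randomly
+name: my-fleet
+
+# The number of instances
+nodes: 2
+
+resources:
+  gpu:
+    # 24GB or more vRAM
+    memory: 24GB..
+```
+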
-> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
-> (e.g. `.dstack.yml` or `fleet.dstack.yml` are both acceptable).
-> Any configuration can be run via [`dstack apply`](../cli/dstack/apply.md).
-
-## Examples
-
-### Cloud fleet
-
-
-
-```yaml
-type: fleet
-# The name is optional, if not specified, generated randomly
-name: my-fleet
-
-# The number of instances
-nodes: 4
-# Ensure the instances are interconnected
-placement: cluster
-
-# Uncomment to leverage spot instances
-#spot_policy: auto
-
-resources:
- gpu:
- # 24GB or more vRAM
- memory: 24GB..
- # One or more GPU
- count: 1..
-```
-
-
-
-### SSH fleet
-
-
-
-```yaml
-type: fleet
-# The name is optional, if not specified, generated randomly
-name: my-ssh-fleet
-
-# Ensure instances are interconnected
-placement: cluster
-
-# The user, private SSH key, and hostnames of the on-prem servers
-ssh_config:
- user: ubuntu
- identity_file: ~/.ssh/id_rsa
- hosts:
- - 3.255.177.51
- - 3.255.177.52
-```
-
-
-
-[//]: # (TODO: a cluster, individual user and identity file, etc)
-
-[//]: # (TODO: other examples, for all properties like in dev-environment/task/service)
-
## Root reference
#SCHEMA# dstack._internal.core.models.fleets.FleetConfiguration
@@ -70,19 +10,20 @@ ssh_config:
type:
required: true
-## `ssh_config`
+### `ssh_config` { data-toc-label="ssh_config" }
#SCHEMA# dstack._internal.core.models.fleets.SSHParams
overrides:
show_root_heading: false
+ item_id_prefix: ssh_config-
-## `ssh_config.hosts[n]`
+#### `ssh_config.hosts[n]` { #ssh_config-hosts data-toc-label="hosts" }
#SCHEMA# dstack._internal.core.models.fleets.SSHHostParams
overrides:
show_root_heading: false
-## `resources`
+### `resources`
#SCHEMA# dstack._internal.core.models.resources.ResourcesSpecSchema
overrides:
@@ -91,7 +32,7 @@ ssh_config:
required: true
item_id_prefix: resources-
-## `resouces.gpu` { #resources-gpu data-toc-label="resources.gpu" }
+#### `resources.gpu` { #resources-gpu data-toc-label="gpu" }
#SCHEMA# dstack._internal.core.models.resources.GPUSpecSchema
overrides:
@@ -99,7 +40,7 @@ ssh_config:
type:
required: true
-## `resouces.disk` { #resources-disk data-toc-label="resources.disk" }
+#### `resources.disk` { #resources-disk data-toc-label="disk" }
#SCHEMA# dstack._internal.core.models.resources.DiskSpecSchema
overrides:
@@ -107,7 +48,7 @@ ssh_config:
type:
required: true
-## `retry`
+### `retry`
#SCHEMA# dstack._internal.core.models.profiles.ProfileRetry
overrides:
diff --git a/docs/docs/reference/dstack.yml/gateway.md b/docs/docs/reference/dstack.yml/gateway.md
index 73bb06d7f..4d81d5d50 100644
--- a/docs/docs/reference/dstack.yml/gateway.md
+++ b/docs/docs/reference/dstack.yml/gateway.md
@@ -1,34 +1,7 @@
-# gateway
+# `gateway`
The `gateway` configuration type allows creating and updating [gateways](../../concepts/gateways.md).
-> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
-> (e.g. `.dstack.yml` or `gateway.dstack.yml` are both acceptable).
-> Any configuration can be run via [`dstack apply`](../cli/dstack/apply.md).
-
-## Examples
-
-### Creating a new gateway { #new-gateway }
-
-
-
-```yaml
-type: gateway
-# A name of the gateway
-name: example-gateway
-
-# Gateways are bound to a specific backend and region
-backend: aws
-region: eu-west-1
-
-# This domain will be used to access the endpoint
-domain: example.com
-```
-
-
-
-[//]: # (TODO: other examples, e.g. private gateways)
-
## Root reference
#SCHEMA# dstack._internal.core.models.gateways.GatewayConfiguration
@@ -37,18 +10,20 @@ domain: example.com
type:
required: true
-## `certificate[type=lets-encrypt]`
+### `certificate`
-#SCHEMA# dstack._internal.core.models.gateways.LetsEncryptGatewayCertificate
- overrides:
- show_root_heading: false
- type:
- required: true
+=== "Let's encrypt"
-## `certificate[type=acm]`
+ #SCHEMA# dstack._internal.core.models.gateways.LetsEncryptGatewayCertificate
+ overrides:
+ show_root_heading: false
+ type:
+ required: true
-#SCHEMA# dstack._internal.core.models.gateways.ACMGatewayCertificate
- overrides:
- show_root_heading: false
- type:
- required: true
+=== "ACM"
+
+ #SCHEMA# dstack._internal.core.models.gateways.ACMGatewayCertificate
+ overrides:
+ show_root_heading: false
+ type:
+ required: true
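+
+For example, a sketch pinning an ACM certificate on a gateway, per the schema above (the ARN is illustrative):
+
+```yaml
+type: gateway
+name: example-gateway
+
+backend: aws
+region: eu-west-1
+domain: example.com
+
+# Use an existing ACM certificate instead of Let's Encrypt
+certificate:
+  type: acm
+  arn: arn:aws:acm:eu-west-1:123456789012:certificate/example-id
+```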
diff --git a/docs/docs/reference/dstack.yml/service.md b/docs/docs/reference/dstack.yml/service.md
index 1ad737b11..8d661743c 100644
--- a/docs/docs/reference/dstack.yml/service.md
+++ b/docs/docs/reference/dstack.yml/service.md
@@ -1,428 +1,6 @@
-# service
+# `service`
-The `service` configuration type allows running [services](../../services.md).
-
-> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
-> (e.g. `.dstack.yml` or `serve.dstack.yml` are both acceptable).
-> Any configuration can be run via [`dstack apply`](../cli/dstack/apply.md).
-
-## Examples
-
-### Python version
-
-If you don't specify `image`, `dstack` uses its base Docker image pre-configured with
-`python`, `pip`, `conda` (Miniforge), and essential CUDA drivers.
-The `python` property determines which default Docker image is used.
-
-
-
-```yaml
-type: service
-# The name is optional, if not specified, generated randomly
-name: http-server-service
-
-# If `image` is not specified, dstack uses its base image
-python: "3.10"
-
-# Commands of the service
-commands:
- - python3 -m http.server
-# The port of the service
-port: 8000
-```
-
-
-
-??? info "nvcc"
- By default, the base Docker image doesn’t include `nvcc`, which is required for building custom CUDA kernels.
-    If you need `nvcc`, set the `nvcc` property to `true`.
-
-
-
- ```yaml
- type: service
- # The name is optional, if not specified, generated randomly
- name: http-server-service
-
- # If `image` is not specified, dstack uses its base image
- python: "3.10"
- # Ensure nvcc is installed (req. for Flash Attention)
- nvcc: true
-
- # Commands of the service
- commands:
- - python3 -m http.server
- # The port of the service
- port: 8000
- ```
-
-
-
-### Docker
-
-If you want, you can specify your own Docker image via `image`.
-
-
-
- ```yaml
- type: service
- # The name is optional, if not specified, generated randomly
- name: http-server-service
-
- # Any custom Docker image
- image: dstackai/base:py3.13-0.6-cuda-12.1
-
- # Commands of the service
- commands:
- - python3 -m http.server
- # The port of the service
- port: 8000
- ```
-
-
-
-??? info "Private Docker registry"
-
- Use the `registry_auth` property to provide credentials for a private Docker registry.
-
- ```yaml
- type: service
- # The name is optional, if not specified, generated randomly
- name: http-server-service
-
-    # Any private Docker image
- image: dstackai/base:py3.13-0.6-cuda-12.1
- # Credentials of the private registry
- registry_auth:
- username: peterschmidt85
- password: ghp_e49HcZ9oYwBzUbcSk2080gXZOU2hiT9AeSR5
-
- # Commands of the service
- commands:
- - python3 -m http.server
- # The port of the service
- port: 8000
- ```
-
-!!! info "Docker and Docker Compose"
- All backends except `runpod`, `vastai`, and `kubernetes` also allow using [Docker and Docker Compose](../../guides/protips.md#docker-and-docker-compose) inside `dstack` runs.
-
-### Models { #model-mapping }
-
-If you are running a chat model with an OpenAI-compatible interface,
-set the [`model`](#model) property to make the model accessible via
-the OpenAI-compatible endpoint provided by `dstack`.
-
-
-
-```yaml
-type: service
-# The name is optional, if not specified, generated randomly
-name: llama31-service
-
-python: "3.10"
-
-# Required environment variables
-env:
- - HF_TOKEN
-commands:
- - pip install vllm
- - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
-# Expose the port of the service
-port: 8000
-
-resources:
- # Change to what is required
- gpu: 24GB
-
-# Register the model
-model: meta-llama/Meta-Llama-3.1-8B-Instruct
-
-# Alternatively, use this syntax to set more model settings:
-# model:
-# type: chat
-# name: meta-llama/Meta-Llama-3.1-8B-Instruct
-# format: openai
-# prefix: /v1
-```
-
-
-
-Once the service is up, the model will be available via the OpenAI-compatible endpoint
-at `<dstack server URL>/proxy/models/<project name>/`
-or at `https://gateway.<gateway domain>/` if your project has a gateway.
-
-### Auto-scaling
-
-By default, `dstack` runs a single replica of the service.
-You can configure the number of replicas as well as the auto-scaling rules.
-
-
-
-```yaml
-type: service
-# The name is optional, if not specified, generated randomly
-name: llama31-service
-
-python: "3.10"
-
-# Required environment variables
-env:
- - HF_TOKEN
-commands:
- - pip install vllm
- - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --max-model-len 4096
-# Expose the port of the service
-port: 8000
-
-resources:
- # Change to what is required
- gpu: 24GB
-
-# Minimum and maximum number of replicas
-replicas: 1..4
-scaling:
-  # Requests per second
- metric: rps
- # Target metric value
- target: 10
-```
-
-
-
-The [`replicas`](#replicas) property can be a number or a range.
-
-> The [`metric`](#metric) property of [`scaling`](#scaling) only supports the `rps` metric (requests per second). In this
-> case `dstack` adjusts the number of replicas (scales up or down) automatically based on the load.
-
-Setting the minimum number of replicas to `0` allows the service to scale down to zero when there are no requests.
-
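-For instance, a sketch enabling scale-to-zero (target value illustrative):
-
-```yaml
-# A minimum of 0 replicas lets the service scale down when idle
-replicas: 0..4
-scaling:
-  metric: rps
-  target: 10
-```
-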
-!!! info "Gateway"
- Services with a fixed number of replicas are supported both with and without a
- [gateway](../../concepts/gateways.md).
- Auto-scaling is currently only supported for services running with a gateway.
-
-### Resources { #_resources }
-
-When you specify memory size, you can either use an explicit size (e.g. `24GB`) or a
-range (e.g. `24GB..`, or `24GB..80GB`, or `..80GB`).
-
-
-
-```yaml
-type: service
-# The name is optional, if not specified, generated randomly
-name: http-server-service
-
-python: "3.10"
-
-# Commands of the service
-commands:
- - pip install vllm
- - python -m vllm.entrypoints.openai.api_server
- --model mistralai/Mixtral-8X7B-Instruct-v0.1
- --host 0.0.0.0
- --tensor-parallel-size $DSTACK_GPUS_NUM
-# Expose the port of the service
-port: 8000
-
-resources:
- # 2 GPUs of 80GB
- gpu: 80GB:2
-
- # Minimum disk size
- disk: 200GB
-```
-
-
-
-The `gpu` property allows specifying not only memory size but also GPU vendor, names
-and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10G,A100` (either A10G or A100),
-`A100:80GB` (one A100 of 80GB), `A100:2` (two A100), `24GB..40GB:2` (two GPUs between 24GB and 40GB),
-`A100:40GB:2` (two A100 GPUs of 40GB).
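-
-For instance, a sketch requesting either of two GPU models (one GPU):
-
-```yaml
-resources:
-  # Either an A10G or an A100
-  gpu: A10G,A100
-```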
-
-??? info "Shared memory"
- If you are using parallel communicating processes (e.g., dataloaders in PyTorch), you may need to configure
- `shm_size`, e.g. set it to `16GB`.
-
-### Authorization
-
-By default, the service endpoint requires the `Authorization` header with `"Bearer <dstack token>"`.
-Authorization can be disabled by setting `auth` to `false`.
-
-
-
-```yaml
-type: service
-# The name is optional, if not specified, generated randomly
-name: http-server-service
-
-# Disable authorization
-auth: false
-
-python: "3.10"
-
-# Commands of the service
-commands:
- - python3 -m http.server
-# The port of the service
-port: 8000
-```
-
-
-
-### Environment variables
-
-
-
-```yaml
-type: service
-# The name is optional, if not specified, generated randomly
-name: llama-2-7b-service
-
-python: "3.10"
-
-# Environment variables
-env:
- - HF_TOKEN
- - MODEL=NousResearch/Llama-2-7b-chat-hf
-# Commands of the service
-commands:
- - pip install vllm
- - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000
-# The port of the service
-port: 8000
-
-resources:
- # Required GPU vRAM
- gpu: 24GB
-```
-
-
-
-If you don't assign a value to an environment variable (see `HF_TOKEN` above),
-`dstack` will require the value to be passed via the CLI or set in the current process.
-For instance, you can define environment variables in a `.envrc` file and utilize tools like `direnv`.
-
-#### System environment variables
-
-The following environment variables are available in any run by default:
-
-| Name | Description |
-|-------------------------|-----------------------------------------|
-| `DSTACK_RUN_NAME` | The name of the run |
-| `DSTACK_REPO_ID` | The ID of the repo |
-| `DSTACK_GPUS_NUM` | The total number of GPUs in the run |
-
-### Spot policy
-
-You can choose whether to use spot instances, on-demand instances, or any available type.
-
-
-
-```yaml
-type: service
-# The name is optional, if not specified, generated randomly
-name: http-server-service
-
-commands:
- - python3 -m http.server
-# The port of the service
-port: 8000
-
-# Uncomment to leverage spot instances
-#spot_policy: auto
-```
-
-
-
-The `spot_policy` accepts `spot`, `on-demand`, and `auto`. The default for services is `on-demand`.
-
-### Backends
-
-By default, `dstack` provisions instances in all configured backends. However, you can specify the list of backends:
-
-
-
-```yaml
-type: service
-# The name is optional, if not specified, generated randomly
-name: http-server-service
-
-# Commands of the service
-commands:
- - python3 -m http.server
-# The port of the service
-port: 8000
-
-# Use only listed backends
-backends: [aws, gcp]
-```
-
-
-
-### Regions
-
-By default, `dstack` uses all configured regions. However, you can specify the list of regions:
-
-
-
-```yaml
-type: service
-# The name is optional, if not specified, generated randomly
-name: http-server-service
-
-# Commands of the service
-commands:
- - python3 -m http.server
-# The port of the service
-port: 8000
-
-# Use only listed regions
-regions: [eu-west-1, eu-west-2]
-```
-
-
-
-### Volumes
-
-Volumes allow you to persist data between runs.
-To attach a volume, specify its name in the `volumes` property along with where to mount its contents:
-
-
-
-```yaml
-type: service
-# The name is optional, if not specified, generated randomly
-name: http-server-service
-
-# Commands of the service
-commands:
- - python3 -m http.server
-# The port of the service
-port: 8000
-
-# Map the name of the volume to any path
-volumes:
- - name: my-new-volume
- path: /volume_data
-```
-
-
-
-Once you run this configuration, the volume will be mounted at `/volume_data` inside the service,
-and its contents will persist across runs.
-
-??? Info "Instance volumes"
-    If data persistence is not a strict requirement, you can also use
- ephemeral [instance volumes](../../concepts/volumes.md#instance-volumes).
-
-!!! info "Limitations"
- When you're running a dev environment, task, or service with `dstack`, it automatically mounts the project folder contents
- to `/workflow` (and sets that as the current working directory). Right now, `dstack` doesn't allow you to
- attach volumes to `/workflow` or any of its subdirectories.
-
-The `service` configuration type supports many other options. See below.
+The `service` configuration type allows running [services](../../concepts/services.md).
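+
+For instance, a minimal configuration (name and port illustrative):
+
+```yaml
+type: service
+# The name is optional, if not specified, generated randomly
+name: http-server-service
+
+python: "3.10"
+
+commands:
+  - python3 -m http.server
+port: 8000
+```
+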
## Root reference
@@ -432,62 +10,64 @@ The `service` configuration type supports many other options. See below.
type:
required: true
-## `model[format=openai]`
-
-#SCHEMA# dstack._internal.core.models.gateways.OpenAIChatModel
- overrides:
- show_root_heading: false
- type:
- required: true
-
-## `model[format=tgi]`
-
-> TGI provides an OpenAI-compatible API starting with version 1.4.0,
-> so models served by TGI can be defined with `format: openai` too.
-
-#SCHEMA# dstack._internal.core.models.gateways.TGIChatModel
- overrides:
- show_root_heading: false
- type:
- required: true
-
-??? info "Chat template"
-
- By default, `dstack` loads the [chat template](https://huggingface.co/docs/transformers/main/en/chat_templating)
- from the model's repository. If it is not present there, manual configuration is required.
-
- ```yaml
- type: service
-
- image: ghcr.io/huggingface/text-generation-inference:latest
- env:
- - MODEL_ID=TheBloke/Llama-2-13B-chat-GPTQ
- commands:
- - text-generation-launcher --port 8000 --trust-remote-code --quantize gptq
- port: 8000
-
- resources:
- gpu: 80GB
+### `model` { data-toc-label="model" }
- # Enable the OpenAI-compatible endpoint
- model:
- type: chat
- name: TheBloke/Llama-2-13B-chat-GPTQ
- format: tgi
- chat_template: "{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = false %}{% endif %}{% for message in loop_messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if loop.index0 == 0 and system_message != false %}{% set content = '<>\\n' + system_message + '\\n<>\\n\\n' + message['content'] %}{% else %}{% set content = message['content'] %}{% endif %}{% if message['role'] == 'user' %}{{ '[INST] ' + content.strip() + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ ' ' + content.strip() + ' ' }}{% endif %}{% endfor %}"
- eos_token: ""
- ```
+=== "OpenAI"
- ##### Limitations
+ #SCHEMA# dstack._internal.core.models.gateways.OpenAIChatModel
+ overrides:
+ show_root_heading: false
+ type:
+ required: true
- Please note that model mapping is an experimental feature with the following limitations:
+=== "TGI"
- 1. Doesn't work if your `chat_template` uses `bos_token`. As a workaround, replace `bos_token` inside `chat_template` with the token content itself.
- 2. Doesn't work if `eos_token` is defined in the model repository as a dictionary. As a workaround, set `eos_token` manually, as shown in the example above (see Chat template).
+ > TGI provides an OpenAI-compatible API starting with version 1.4.0,
+    > so models served by TGI can be defined with `format: openai` too.
+
+ #SCHEMA# dstack._internal.core.models.gateways.TGIChatModel
+ overrides:
+ show_root_heading: false
+ type:
+ required: true
- If you encounter any other issues, please make sure to file a [GitHub issue](https://github.com/dstackai/dstack/issues/new/choose).
+ ??? info "Chat template"
+
+ By default, `dstack` loads the [chat template](https://huggingface.co/docs/transformers/main/en/chat_templating)
+ from the model's repository. If it is not present there, manual configuration is required.
+
+ ```yaml
+ type: service
+
+ image: ghcr.io/huggingface/text-generation-inference:latest
+ env:
+ - MODEL_ID=TheBloke/Llama-2-13B-chat-GPTQ
+ commands:
+ - text-generation-launcher --port 8000 --trust-remote-code --quantize gptq
+ port: 8000
+
+ resources:
+ gpu: 80GB
+
+ # Enable the OpenAI-compatible endpoint
+ model:
+ type: chat
+ name: TheBloke/Llama-2-13B-chat-GPTQ
+ format: tgi
+ chat_template: "{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = false %}{% endif %}{% for message in loop_messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if loop.index0 == 0 and system_message != false %}{% set content = '<>\\n' + system_message + '\\n<>\\n\\n' + message['content'] %}{% else %}{% set content = message['content'] %}{% endif %}{% if message['role'] == 'user' %}{{ '[INST] ' + content.strip() + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ ' ' + content.strip() + ' ' }}{% endif %}{% endfor %}"
+ eos_token: ""
+ ```
+
+ ##### Limitations
+
+ Please note that model mapping is an experimental feature with the following limitations:
+
+ 1. Doesn't work if your `chat_template` uses `bos_token`. As a workaround, replace `bos_token` inside `chat_template` with the token content itself.
+ 2. Doesn't work if `eos_token` is defined in the model repository as a dictionary. As a workaround, set `eos_token` manually, as shown in the example above (see Chat template).
+
+ If you encounter any other issues, please make sure to file a [GitHub issue](https://github.com/dstackai/dstack/issues/new/choose).
-## `scaling`
+### `scaling`
#SCHEMA# dstack._internal.core.models.configurations.ScalingSpec
overrides:
@@ -495,13 +75,13 @@ so models served by TGI can be defined with `format: openai` too.
type:
required: true
-## `retry`
+### `retry`
#SCHEMA# dstack._internal.core.models.profiles.ProfileRetry
overrides:
show_root_heading: false
-## `resources`
+### `resources`
#SCHEMA# dstack._internal.core.models.resources.ResourcesSpecSchema
overrides:
@@ -510,7 +90,7 @@ so models served by TGI can be defined with `format: openai` too.
required: true
item_id_prefix: resources-
-## `resouces.gpu` { #resources-gpu data-toc-label="resources.gpu" }
+#### `resources.gpu` { #resources-gpu data-toc-label="gpu" }
#SCHEMA# dstack._internal.core.models.resources.GPUSpecSchema
overrides:
@@ -518,7 +98,7 @@ so models served by TGI can be defined with `format: openai` too.
type:
required: true
-## `resouces.disk` { #resources-disk data-toc-label="resources.disk" }
+#### `resources.disk` { #resources-disk data-toc-label="disk" }
#SCHEMA# dstack._internal.core.models.resources.DiskSpecSchema
overrides:
@@ -526,7 +106,7 @@ so models served by TGI can be defined with `format: openai` too.
type:
required: true
-## `registry_auth`
+### `registry_auth`
#SCHEMA# dstack._internal.core.models.configurations.RegistryAuth
overrides:
@@ -534,7 +114,7 @@ so models served by TGI can be defined with `format: openai` too.
type:
required: true
-## `volumes[n]` { #_volumes data-toc-label="volumes" }
+### `volumes[n]` { #_volumes data-toc-label="volumes" }
=== "Network volumes"
diff --git a/docs/docs/reference/dstack.yml/task.md b/docs/docs/reference/dstack.yml/task.md
index 3c99eb8a2..f08cf06cb 100644
--- a/docs/docs/reference/dstack.yml/task.md
+++ b/docs/docs/reference/dstack.yml/task.md
@@ -1,448 +1,6 @@
-# task
+# `task`
-The `task` configuration type allows running [tasks](../../tasks.md).
-
-> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
-> (e.g. `.dstack.yml` or `train.dstack.yml` are both acceptable).
-> Any configuration can be run via [`dstack apply`](../cli/dstack/apply.md).
-
-## Examples
-
-### Python version
-
-If you don't specify `image`, `dstack` uses its base Docker image pre-configured with
-`python`, `pip`, `conda` (Miniforge), and essential CUDA drivers.
-The `python` property determines which default Docker image is used.
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: train
-
-# If `image` is not specified, dstack uses its base image
-python: "3.10"
-
-# Commands of the task
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
-```
-
-
-
-??? info "nvcc"
- By default, the base Docker image doesn’t include `nvcc`, which is required for building custom CUDA kernels.
-    If you need `nvcc`, set the `nvcc` property to `true`.
-
-
- ```yaml
- type: task
- # The name is optional, if not specified, generated randomly
- name: train
-
- # If `image` is not specified, dstack uses its base image
- python: "3.10"
- # Ensure nvcc is installed (req. for Flash Attention)
- nvcc: true
-
- commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
- ```
-
-### Ports { #_ports }
-
-A task can configure ports. If the task runs an application on a port, `dstack run`
-lets you securely access that port from your local machine through port forwarding.
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: train
-
-python: "3.10"
-
-# Commands of the task
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - tensorboard --logdir results/runs &
- - python fine-tuning/qlora/train.py
-# Expose the port to access TensorBoard
-ports:
- - 6000
-```
-
-
-
-When running it, `dstack run` forwards port `6000` to `localhost:6000`, enabling secure access.
-
-[//]: # (See [tasks](../../tasks.md#configure-ports) for more detail.)
-
-### Docker
-
-If you want, you can specify your own Docker image via `image`.
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: train
-
-# Any custom Docker image
-image: dstackai/base:py3.13-0.6-cuda-12.1
-
-# Commands of the task
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
-```
-
-
-
-??? info "Private registry"
- Use the `registry_auth` property to provide credentials for a private Docker registry.
-
- ```yaml
-    type: task
- # The name is optional, if not specified, generated randomly
- name: train
-
- # Any private Docker image
- image: dstackai/base:py3.13-0.6-cuda-12.1
- # Credentials of the private Docker registry
- registry_auth:
- username: peterschmidt85
- password: ghp_e49HcZ9oYwBzUbcSk2080gXZOU2hiT9AeSR5
-
- # Commands of the task
- commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
- ```
-
-!!! info "Docker and Docker Compose"
- All backends except `runpod`, `vastai`, and `kubernetes` also allow using [Docker and Docker Compose](../../guides/protips.md#docker-and-docker-compose) inside `dstack` runs.
-
-### Resources { #_resources }
-
-When you specify memory size, you can either use an explicit size (e.g. `24GB`) or a
-range (e.g. `24GB..`, or `24GB..80GB`, or `..80GB`).
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: train
-
-# Commands of the task
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
-
-resources:
- # 200GB or more RAM
- memory: 200GB..
- # 4 GPUs from 40GB to 80GB
- gpu: 40GB..80GB:4
- # Shared memory (required by multi-gpu)
- shm_size: 16GB
- # Disk size
- disk: 500GB
-```
-
-
-
-The `gpu` property allows specifying not only memory size but also GPU vendor, names
-and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10G,A100` (either A10G or A100),
-`A100:80GB` (one A100 of 80GB), `A100:2` (two A100), `24GB..40GB:2` (two GPUs between 24GB and 40GB),
-`A100:40GB:2` (two A100 GPUs of 40GB).
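-
-For instance, a sketch requesting two GPUs by a memory range:
-
-```yaml
-resources:
-  # Two GPUs, each with 24GB to 40GB of vRAM
-  gpu: 24GB..40GB:2
-```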
-
-??? info "Google Cloud TPU"
-    To use TPUs, specify the TPU architecture via the `gpu` property.
-
- ```yaml
- type: task
- # The name is optional, if not specified, generated randomly
- name: train
-
- python: "3.10"
-
- # Commands of the task
- commands:
- - pip install torch~=2.3.0 torch_xla[tpu]~=2.3.0 torchvision -f https://storage.googleapis.com/libtpu-releases/index.html
- - git clone --recursive https://github.com/pytorch/xla.git
- - python3 xla/test/test_train_mp_imagenet.py --fake_data --model=resnet50 --num_epochs=1
-
- resources:
- gpu: v2-8
- ```
-
- Currently, only 8 TPU cores can be specified, supporting single host workloads. Multi-host support is coming soon.
-
-??? info "Shared memory"
- If you are using parallel communicating processes (e.g., dataloaders in PyTorch), you may need to configure
- `shm_size`, e.g. set it to `16GB`.
-
-### Environment variables
-
-
-
-```yaml
-type: task
-
-python: "3.10"
-
-# Environment variables
-env:
- - HF_TOKEN
- - HF_HUB_ENABLE_HF_TRANSFER=1
-
-# Commands of the task
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
-```
-
-
-
-If you don't assign a value to an environment variable (see `HF_TOKEN` above),
-`dstack` will require the value to be passed via the CLI or set in the current process.
-For instance, you can define environment variables in a `.envrc` file and utilize tools like `direnv`.
-
-##### System environment variables
-
-The following environment variables are available in any run by default:
-
-| Name | Description |
-|-------------------------|------------------------------------------------------------------|
-| `DSTACK_RUN_NAME` | The name of the run |
-| `DSTACK_REPO_ID` | The ID of the repo |
-| `DSTACK_GPUS_NUM` | The total number of GPUs in the run |
-| `DSTACK_NODES_NUM` | The number of nodes in the run |
-| `DSTACK_GPUS_PER_NODE` | The number of GPUs per node |
-| `DSTACK_NODE_RANK` | The rank of the node |
-| `DSTACK_MASTER_NODE_IP` | The internal IP address of the master node                       |
-| `DSTACK_NODES_IPS` | The list of internal IP addresses of all nodes delimited by "\n" |
-
-### Distributed tasks
-
-By default, a task runs on a single node. However, you can run it on a cluster of nodes by specifying `nodes`:
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: train-distrib
-
-# The size of the cluster
-nodes: 2
-
-python: "3.10"
-
-# Commands of the task
-commands:
- - pip install -r requirements.txt
- - torchrun
- --nproc_per_node=$DSTACK_GPUS_PER_NODE
- --node_rank=$DSTACK_NODE_RANK
- --nnodes=$DSTACK_NODES_NUM
- --master_addr=$DSTACK_MASTER_NODE_IP
- --master_port=8008 resnet_ddp.py
- --num_epochs 20
-
-resources:
- gpu: 24GB
-```
-
-
-
-If you run the task, `dstack` first provisions the master node and then runs the other nodes of the cluster.
-
-??? info "Network"
- To ensure all nodes are provisioned into a cluster placement group and to enable the highest level of inter-node
- connectivity, it is recommended to manually create a [fleet](../../concepts/fleets.md) before running a task.
- This won’t be needed once [this issue :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/issues/1805){:target="_blank"}
- is fixed.
-
-> `dstack` is easy to use with `accelerate`, `torchrun`, and other distributed frameworks. All you need to do
-> is pass the corresponding environment variables such as `DSTACK_GPUS_PER_NODE`, `DSTACK_NODE_RANK`, `DSTACK_NODES_NUM`,
-> `DSTACK_MASTER_NODE_IP`, and `DSTACK_GPUS_NUM` (see [System environment variables](#system-environment-variables)).
-
-??? info "Backends"
- Running on multiple nodes is supported only with the `aws`, `gcp`, `azure`, `oci` backends, or
- [SSH fleets](../../concepts/fleets.md#ssh-fleets).
-
- Additionally, the `aws` backend supports [Elastic Fabric Adapter :material-arrow-top-right-thin:{ .external }](https://aws.amazon.com/hpc/efa/){:target="_blank"}.
- For a list of instance types with EFA support see [Fleets](../../concepts/fleets.md#cloud-fleets).
-
-### Web applications
-
-Here's an example of using `ports` to run web apps with `tasks`.
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: streamlit-hello
-
-python: "3.10"
-
-# Commands of the task
-commands:
- - pip3 install streamlit
- - streamlit hello
-# Expose the port to access the web app
-ports:
- - 8501
-
-```
-
-
-
-### Spot policy
-
-You can choose whether to use spot instances, on-demand instances, or any available type.
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: train
-
-# Commands of the task
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
-
-# Uncomment to leverage spot instances
-#spot_policy: auto
-```
-
-
-
-The `spot_policy` accepts `spot`, `on-demand`, and `auto`. The default for tasks is `on-demand`.
-
-### Queueing tasks { #queueing-tasks }
-
-By default, if `dstack apply` cannot find capacity, the task fails.
-
-To queue the task and wait for capacity, specify the [`retry`](#retry)
-property:
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: train
-
-# Commands of the task
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
-
-retry:
- # Retry on no-capacity errors
- on_events: [no-capacity]
- # Retry within 1 day
- duration: 1d
-```
-
-
-
-### Backends
-
-By default, `dstack` provisions instances in all configured backends. However, you can specify the list of backends:
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: train
-
-# Commands of the task
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
-
-# Use only listed backends
-backends: [aws, gcp]
-```
-
-
-
-### Regions
-
-By default, `dstack` uses all configured regions. However, you can specify the list of regions:
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: train
-
-# Commands of the task
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
-
-# Use only listed regions
-regions: [eu-west-1, eu-west-2]
-```
-
-
-
-### Volumes
-
-Volumes allow you to persist data between runs.
-To attach a volume, specify its name in the `volumes` property along with where to mount its contents:
-
-
-
-```yaml
-type: task
-# The name is optional, if not specified, generated randomly
-name: train
-
-python: "3.10"
-
-# Commands of the task
-commands:
- - pip install -r fine-tuning/qlora/requirements.txt
- - python fine-tuning/qlora/train.py
-
-# Map the name of the volume to any path
-volumes:
- - name: my-new-volume
- path: /volume_data
-```
-
-
-
-Once you run this configuration, the volume will be mounted at `/volume_data` inside the task,
-and its contents will persist across runs.
-
-??? Info "Instance volumes"
-    If data persistence is not a strict requirement, you can also use
- ephemeral [instance volumes](../../concepts/volumes.md#instance-volumes).
-
-!!! info "Limitations"
- When you're running a dev environment, task, or service with `dstack`, it automatically mounts the project folder contents
- to `/workflow` (and sets that as the current working directory). Right now, `dstack` doesn't allow you to
- attach volumes to `/workflow` or any of its subdirectories.
-
-The `task` configuration type supports many other options. See below.
+The `task` configuration type allows running [tasks](../../concepts/tasks.md).
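+
+For instance, a minimal configuration (name and commands illustrative):
+
+```yaml
+type: task
+# The name is optional, if not specified, generated randomly
+name: train
+
+python: "3.10"
+
+commands:
+  - pip install -r requirements.txt
+  - python train.py
+```
+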
## Root reference
@@ -452,7 +10,7 @@ The `task` configuration type supports many other options. See below.
type:
required: true
-## `retry`
+### `retry`
#SCHEMA# dstack._internal.core.models.profiles.ProfileRetry
overrides:
@@ -460,7 +18,7 @@ The `task` configuration type supports many other options. See below.
type:
required: true
-## `resources`
+### `resources`
#SCHEMA# dstack._internal.core.models.resources.ResourcesSpecSchema
overrides:
@@ -469,7 +27,7 @@ The `task` configuration type supports many other options. See below.
required: true
item_id_prefix: resources-
-## `resouces.gpu` { #resources-gpu data-toc-label="resources.gpu" }
+#### `resouces.gpu` { #resources-gpu data-toc-label="gpu" }
#SCHEMA# dstack._internal.core.models.resources.GPUSpecSchema
overrides:
@@ -477,7 +35,7 @@ The `task` configuration type supports many other options. See below.
type:
required: true
-## `resouces.disk` { #resources-disk data-toc-label="resources.disk" }
+#### `resouces.disk` { #resources-disk data-toc-label="disk" }
#SCHEMA# dstack._internal.core.models.resources.DiskSpecSchema
overrides:
@@ -485,7 +43,7 @@ The `task` configuration type supports many other options. See below.
type:
required: true
-## `registry_auth`
+### `registry_auth`
#SCHEMA# dstack._internal.core.models.configurations.RegistryAuth
overrides:
@@ -493,7 +51,7 @@ The `task` configuration type supports many other options. See below.
type:
required: true
-## `volumes[n]` { #_volumes data-toc-label="volumes" }
+### `volumes[n]` { #_volumes data-toc-label="volumes" }
=== "Network volumes"
diff --git a/docs/docs/reference/dstack.yml/volume.md b/docs/docs/reference/dstack.yml/volume.md
index 246270ab2..af34a166a 100644
--- a/docs/docs/reference/dstack.yml/volume.md
+++ b/docs/docs/reference/dstack.yml/volume.md
@@ -1,52 +1,7 @@
-# volume
+# `volume`
The `volume` configuration type allows creating, registering, and updating [volumes](../../concepts/volumes.md).
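+
+For instance, a minimal configuration creating a new volume (backend, region, and size illustrative):
+
+```yaml
+type: volume
+# The name of the volume
+name: my-new-volume
+
+# Volumes are bound to a specific backend and region
+backend: aws
+region: eu-central-1
+
+size: 100GB
+```
+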
-> Configuration files must be inside the project repo, and their names must end with `.dstack.yml`
-> (e.g. `.dstack.yml` or `fleet.dstack.yml` are both acceptable).
-> Any configuration can be run via [`dstack apply`](../cli/dstack/apply.md).
-
-## Examples
-
-### Creating a new volume { #new-volume }
-
-
-
-```yaml
-type: volume
-# The name of the volume
-name: my-new-volume
-
-# Volumes are bound to a specific backend and region
-backend: aws
-region: eu-central-1
-
-# The size of the volume
-size: 100GB
-```
-
-
-
-### Registering an existing volume { #existing-volume }
-
-
-
-```yaml
-type: volume
-# The name of the volume
-name: my-existing-volume
-
-# Volumes are bound to a specific backend and region
-backend: aws
-region: eu-central-1
-
-# The ID of the volume in AWS
-volume_id: vol1235
-```
-
-
-
-
## Root reference
#SCHEMA# dstack._internal.core.models.volumes.VolumeConfiguration
diff --git a/docs/docs/reference/misc/environment-variables.md b/docs/docs/reference/misc/environment-variables.md
index 43b600a46..ee9024277 100644
--- a/docs/docs/reference/misc/environment-variables.md
+++ b/docs/docs/reference/misc/environment-variables.md
@@ -5,7 +5,7 @@
The following read-only environment variables are automatically propagated to configurations for dev environments,
tasks, and services:
-##### DSTACK_RUN_NAME { #DSTACK_RUN_NAME }
+###### DSTACK_RUN_NAME { #DSTACK_RUN_NAME }
The name of the run.
@@ -21,11 +21,11 @@ commands:
If `name` is not set in the configuration, it is assigned a random name (e.g. `wet-mangust-1`).
-##### DSTACK_REPO_ID { #DSTACK_REPO_ID }
+###### DSTACK_REPO_ID { #DSTACK_REPO_ID }
The ID of the repo
-##### DSTACK_GPUS_NUM { #DSTACK_GPUS_NUM }
+###### DSTACK_GPUS_NUM { #DSTACK_GPUS_NUM }
The total number of GPUs in the run
@@ -49,19 +49,19 @@ resources:
gpu: 24GB
```
-##### DSTACK_NODES_NUM { #DSTACK_NODES_NUM }
+###### DSTACK_NODES_NUM { #DSTACK_NODES_NUM }
The number of nodes in the run
-##### DSTACK_GPUS_PER_NODE { #DSTACK_GPUS_PER_NODE }
+###### DSTACK_GPUS_PER_NODE { #DSTACK_GPUS_PER_NODE }
The number of GPUs per node
-##### DSTACK_NODE_RANK { #DSTACK_NODE_RANK }
+###### DSTACK_NODE_RANK { #DSTACK_NODE_RANK }
The rank of the node
-##### DSTACK_NODE_RANK { #DSTACK_NODE_RANK }
+###### DSTACK_MASTER_NODE_IP { #DSTACK_MASTER_NODE_IP }
The internal IP address of the master node.
@@ -90,7 +90,7 @@ resources:
gpu: 24GB
```
-##### DSTACK_NODES_IPS { #DSTACK_NODES_IPS }
+###### DSTACK_NODES_IPS { #DSTACK_NODES_IPS }
The list of internal IP addresses of all nodes delimited by `"\n"`
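
For example, a sketch that writes this list to a hostfile for launchers that expect one (the launcher command is hypothetical):

```yaml
type: task
nodes: 2

commands:
  # The variable is newline-delimited, one IP per line
  - echo "$DSTACK_NODES_IPS" > /tmp/hostfile
  # Hypothetical launcher that consumes the hostfile
  - python launch.py --hostfile /tmp/hostfile
```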
@@ -102,7 +102,7 @@ via `dstack server` or deployed using Docker.
For more details on the options below, refer to the [server deployment](../../guides/server-deployment.md) guide.
-##### DSTACK_SERVER_LOG_LEVEL { #DSTACK_SERVER_LOG_LEVEL }
+###### DSTACK_SERVER_LOG_LEVEL { #DSTACK_SERVER_LOG_LEVEL }
Has the same effect as `--log-level`. Defaults to `INFO`.
@@ -117,43 +117,43 @@ $ DSTACK_SERVER_LOG_LEVEL=debug dstack server
-##### DSTACK_SERVER_LOG_FORMAT { #DSTACK_SERVER_LOG_FORMAT }
+###### DSTACK_SERVER_LOG_FORMAT { #DSTACK_SERVER_LOG_FORMAT }
Sets format of log output. Can be `rich`, `standard`, `json`. Defaults to `rich`.
-##### DSTACK_SERVER_HOST { #DSTACK_SERVER_HOST }
+###### DSTACK_SERVER_HOST { #DSTACK_SERVER_HOST }
Has the same effect as `--host`. Defaults to `127.0.0.1`.
-##### DSTACK_SERVER_PORT { #DSTACK_SERVER_PORT }
+###### DSTACK_SERVER_PORT { #DSTACK_SERVER_PORT }
Has the same effect as `--port`. Defaults to `3000`.
-##### DSTACK_SERVER_ADMIN_TOKEN { #DSTACK_SERVER_ADMIN_TOKEN }
+###### DSTACK_SERVER_ADMIN_TOKEN { #DSTACK_SERVER_ADMIN_TOKEN }
Has the same effect as `--token`. Defaults to `None`.
-##### DSTACK_SERVER_DIR { #DSTACK_SERVER_DIR }
+###### DSTACK_SERVER_DIR { #DSTACK_SERVER_DIR }
Sets path to store data and server configs. Defaults to `~/.dstack/server`.
-##### DSTACK_DATABASE_URL { #DSTACK_DATABASE_URL }
+###### DSTACK_DATABASE_URL { #DSTACK_DATABASE_URL }
The database URL to use instead of default SQLite. Currently `dstack` supports Postgres. Example: `postgresql+asyncpg://myuser:mypassword@localhost:5432/mydatabase`. Defaults to `None`.
-##### DSTACK_SERVER_CLOUDWATCH_LOG_GROUP { #DSTACK_SERVER_CLOUDWATCH_LOG_GROUP }
+###### DSTACK_SERVER_CLOUDWATCH_LOG_GROUP { #DSTACK_SERVER_CLOUDWATCH_LOG_GROUP }
The CloudWatch Logs group for workloads logs. If not set, the default file-based log storage is used.
-##### DSTACK_SERVER_CLOUDWATCH_LOG_REGION { #DSTACK_SERVER_CLOUDWATCH_LOG_REGION }
+###### DSTACK_SERVER_CLOUDWATCH_LOG_REGION { #DSTACK_SERVER_CLOUDWATCH_LOG_REGION }
The CloudWatch Logs region. Defaults to `None`.
-##### DSTACK_DEFAULT_SERVICE_CLIENT_MAX_BODY_SIZE { #DSTACK_DEFAULT_SERVICE_CLIENT_MAX_BODY_SIZE }
+###### DSTACK_DEFAULT_SERVICE_CLIENT_MAX_BODY_SIZE { #DSTACK_DEFAULT_SERVICE_CLIENT_MAX_BODY_SIZE }
Request body size limit for services, in bytes. Defaults to 64 MiB.
-##### DSTACK_FORBID_SERVICES_WITHOUT_GATEWAY { #DSTACK_FORBID_SERVICES_WITHOUT_GATEWAY }
+###### DSTACK_FORBID_SERVICES_WITHOUT_GATEWAY { #DSTACK_FORBID_SERVICES_WITHOUT_GATEWAY }
Forbids registering new services without a gateway if set to any value.
@@ -172,7 +172,7 @@ Forbids registering new services without a gateway if set to any value.
The following environment variables are supported by the CLI.
-##### DSTACK_CLI_LOG_LEVEL { #DSTACK_CLI_LOG_LEVEL }
+###### DSTACK_CLI_LOG_LEVEL { #DSTACK_CLI_LOG_LEVEL }
Configures CLI logging level. Defaults to `INFO`.
@@ -186,6 +186,6 @@ $ DSTACK_CLI_LOG_LEVEL=debug dstack apply -f .dstack.yml