diff --git a/contributing/BACKENDS.md b/contributing/BACKENDS.md index 834352118..d1871f840 100644 --- a/contributing/BACKENDS.md +++ b/contributing/BACKENDS.md @@ -84,59 +84,45 @@ See the Appendix at the end of this document and make sure the provider meets th #### 2.2. Set up the development environment -Follow [DEVELOPMENT.md](DEVELOPMENT.md)`. +Follow [DEVELOPMENT.md](DEVELOPMENT.md). #### 2.3. Add dependencies to setup.py Add any dependencies required by your cloud provider to `setup.py`. Create a separate section with the provider's name for these dependencies, and ensure that you update the `all` section to include them as well. -#### 2.4. Implement the provider backend +#### 2.4. Add a new backend type -##### 2.4.1. Define the backend type +Add a new enumeration member for your provider to `BackendType` ([`src/dstack/_internal/core/models/backends/base.py`](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/models/backends/base.py)). -Add a new enumeration member for your provider to `BackendType` (`src/dstack/_internal/core/models/backends/base.py`). -Use the name of the provider. +#### 2.5. Create backend files and classes -##### 2.4.2. Create the backend directory +`dstack` provides a helper script to generate all the necessary files and classes for a new backend. +To add a new backend named `ExampleXYZ`, you should run: -Create a new directory under `src/dstack/_internal/core/backends` with the name of the backend type. - -##### 2.4.3. Create the backend class - -Under the backend directory you've created, create the `backend.py` file and define the -backend class there (should extend `dstack._internal.core.backends.base.Backend`). 
- -Refer to examples: -[datacrunch](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/datacrunch/backend.py), -[aws](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/aws/backend.py), -[gcp](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/gcp/backend.py), -[azure](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/azure/backend.py), etc. +```shell +python scripts/add_backend.py -n ExampleXYZ +``` -##### 2.4.4. Create the backend compute class +It will create an `examplexyz` backend directory under `src/dstack/_internal/core/backends` with the following files: -Under the backend directory you've created, create the `compute.py` file and define the -backend compute class that extends the `dstack._internal.core.backends.base.compute.Compute` class. -It can also extend and implement `ComputeWith*` classes to support additional features such as fleets, volumes, gateways, placement groups, etc. For example, it should extend `ComputeWithCreateInstanceSupport` to support fleets. +* `backend.py` with the `Backend` class implementation. You typically don't need to modify it. +* `compute.py` with the `Compute` class implementation. This is the core of the backend that you need to implement. +* `configurator.py` with the `Configurator` class implementation. It deals with validating and storing backend config. You need to adjust it with custom backend config validation. +* `models.py` with all the backend config models used by `Backend`, `Compute`, `Configurator` and other parts of `dstack`. 
-Refer to examples: -[datacrunch](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/datacrunch/compute.py), -[aws](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/aws/compute.py), -[gcp](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/gcp/compute.py), -[azure](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/azure/compute.py), etc. +#### 2.6. Adjust and register the backend config models -##### 2.4.5. Create and register the backend config models - -Under the backend directory, create the `models.py` file and define the backend config model classes there. -Every backend must define at least two models: +Go to `models.py`. It'll contain two config models required for all backends: * `*BackendConfig` that contains all backend parameters available for user configuration except for creds. * `*BackendConfigWithCreds` that contains all backends parameters available for user configuration and also creds. -These models are used in server/config.yaml, the API, and for backend configuration. +Adjust the generated config models by adding any additional config parameters. +Typically, you only need to modify the `*BackendConfig` model since the other models extend it. -The models should be added to `AnyBackendConfig*` unions in [`src/dstack/_internal/core/backends/models.py`](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/models.py). +Then add these models to `AnyBackendConfig*` unions in [`src/dstack/_internal/core/backends/models.py`](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/models.py). -It's not required but recommended to also define `*BackendStoredConfig` that extends `*BackendConfig` to be able to store extra parameters in the DB.
By the same logic, it's recommended to define `*Config` that extends `*BackendStoredConfig` with creds and use it as the main `Backend` and `Compute` config instead of using `*BackendConfigWithCreds` directly. +The script also generates `*StoredConfig` that extends `*BackendConfig` to be able to store extra parameters in the DB. By the same logic, it generates `*Config` that extends `*StoredConfig` with creds and uses it as the main `Backend` and `Compute` config instead of using `*BackendConfigWithCreds` directly. Refer to examples: [datacrunch](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/datacrunch/models.py), @@ -144,9 +130,21 @@ Refer to examples: [gcp](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/gcp/models.py), [azure](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/models.py), etc. -##### 2.4.6. Create and register the configurator class +#### 2.7. Implement the backend compute class + +Go to `compute.py` and implement `Compute` methods. +Optionally, extend and implement `ComputeWith*` classes to support additional features such as fleets, volumes, gateways, placement groups, etc. For example, extend `ComputeWithCreateInstanceSupport` to support fleets. + +Refer to examples: +[datacrunch](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/datacrunch/compute.py), +[aws](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/aws/compute.py), +[gcp](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/gcp/compute.py), +[azure](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/azure/compute.py), etc. + +#### 2.8.
Implement and register the configurator class -Under the backend directory, create the `configurator.py` file and and define the backend configurator class (must extend `dstack._internal.core.backends.base.configurator.Configurator`). +Go to `configurator.py` and implement custom `Configurator` logic. At minimum, you should implement creds validation. +You may also need to validate other config parameters if there are any. Refer to examples: [datacrunch](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/datacrunch/configurator.py), [aws](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/aws/configurator.py), @@ -155,7 +153,7 @@ Refer to examples: [datacrunch](https://github.com/dstackai/dstack/blob/master/s Register configurator by appending it to `_CONFIGURATOR_CLASSES` in [`src/dstack/_internal/core/backends/configurators.py`](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/configurators.py). -##### 2.4.7. (Optional) Override provisioning timeout +#### 2.9. (Optional) Override provisioning timeout If instances in the backend take more than 10 minutes to start, override the default provisioning timeout in [`src/dstack/_internal/server/background/tasks/common.py`](https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/server/background/tasks/common.py). diff --git a/scripts/add_backend.py b/scripts/add_backend.py new file mode 100644 index 000000000..a18e48c7f --- /dev/null +++ b/scripts/add_backend.py @@ -0,0 +1,46 @@ +import argparse +from pathlib import Path + +import jinja2 + + +def main(): + parser = argparse.ArgumentParser( + description="This script generates boilerplate code for a new backend" + ) + parser.add_argument( + "-n", + "--name", + help=( + "The backend name in CamelCase, e.g. AWS, Runpod, VastAI." + " It'll be used for naming backend classes, models, etc."
+ ), + required=True, + ) + args = parser.parse_args() + generate_backend_code(args.name) + + +def generate_backend_code(backend_name: str): + template_dir_path = Path(__file__).parent.parent.joinpath( + "src/dstack/_internal/core/backends/template" + ) + env = jinja2.Environment( + loader=jinja2.FileSystemLoader( + searchpath=template_dir_path, + ), + keep_trailing_newline=True, + ) + backend_dir_path = Path(__file__).parent.parent.joinpath( + f"src/dstack/_internal/core/backends/{backend_name.lower()}" + ) + backend_dir_path.mkdir(exist_ok=True) + for filename in ["backend.py", "compute.py", "configurator.py", "models.py"]: + template = env.get_template(f"{filename}.jinja") + with open(backend_dir_path.joinpath(filename), "w+") as f: + f.write(template.render({"backend_name": backend_name})) + backend_dir_path.joinpath("__init__.py").write_text("") + + +if __name__ == "__main__": + main() diff --git a/src/dstack/_internal/core/backends/base/compute.py b/src/dstack/_internal/core/backends/base/compute.py index c1185d084..ac57a1314 100644 --- a/src/dstack/_internal/core/backends/base/compute.py +++ b/src/dstack/_internal/core/backends/base/compute.py @@ -60,6 +60,11 @@ def __init__(self): def get_offers( self, requirements: Optional[Requirements] = None ) -> List[InstanceOfferWithAvailability]: + """ + Returns offers with availability matching `requirements`. + If the provider is added to gpuhunt, typically gets offers using `base.offers.get_catalog_offers()` + and extends them with availability info. 
+ """ pass @abstractmethod diff --git a/src/dstack/_internal/core/backends/template/__init__.py b/src/dstack/_internal/core/backends/template/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/dstack/_internal/core/backends/template/backend.py.jinja b/src/dstack/_internal/core/backends/template/backend.py.jinja new file mode 100644 index 000000000..d52cf73ea --- /dev/null +++ b/src/dstack/_internal/core/backends/template/backend.py.jinja @@ -0,0 +1,16 @@ +from dstack._internal.core.backends.base.backend import Backend +from dstack._internal.core.backends.{{ backend_name|lower }}.compute import {{ backend_name }}Compute +from dstack._internal.core.backends.{{ backend_name|lower }}.models import {{ backend_name }}Config +from dstack._internal.core.models.backends.base import BackendType + + +class {{ backend_name }}Backend(Backend): + TYPE = BackendType.{{ backend_name|upper }} + COMPUTE_CLASS = {{ backend_name }}Compute + + def __init__(self, config: {{ backend_name }}Config): + self.config = config + self._compute = {{ backend_name }}Compute(self.config) + + def compute(self) -> {{ backend_name }}Compute: + return self._compute diff --git a/src/dstack/_internal/core/backends/template/compute.py.jinja b/src/dstack/_internal/core/backends/template/compute.py.jinja new file mode 100644 index 000000000..d39bf671c --- /dev/null +++ b/src/dstack/_internal/core/backends/template/compute.py.jinja @@ -0,0 +1,87 @@ +from typing import List, Optional + +from dstack._internal.core.backends.base.backend import Compute +from dstack._internal.core.backends.base.compute import ( + ComputeWithCreateInstanceSupport, + ComputeWithGatewaySupport, + ComputeWithMultinodeSupport, + ComputeWithPlacementGroupSupport, + ComputeWithPrivateGatewaySupport, + ComputeWithReservationSupport, + ComputeWithVolumeSupport, +) +from dstack._internal.core.backends.base.offers import get_catalog_offers +from dstack._internal.core.backends.{{ backend_name|lower }}.models import 
{{ backend_name }}Config +from dstack._internal.core.models.backends.base import BackendType +from dstack._internal.core.models.instances import ( + InstanceAvailability, + InstanceConfiguration, + InstanceOfferWithAvailability, +) +from dstack._internal.core.models.runs import Job, JobProvisioningData, Requirements, Run +from dstack._internal.core.models.volumes import Volume +from dstack._internal.utils.logging import get_logger + +logger = get_logger(__name__) + + +class {{ backend_name }}Compute( + # TODO: Choose ComputeWith* classes to extend and implement + # ComputeWithCreateInstanceSupport, + # ComputeWithMultinodeSupport, + # ComputeWithReservationSupport, + # ComputeWithPlacementGroupSupport, + # ComputeWithGatewaySupport, + # ComputeWithPrivateGatewaySupport, + # ComputeWithVolumeSupport, + Compute, +): + def __init__(self, config: {{ backend_name }}Config): + super().__init__() + self.config = config + + def get_offers( + self, requirements: Optional[Requirements] = None + ) -> List[InstanceOfferWithAvailability]: + # If the provider is added to gpuhunt, you'd typically get offers + # using `get_catalog_offers()` and extend them with availability info. + offers = get_catalog_offers( + backend=BackendType.{{ backend_name|upper }}, + locations=self.config.regions or None, + requirements=requirements, + # configurable_disk_size=..., TODO: set in case of boot volume size limits + ) + # TODO: Add availability info to offers + return [ + InstanceOfferWithAvailability( + **offer.dict(), + availability=InstanceAvailability.UNKNOWN, + ) + for offer in offers + ] + + def create_instance( + self, + instance_offer: InstanceOfferWithAvailability, + instance_config: InstanceConfiguration, + ) -> JobProvisioningData: + # TODO: Implement if backend supports creating instances (VM-based). + # Delete if backend can only run jobs (container-based). 
+ raise NotImplementedError() + + def run_job( + self, + run: Run, + job: Job, + instance_offer: InstanceOfferWithAvailability, + project_ssh_public_key: str, + project_ssh_private_key: str, + volumes: List[Volume], + ) -> JobProvisioningData: + # TODO: Implement if create_instance() is not implemented. Delete otherwise. + raise NotImplementedError() + + def terminate_instance( + self, instance_id: str, region: str, backend_data: Optional[str] = None + ): + raise NotImplementedError() diff --git a/src/dstack/_internal/core/backends/template/configurator.py.jinja b/src/dstack/_internal/core/backends/template/configurator.py.jinja new file mode 100644 index 000000000..736a9c1a2 --- /dev/null +++ b/src/dstack/_internal/core/backends/template/configurator.py.jinja @@ -0,0 +1,70 @@ +import json + +from dstack._internal.core.backends.base.configurator import ( + BackendRecord, + Configurator, + raise_invalid_credentials_error, +) +from dstack._internal.core.backends.{{ backend_name|lower }}.backend import {{ backend_name }}Backend +from dstack._internal.core.backends.{{ backend_name|lower }}.models import ( + Any{{ backend_name }}BackendConfig, + Any{{ backend_name }}Creds, + {{ backend_name }}BackendConfig, + {{ backend_name }}BackendConfigWithCreds, + {{ backend_name }}Config, + {{ backend_name }}Creds, + {{ backend_name }}StoredConfig, +) +from dstack._internal.core.models.backends.base import ( + BackendType, +) + +# TODO: Add all supported regions and default regions +REGIONS = [] + + +class {{ backend_name }}Configurator(Configurator): + TYPE = BackendType.{{ backend_name|upper }} + BACKEND_CLASS = {{ backend_name }}Backend + + def validate_config( + self, config: {{ backend_name }}BackendConfigWithCreds, default_creds_enabled: bool + ): + self._validate_creds(config.creds) + # TODO: Validate additional config parameters if any + + def create_backend( + self, project_name: str, config: {{ backend_name }}BackendConfigWithCreds + ) -> BackendRecord: + if 
config.regions is None: + config.regions = REGIONS + return BackendRecord( + config={{ backend_name }}StoredConfig( + **{{ backend_name }}BackendConfig.__response__.parse_obj(config).dict() + ).json(), + auth={{ backend_name }}Creds.parse_obj(config.creds).json(), + ) + + def get_backend_config( + self, record: BackendRecord, include_creds: bool + ) -> Any{{ backend_name }}BackendConfig: + config = self._get_config(record) + if include_creds: + return {{ backend_name }}BackendConfigWithCreds.__response__.parse_obj(config) + return {{ backend_name }}BackendConfig.__response__.parse_obj(config) + + def get_backend(self, record: BackendRecord) -> {{ backend_name }}Backend: + config = self._get_config(record) + return {{ backend_name }}Backend(config=config) + + def _get_config(self, record: BackendRecord) -> {{ backend_name }}Config: + return {{ backend_name }}Config.__response__( + **json.loads(record.config), + creds={{ backend_name }}Creds.parse_raw(record.auth), + ) + + def _validate_creds(self, creds: Any{{ backend_name }}Creds): + # TODO: Implement API key or other creds validation + # if valid: + # return + raise_invalid_credentials_error(fields=[["creds", "api_key"]]) diff --git a/src/dstack/_internal/core/backends/template/models.py.jinja b/src/dstack/_internal/core/backends/template/models.py.jinja new file mode 100644 index 000000000..6cb022c21 --- /dev/null +++ b/src/dstack/_internal/core/backends/template/models.py.jinja @@ -0,0 +1,58 @@ +from typing import Annotated, List, Literal, Optional, Union + +from pydantic import Field + +from dstack._internal.core.models.common import CoreModel + + +# The template uses "api_key" creds as the most popular creds type. +# TODO: Adjust it or add additional creds models if necessary. 
+class {{ backend_name }}APIKeyCreds(CoreModel): + type: Annotated[Literal["api_key"], Field(description="The type of credentials")] = "api_key" + api_key: Annotated[str, Field(description="The API key")] + + +Any{{ backend_name }}Creds = {{ backend_name }}APIKeyCreds +{{ backend_name }}Creds = Any{{ backend_name }}Creds + + +class {{ backend_name }}BackendConfig(CoreModel): + """ + The backend config used in the API, server/config.yml, `{{ backend_name }}Configurator`. + It also serves as a base class for other backend config models. + Should not include creds. + """ + type: Annotated[ + Literal["{{ backend_name|lower }}"], + Field(description="The type of backend"), + ] = "{{ backend_name|lower }}" + regions: Annotated[ + Optional[List[str]], + Field(description="The list of {{ backend_name }} regions. Omit to use all regions"), + ] = None + # TODO: Add additional backend parameters if necessary + + +class {{ backend_name }}BackendConfigWithCreds({{ backend_name }}BackendConfig): + """ + Same as `{{ backend_name }}BackendConfig` but also includes creds. + """ + creds: Annotated[Any{{ backend_name }}Creds, Field(description="The credentials")] + + +Any{{ backend_name }}BackendConfig = Union[{{ backend_name }}BackendConfig, {{ backend_name }}BackendConfigWithCreds] + + +class {{ backend_name }}StoredConfig({{ backend_name }}BackendConfig): + """ + The backend config used for config parameters in the DB. + Can extend `{{ backend_name }}BackendConfig` with additional parameters. + """ + pass + + +class {{ backend_name }}Config({{ backend_name }}StoredConfig): + """ + The backend config used by `{{ backend_name }}Backend` and `{{ backend_name }}Compute`. + """ + creds: Any{{ backend_name }}Creds
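
Note on naming: every generated name in the templates above is derived from the single `-n` argument. The sketch below mirrors that derivation using plain `str.lower()` and `str.upper()` as stand-ins for the Jinja `lower` and `upper` filters. It is an illustration only and not part of `scripts/add_backend.py`; `derived_names` is a hypothetical helper introduced here.

```python
def derived_names(backend_name: str) -> dict:
    """Return the names the templates would produce for a given `-n` value.

    Hypothetical helper for illustration; it does not render any template.
    """
    lower = backend_name.lower()  # stands in for `{{ backend_name|lower }}`
    upper = backend_name.upper()  # stands in for `{{ backend_name|upper }}`
    return {
        # Directory created by generate_backend_code()
        "directory": f"src/dstack/_internal/core/backends/{lower}",
        # Enum member referenced by backend.py, compute.py and configurator.py
        "backend_type_member": f"BackendType.{upper}",
        # Value of the `type` Literal in the generated BackendConfig
        "config_type_literal": lower,
        # A few of the generated class names
        "classes": [
            f"{backend_name}Backend",
            f"{backend_name}Compute",
            f"{backend_name}Configurator",
            f"{backend_name}Config",
        ],
    }


if __name__ == "__main__":
    for key, value in derived_names("ExampleXYZ").items():
        print(f"{key}: {value}")
```

For `ExampleXYZ`, this gives the `examplexyz` directory and `BackendType.EXAMPLEXYZ`, so the enum member added to `BackendType` would presumably need to be named `EXAMPLEXYZ` for the generated `backend.py` and `configurator.py` to resolve it.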