Commit

Merge remote-tracking branch 'origin/master' into runpod-ports
cblmemo committed Jul 30, 2024
2 parents 37fe1eb + 92727c7 commit 34f13a3
Showing 95 changed files with 4,758 additions and 1,867 deletions.
5 changes: 1 addition & 4 deletions .github/workflows/format.yml
Original file line number Diff line number Diff line change
@@ -35,18 +35,15 @@ jobs:
  - name: Running yapf
  run: |
  yapf --diff --recursive ./ --exclude 'sky/skylet/ray_patches/**' \
- --exclude 'sky/skylet/providers/azure/**' \
  --exclude 'sky/skylet/providers/ibm/**'
  - name: Running black
  run: |
- black --diff --check sky/skylet/providers/azure/ \
- sky/skylet/providers/ibm/
+ black --diff --check sky/skylet/providers/ibm/
  - name: Running isort for black formatted files
  run: |
  isort --diff --check --profile black -l 88 -m 3 \
  sky/skylet/providers/ibm/
  - name: Running isort for yapf formatted files
  run: |
  isort --diff --check ./ --sg 'sky/skylet/ray_patches/**' \
- --sg 'sky/skylet/providers/azure/**' \
  --sg 'sky/skylet/providers/ibm/**'
3 changes: 3 additions & 0 deletions .gitignore
@@ -12,3 +12,6 @@ sky/clouds/service_catalog/data_fetchers/*.csv
  .vscode/
  .idea/
  .env
+
+ # For editor files
+ *.swp
11 changes: 6 additions & 5 deletions README.md
@@ -27,6 +27,7 @@

  ----
  :fire: *News* :fire:
+ - [Jul, 2024] [Finetune](./llm/llama-3_1-finetuning/) and [serve](./llm/llama-3_1/) **Llama 3.1** on your infra
  - [Jun, 2024] Reproduce **GPT** with [llm.c](https://github.com/karpathy/llm.c/discussions/481) on any cloud: [**guide**](./llm/gpt-2/)
  - [Apr, 2024] Serve and finetune [**Llama 3**](https://skypilot.readthedocs.io/en/latest/gallery/llms/llama-3.html) on any cloud or Kubernetes: [**example**](./llm/llama-3/)
  - [Apr, 2024] Serve [**Qwen-110B**](https://qwenlm.github.io/blog/qwen1.5-110b/) on your infra: [**example**](./llm/qwen/)
@@ -58,7 +59,7 @@ SkyPilot is a framework for running LLMs, AI, and batch jobs on any cloud, offer
  SkyPilot **abstracts away cloud infra burdens**:
  - Launch jobs & clusters on any cloud
  - Easy scale-out: queue and run many jobs, automatically managed
- - Easy access to object stores (S3, GCS, R2)
+ - Easy access to object stores (S3, GCS, Azure, R2, IBM)

  SkyPilot **maximizes GPU availability for your jobs**:
  * Provision in all zones/regions/clouds you have access to ([the _Sky_](https://arxiv.org/abs/2205.07147)), with automatic failover
Expand All @@ -70,13 +71,13 @@ SkyPilot **cuts your cloud costs**:

SkyPilot supports your existing GPU, TPU, and CPU workloads, with no code changes.

Install with pip (we recommend the nightly build for the latest features or [from source](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html)):
Install with pip:
```bash
pip install "skypilot-nightly[aws,gcp,azure,oci,lambda,runpod,fluidstack,paperspace,cudo,ibm,scp,kubernetes]" # choose your clouds
pip install -U "skypilot[aws,gcp,azure,oci,lambda,runpod,fluidstack,paperspace,cudo,ibm,scp,kubernetes]" # choose your clouds
```
To get the last release, use:
To get the latest features and fixes, use the nightly build or [install from source](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html):
```bash
pip install -U "skypilot[aws,gcp,azure,oci,lambda,runpod,fluidstack,paperspace,cudo,ibm,scp,kubernetes]" # choose your clouds
pip install "skypilot-nightly[aws,gcp,azure,oci,lambda,runpod,fluidstack,paperspace,cudo,ibm,scp,kubernetes]" # choose your clouds
```

Current supported providers (AWS, Azure, GCP, OCI, Lambda Cloud, RunPod, Fluidstack, Paperspace, Cudo, IBM, Samsung, Cloudflare, any Kubernetes cluster):
1 change: 1 addition & 0 deletions docs/source/_gallery_original/index.rst
@@ -39,6 +39,7 @@ Contents
  DBRX (Databricks) <llms/dbrx>
  Llama-2 (Meta) <llms/llama-2>
  Llama-3 (Meta) <llms/llama-3>
+ Llama-3.1 (Meta) <llms/llama-3_1>
  Qwen (Alibaba) <llms/qwen>
  CodeLlama (Meta) <llms/codellama>
  Gemma (Google) <llms/gemma>
1 change: 1 addition & 0 deletions docs/source/_gallery_original/llms/llama-3_1.md
4 changes: 1 addition & 3 deletions docs/source/_static/custom.js
@@ -28,9 +28,7 @@ document.addEventListener('DOMContentLoaded', () => {
  { selector: '.caption-text', text: 'SkyServe: Model Serving' },
  { selector: '.toctree-l1 > a', text: 'Managed Jobs' },
  { selector: '.toctree-l1 > a', text: 'Running on Kubernetes' },
- { selector: '.toctree-l1 > a', text: 'Ollama' },
- { selector: '.toctree-l1 > a', text: 'Llama-3 (Meta)' },
- { selector: '.toctree-l1 > a', text: 'Qwen (Alibaba)' },
+ { selector: '.toctree-l1 > a', text: 'Llama-3.1 (Meta)' },
  ];
  newItems.forEach(({ selector, text }) => {
  document.querySelectorAll(selector).forEach((el) => {
3 changes: 2 additions & 1 deletion docs/source/docs/index.rst
@@ -33,7 +33,7 @@ SkyPilot **abstracts away cloud infra burdens**:

  - Launch jobs & clusters on any cloud
  - Easy scale-out: queue and run many jobs, automatically managed
- - Easy access to object stores (S3, GCS, R2)
+ - Easy access to object stores (S3, GCS, Azure, R2, IBM)

  SkyPilot **maximizes GPU availability for your jobs**:

@@ -69,6 +69,7 @@ Runnable examples:
  * **LLMs on SkyPilot**

+ * `Llama 3.1 finetuning <https://github.com/skypilot-org/skypilot/tree/master/llm/llama-3_1-finetuning>`_ and `serving <https://github.com/skypilot-org/skypilot/tree/master/llm/llama-3_1>`_
  * `GPT-2 via llm.c <https://github.com/skypilot-org/skypilot/tree/master/llm/gpt-2>`_
  * `Llama 3 <https://github.com/skypilot-org/skypilot/tree/master/llm/llama-3>`_
  * `Qwen <https://github.com/skypilot-org/skypilot/tree/master/llm/qwen>`_
70 changes: 36 additions & 34 deletions docs/source/getting-started/installation.rst
@@ -11,33 +11,6 @@ Install SkyPilot using pip:

  .. tab-set::

- .. tab-item:: Nightly (recommended)
- :sync: nightly-tab
-
- .. code-block:: shell
- # Recommended: use a new conda env to avoid package conflicts.
- # SkyPilot requires 3.7 <= python <= 3.11.
- conda create -y -n sky python=3.10
- conda activate sky
- # Choose your cloud:
- pip install "skypilot-nightly[aws]"
- pip install "skypilot-nightly[gcp]"
- pip install "skypilot-nightly[azure]"
- pip install "skypilot-nightly[oci]"
- pip install "skypilot-nightly[lambda]"
- pip install "skypilot-nightly[runpod]"
- pip install "skypilot-nightly[fluidstack]"
- pip install "skypilot-nightly[paperspace]"
- pip install "skypilot-nightly[cudo]"
- pip install "skypilot-nightly[ibm]"
- pip install "skypilot-nightly[scp]"
- pip install "skypilot-nightly[vsphere]"
- pip install "skypilot-nightly[kubernetes]"
- pip install "skypilot-nightly[all]"
  .. tab-item:: Latest Release
  :sync: latest-release-tab

@@ -65,6 +38,35 @@ Install SkyPilot using pip:
  pip install "skypilot[kubernetes]"
  pip install "skypilot[all]"
+ .. tab-item:: Nightly
+ :sync: nightly-tab
+
+ .. code-block:: shell
+ # Recommended: use a new conda env to avoid package conflicts.
+ # SkyPilot requires 3.7 <= python <= 3.11.
+ conda create -y -n sky python=3.10
+ conda activate sky
+ # Choose your cloud:
+ pip install "skypilot-nightly[aws]"
+ pip install "skypilot-nightly[gcp]"
+ pip install "skypilot-nightly[azure]"
+ pip install "skypilot-nightly[oci]"
+ pip install "skypilot-nightly[lambda]"
+ pip install "skypilot-nightly[runpod]"
+ pip install "skypilot-nightly[fluidstack]"
+ pip install "skypilot-nightly[paperspace]"
+ pip install "skypilot-nightly[cudo]"
+ pip install "skypilot-nightly[ibm]"
+ pip install "skypilot-nightly[scp]"
+ pip install "skypilot-nightly[vsphere]"
+ pip install "skypilot-nightly[kubernetes]"
+ pip install "skypilot-nightly[all]"
  .. tab-item:: From Source
  :sync: from-source-tab

@@ -99,19 +101,19 @@ To use more than one cloud, combine the pip extras:

  .. tab-set::

- .. tab-item:: Nightly (recommended)
- :sync: nightly-tab
+ .. tab-item:: Latest Release
+ :sync: latest-release-tab

  .. code-block:: shell
- pip install -U "skypilot-nightly[aws,gcp]"
+ pip install -U "skypilot[aws,gcp]"
- .. tab-item:: Latest Release
- :sync: latest-release-tab
+ .. tab-item:: Nightly
+ :sync: nightly-tab

  .. code-block:: shell
- pip install -U "skypilot[aws,gcp]"
+ pip install -U "skypilot-nightly[aws,gcp]"
  .. tab-item:: From Source
  :sync: from-source-tab
@@ -504,7 +506,7 @@ You can simply run:
  -v "$HOME/.sky:/root/.sky:rw" \
  -v "$HOME/.aws:/root/.aws:rw" \
  -v "$HOME/.config/gcloud:/root/.config/gcloud:rw" \
- berkeleyskypilot/skypilot-nightly
+ berkeleyskypilot/skypilot
  docker exec -it sky /bin/bash
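
Reading aid for the installation changes: the docs combine clouds by joining pip extras with commas, and the package name is either `skypilot` (release) or `skypilot-nightly`. A minimal sketch of that string composition follows. `pip_requirement` is a hypothetical helper written for illustration only; it is not part of SkyPilot.

```python
def pip_requirement(clouds, nightly=False):
    """Build a pip requirement string such as 'skypilot[aws,gcp]'.

    Hypothetical helper: `clouds` is a list of extras names taken from the
    docs (for example 'aws', 'gcp', 'azure'); nothing here is validated
    against the extras that SkyPilot actually publishes.
    """
    package = "skypilot-nightly" if nightly else "skypilot"
    if not clouds:
        # No extras requested: return the bare package name.
        return package
    return f"{package}[{','.join(clouds)}]"


if __name__ == "__main__":
    # Prints the argument one would pass to `pip install -U`.
    print(pip_requirement(["aws", "gcp"]))
    print(pip_requirement(["aws", "gcp"], nightly=True))
```

The resulting string should be quoted when passed to a shell, as the docs do, so the square brackets are not treated as a glob pattern.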
20 changes: 20 additions & 0 deletions docs/source/reference/config.rst
@@ -185,6 +185,14 @@ Available fields and semantics:
  # - "*": my-default-security-group
  security_group_name: my-security-group
+ # Encrypted boot disk (optional).
+ #
+ # Set to true to encrypt the boot disk of all AWS instances launched by
+ # SkyPilot. This is useful for compliance with data protection regulations.
+ #
+ # Default: false.
+ disk_encrypted: false
  # Identity to use for AWS instances (optional).
  #
  # LOCAL_CREDENTIALS: The user's local credential files will be uploaded to
@@ -368,6 +376,18 @@ Available fields and semantics:
  # Default: 'LOCAL_CREDENTIALS'.
  remote_identity: LOCAL_CREDENTIALS
+ # Advanced Azure configurations (optional).
+ # Apply to all new instances but not existing ones.
+ azure:
+ # Specify an existing Azure storage account for SkyPilot-managed containers.
+ # If not set, SkyPilot will use its default naming convention to create and
+ # use the storage account unless container endpoint URI is used as source.
+ # Note: SkyPilot cannot create new storage accounts with custom names; it
+ # can only use existing ones or create accounts with its default naming
+ # scheme.
+ # Reference: https://learn.microsoft.com/en-us/azure/storage/common/storage-account-overview
+ storage_account: user-storage-account-name
  # Advanced Kubernetes configurations (optional).
  kubernetes:
  # The networking mode for accessing SSH jump pod (optional).
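
Reading aid for the new `azure.storage_account` field: the comment describes three cases (an account taken from a container endpoint URI used as source, an account named in the config, and SkyPilot's own default naming). The sketch below shows one plausible way to order those cases. It is an assumption, not SkyPilot's implementation: `pick_storage_account` is a hypothetical function, the config is modeled as a plain dict, and letting the source URI take precedence over the config value is a guess that the diff does not confirm.

```python
AZURE_BLOB_SUFFIX = ".blob.core.windows.net"


def pick_storage_account(config, source=None):
    """Return the storage account name to use, or None.

    Hypothetical sketch. None stands for "fall back to the default naming
    scheme", which is not reproduced here because the diff does not show it.
    """
    # Case 1 (assumed precedence): the source is a container endpoint URI,
    # so the account name is the first label of the host.
    if source and source.startswith("https://"):
        host = source[len("https://"):].split("/", 1)[0]
        if host.endswith(AZURE_BLOB_SUFFIX):
            return host[: -len(AZURE_BLOB_SUFFIX)]
    # Case 2: an existing account named under `azure.storage_account`.
    azure_section = config.get("azure") or {}
    return azure_section.get("storage_account")


if __name__ == "__main__":
    cfg = {"azure": {"storage_account": "user-storage-account-name"}}
    print(pick_storage_account(cfg))
    print(pick_storage_account(cfg, "https://acct.blob.core.windows.net/data"))
```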
21 changes: 14 additions & 7 deletions docs/source/reference/storage.rst
@@ -28,7 +28,7 @@ Object storages are specified using the :code:`file_mounts` field in a SkyPilot
  # Mount an existing S3 bucket
  file_mounts:
  /my_data:
- source: s3://my-bucket/ # or gs://, r2://, cos://<region>/<bucket>
+ source: s3://my-bucket/ # or gs://, https://<azure_storage_account>.blob.core.windows.net/<container>, r2://, cos://<region>/<bucket>
  mode: MOUNT # Optional: either MOUNT or COPY. Defaults to MOUNT.
  This will `mount <storage-mounting-modes_>`__ the contents of the bucket at ``s3://my-bucket/`` to the remote VM at ``/my_data``.
@@ -45,7 +45,7 @@ Object storages are specified using the :code:`file_mounts` field in a SkyPilot
  file_mounts:
  /my_data:
  name: my-sky-bucket
- store: gcs # Optional: either of s3, gcs, r2, ibm
+ store: gcs # Optional: either of s3, gcs, azure, r2, ibm
  SkyPilot will create an empty GCS bucket called ``my-sky-bucket`` and mount it at ``/my_data``.
  This bucket can be used to write checkpoints, logs or other outputs directly to the cloud.
@@ -68,7 +68,7 @@ Object storages are specified using the :code:`file_mounts` field in a SkyPilot
  /my_data:
  name: my-sky-bucket
  source: ~/dataset # Optional: path to local data to upload to the bucket
- store: s3 # Optional: either of s3, gcs, r2, ibm
+ store: s3 # Optional: either of s3, gcs, azure, r2, ibm
  mode: MOUNT # Optional: either MOUNT or COPY. Defaults to MOUNT.
  SkyPilot will create a S3 bucket called ``my-sky-bucket`` and upload the
@@ -281,14 +281,21 @@ Storage YAML reference
  source: str
  The source attribute specifies the path that must be made available
- in the storage object. It can either be a local path or a list of local
- paths or it can be a remote path (s3://, gs://, r2://, cos://<region_name>).
+ in the storage object. It can either be:
+ - A local path
+ - A list of local paths
+ - A remote path using one of the following formats:
+ - s3://<bucket_name>
+ - gs://<bucket_name>
+ - https://<azure_storage_account>.blob.core.windows.net/<container_name>
+ - r2://<bucket_name>
+ - cos://<region_name>/<bucket_name>
  If the source is local, data is uploaded to the cloud to an appropriate
- bucket (s3, gcs, r2, or ibm). If source is bucket URI,
+ bucket (s3, gcs, azure, r2, or ibm). If source is bucket URI,
  the data is copied or mounted directly (see mode flag below).
- store: str; either of 's3', 'gcs', 'r2', 'ibm'
+ store: str; either of 's3', 'gcs', 'azure', 'r2', 'ibm'
  If you wish to force sky.Storage to be backed by a specific cloud object
  storage, you can specify it here. If not specified, SkyPilot chooses the
  appropriate object storage based on the source path and task's cloud provider.
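
Reading aid for the updated `source` formats: the five remote URI shapes listed above map naturally onto the five `store` values. The sketch below illustrates that mapping with the standard library only. `infer_store` and the scheme table are hypothetical names for illustration; SkyPilot's actual selection logic also considers the task's cloud provider, which is not modeled here.

```python
from urllib.parse import urlparse

# Assumed mapping from URI scheme to the `store` names listed in the docs.
SCHEME_TO_STORE = {"s3": "s3", "gs": "gcs", "r2": "r2", "cos": "ibm"}
AZURE_BLOB_SUFFIX = ".blob.core.windows.net"


def infer_store(source):
    """Return the store implied by a source URI, or None for a local path."""
    parsed = urlparse(source)
    if parsed.scheme in SCHEME_TO_STORE:
        return SCHEME_TO_STORE[parsed.scheme]
    # Azure sources are plain HTTPS container endpoints, so they are
    # recognized by host name rather than by scheme.
    host = parsed.hostname or ""
    if parsed.scheme == "https" and host.endswith(AZURE_BLOB_SUFFIX):
        return "azure"
    return None


if __name__ == "__main__":
    for src in ("s3://my-bucket/",
                "https://acct.blob.core.windows.net/container",
                "cos://us-east/my-bucket",
                "~/dataset"):
        print(src, "->", infer_store(src))
```

Matching Azure by host suffix rather than by scheme is the design point worth noting: unlike the other stores, the Azure format introduced in this commit has no dedicated scheme.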
4 changes: 2 additions & 2 deletions docs/source/reference/yaml-spec.rst
@@ -300,8 +300,8 @@ Available fields:
  # Mounts the bucket at /datasets-storage on every node of the cluster.
  /datasets-storage:
  name: sky-dataset # Name of storage, optional when source is bucket URI
- source: /local/path/datasets # Source path, can be local or s3/gcs URL. Optional, do not specify to create an empty bucket.
- store: s3 # Could be either 's3', 'gcs' or 'r2'; default: None. Optional.
+ source: /local/path/datasets # Source path, can be local or bucket URI. Optional, do not specify to create an empty bucket.
+ store: s3 # Could be either 's3', 'gcs', 'azure', 'r2', or 'ibm'; default: None. Optional.
  persistent: True # Defaults to True; can be set to false to delete bucket after cluster is downed. Optional.
  mode: MOUNT # Either MOUNT or COPY. Defaults to MOUNT. Optional.