To launch an example:
# Recommended: YAML + CLI
sky launch examples/<name>.yaml
# Advanced: programmatic API
python examples/<name>.py
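Which of the two commands applies depends only on the example's file extension. A minimal sketch of that rule follows; `launch_command` is a hypothetical helper, not part of SkyPilot, and it only builds the command lines shown above without running them.

```python
from pathlib import PurePosixPath


def launch_command(example: str, examples_dir: str = "examples") -> list[str]:
    """Return the argv used to launch an example (hypothetical helper).

    YAML examples go through the `sky launch` CLI; Python examples are run
    directly as scripts, mirroring the two commands above.
    """
    path = PurePosixPath(examples_dir) / example
    if path.suffix in (".yaml", ".yml"):
        return ["sky", "launch", str(path)]
    if path.suffix == ".py":
        return ["python", str(path)]
    raise ValueError(f"unsupported example type: {example}")
```

For instance, `launch_command("resnet_app.py")` yields the argv for `python examples/resnet_app.py`, which could then be passed to `subprocess.run`.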
Machine learning examples:
- huggingface_glue_imdb_app.yaml: Use Huggingface Transformers to finetune a pretrained BERT model.
- resnet_distributed_torch.yaml: Run Distributed PyTorch (DDP) training of ResNet50 on 2 nodes.
- detectron2_app.yaml: Run Detectron2 on a V100 GPU.
- TPU examples
  - tpu/tpu_app.yaml: Train on a TPU node on GCP. Finetune BERT on Amazon Reviews for sentiment analysis.
  - tpu/tpuvm_mnist.yaml: Train on a TPU VM on GCP. Train on MNIST in Flax (based on JAX).
- resnet_app.py: ResNet50 training on GPUs, adapted from tensorflow/tpu. The training data is currently a public "fake_imagenet" dataset (gs://cloud-tpu-test-datasets/fake_imagenet, 70GB).
- resnet_distributed_tf_app.py: Distributed training variant of the above, via TensorFlow Distributed.
- huggingface_glue_imdb_grid_search_app.py: Grid search: run many trials concurrently on the same VM.
...and many more.
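The grid-search idea can be illustrated without any cloud dependency: expand a search space into one configuration per trial, then turn each configuration into a command line. The sketch below is not taken from the grid-search example itself; the script name `train.py`, the flag names, and the values are all assumptions chosen for illustration.

```python
import itertools


def grid(space: dict[str, list]) -> list[dict]:
    """Expand a search space into one dict per trial (Cartesian product)."""
    keys = list(space)
    combos = itertools.product(*(space[k] for k in keys))
    return [dict(zip(keys, values)) for values in combos]


def trial_command(script: str, params: dict) -> str:
    """Format one trial as a shell command (hypothetical flag convention)."""
    flags = " ".join(f"--{name}={value}" for name, value in params.items())
    return f"python {script} {flags}"


# Hypothetical search space: 2 learning rates x 2 batch sizes = 4 trials.
space = {"lr": [1e-5, 3e-5], "batch_size": [16, 32]}
commands = [trial_command("train.py", p) for p in grid(space)]
```

Each resulting command is one trial; how the trials are packed onto a VM is left to the launcher.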
General examples:
- detectron2_docker.yaml: Using Docker to run Detectron2 on GPUs.
- using_file_mounts.yaml: Using file_mounts to upload local/cloud paths to a cluster.
- multi_hostname.yaml: Run a command on multiple nodes.
- env_check.yaml: Using environment variables in the run commands.
- multi_echo.py: Launch and schedule hundreds of bash commands on the clouds, with configurable resources. Similar to grid search.
...and many more.
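As a purely local analogy to fanning out many bash commands, the sketch below runs a batch of shell commands with a cap on how many are in flight at once. It uses only the Python standard library and does not call SkyPilot; `run_all` and `max_parallel` are hypothetical names, and the cloud scheduling and resource configuration that multi_echo.py demonstrates are not reproduced here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_all(commands: list[str], max_parallel: int = 4) -> list[str]:
    """Run shell commands, at most `max_parallel` at a time.

    Returns each command's stripped stdout, in the same order as the input.
    """
    def run_one(cmd: str) -> str:
        result = subprocess.run(
            cmd, shell=True, check=True, capture_output=True, text=True
        )
        return result.stdout.strip()

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map preserves input order even if commands finish out of order.
        return list(pool.map(run_one, commands))


outputs = run_all([f"echo {i}" for i in range(8)])
```

Raising `max_parallel` trades local CPU and memory pressure for throughput, which is the same trade-off a scheduler makes when packing jobs onto shared resources.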