[Docs] Add docs for run many jobs (#3847)
* Add docs for running N jobs

* Fix language

* Fix job queue

* Add link to managed jobs

* Update docs/source/running-jobs/many-jobs.rst

Co-authored-by: Zongheng Yang <[email protected]>

* Add script for generating config files

* Fix comments

* fix title

* fix title

* fix

* reduce image size

* restructure

* rename

* adopt comments

* Add benefits

* Update docs/source/running-jobs/many-jobs.rst

Co-authored-by: Zongheng Yang <[email protected]>

* update

* rename

* fix

* Update docs/source/running-jobs/many-jobs.rst

Co-authored-by: Zongheng Yang <[email protected]>

* Minor fix for comments

---------

Co-authored-by: Zongheng Yang <[email protected]>
Michaelvll and concretevitamin authored Aug 29, 2024
1 parent bd40b93 commit 95b52c0
Showing 8 changed files with 363 additions and 15 deletions.
2 changes: 1 addition & 1 deletion docs/source/_static/custom.js
@@ -27,8 +27,8 @@ document.addEventListener('DOMContentLoaded', () => {
const newItems = [
{ selector: '.caption-text', text: 'SkyServe: Model Serving' },
{ selector: '.toctree-l1 > a', text: 'Managed Jobs' },
{ selector: '.toctree-l1 > a', text: 'Running on Kubernetes' },
{ selector: '.toctree-l1 > a', text: 'Llama-3.1 (Meta)' },
{ selector: '.toctree-l1 > a', text: 'Many Parallel Jobs' },
];
newItems.forEach(({ selector, text }) => {
document.querySelectorAll(selector).forEach((el) => {
8 changes: 8 additions & 0 deletions docs/source/developers/index.rst
@@ -0,0 +1,8 @@
Developer Guides
=================

.. toctree::
   :maxdepth: 1

   ../developers/CONTRIBUTING
   Guide: Adding a New Cloud <https://docs.google.com/document/d/1oWox3qb3Kz3wXXSGg9ZJWwijoa99a3PIQUHBR8UgEGs/edit?usp=sharing>
12 changes: 3 additions & 9 deletions docs/source/docs/index.rst
@@ -129,8 +129,8 @@ Read the research:

../getting-started/installation
../getting-started/quickstart
../getting-started/tutorial
../examples/interactive-development
../getting-started/tutorial


.. toctree::
@@ -143,6 +143,7 @@ Read the research:
../examples/auto-failover
../reference/kubernetes/index
../running-jobs/distributed-jobs
+../running-jobs/many-jobs

.. toctree::
:hidden:
@@ -184,14 +185,6 @@ Read the research:
SkyPilot vs. Other Systems <../reference/comparison>


-.. toctree::
-:hidden:
-:maxdepth: 1
-:caption: Developer Guides
-
-../developers/CONTRIBUTING
-Guide: Adding a New Cloud <https://docs.google.com/document/d/1oWox3qb3Kz3wXXSGg9ZJWwijoa99a3PIQUHBR8UgEGs/edit?usp=sharing>
-
.. toctree::
:hidden:
:maxdepth: 1
@@ -210,4 +203,5 @@ Read the research:
../reference/cli
../reference/api
../reference/config
+../developers/index

2 changes: 1 addition & 1 deletion docs/source/getting-started/quickstart.rst
@@ -219,7 +219,7 @@ Congratulations! In this quickstart, you have launched a cluster, run a task, a

Next steps:

-- Adapt :ref:`Tutorial: DNN Training <dnn-training>` to start running your own project on SkyPilot!
+- Adapt :ref:`Tutorial: AI Training <ai-training>` to start running your own project on SkyPilot!
- See the :ref:`Task YAML reference <yaml-spec>`, :ref:`CLI reference <cli>`, and `more examples <https://github.com/skypilot-org/skypilot/tree/master/examples>`_
- To learn more, try out `SkyPilot Tutorials <https://github.com/skypilot-org/skypilot-tutorial>`_ in Jupyter notebooks

4 changes: 2 additions & 2 deletions docs/source/getting-started/tutorial.rst
@@ -1,6 +1,6 @@
-.. _dnn-training:
+.. _ai-training:

-Tutorial: DNN Training
+Tutorial: AI Training
======================
This example uses SkyPilot to train a Transformer-based language model from HuggingFace.

2 changes: 1 addition & 1 deletion docs/source/reference/job-queue.rst
@@ -160,7 +160,7 @@ SkyPilot's scheduler serves two goals:
2. **Minimizing resource idleness**: If a resource is idle, SkyPilot will schedule a
queued job that can utilize that resource.

-We illustrate the scheduling behavior by revisiting :ref:`Tutorial: DNN Training <dnn-training>`.
+We illustrate the scheduling behavior by revisiting :ref:`Tutorial: AI Training <ai-training>`.
In that tutorial, we have a task YAML that specifies these resource requirements:

.. code-block:: yaml
2 changes: 1 addition & 1 deletion docs/source/running-jobs/distributed-jobs.rst
@@ -1,6 +1,6 @@
.. _dist-jobs:

-Distributed Jobs on Many Nodes
+Distributed Multi-Node Jobs
================================================

SkyPilot supports multi-node cluster