[Docs] Add docs for run many jobs #3847

Merged Aug 29, 2024 (49 commits)
49 commits
50ad6ef
Add docs for running N jobs
Michaelvll Aug 20, 2024
4254044
Fix language
Michaelvll Aug 20, 2024
19f294f
Fix job queue
Michaelvll Aug 20, 2024
64446e3
Add link to managed jobs
Michaelvll Aug 20, 2024
3e80dab
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
c21084a
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
2eeb7d3
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
772dbc3
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
7dace8c
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
8059c83
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
d580119
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
013d436
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
5e6b644
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
0144b96
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
23c4674
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
3c5b8ac
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
71ab53f
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
d01d5ad
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
ab5110e
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
2de0066
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
ee7218e
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
b096d90
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 28, 2024
4a70534
Add script for generating config files
Michaelvll Aug 28, 2024
4f90f24
Fix comments
Michaelvll Aug 28, 2024
1af226d
fix title
Michaelvll Aug 28, 2024
2837cad
fix title
Michaelvll Aug 29, 2024
e31d814
fix
Michaelvll Aug 29, 2024
86f7292
reduce image size
Michaelvll Aug 29, 2024
acb5690
restructure
Michaelvll Aug 29, 2024
a720ab2
rename
Michaelvll Aug 29, 2024
30b558f
adopt comments
Michaelvll Aug 29, 2024
512627c
Add benefits
Michaelvll Aug 29, 2024
eb50f26
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
f8f3eb7
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
48d6f80
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
d16bab6
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
e13f72e
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
ce8730b
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
8ba7a88
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
5afc752
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
516fdb1
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
39ca239
update
Michaelvll Aug 29, 2024
089ac31
Merge branch 'docs-scale-up-jobs' of github.com:skypilot-org/skypilot…
Michaelvll Aug 29, 2024
2440d90
rename
Michaelvll Aug 29, 2024
9b36c6b
fix
Michaelvll Aug 29, 2024
8651fac
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
0f02e4b
Update docs/source/running-jobs/many-jobs.rst
Michaelvll Aug 29, 2024
83e2b5c
Minor fix for comments
Michaelvll Aug 29, 2024
c071f68
Merge branch 'docs-scale-up-jobs' of github.com:skypilot-org/skypilot…
Michaelvll Aug 29, 2024
2 changes: 1 addition & 1 deletion docs/source/_static/custom.js
@@ -27,8 +27,8 @@ document.addEventListener('DOMContentLoaded', () => {
const newItems = [
{ selector: '.caption-text', text: 'SkyServe: Model Serving' },
{ selector: '.toctree-l1 > a', text: 'Managed Jobs' },
{ selector: '.toctree-l1 > a', text: 'Running on Kubernetes' },
{ selector: '.toctree-l1 > a', text: 'Llama-3.1 (Meta)' },
{ selector: '.toctree-l1 > a', text: 'Many Parallel Jobs' },
];
newItems.forEach(({ selector, text }) => {
document.querySelectorAll(selector).forEach((el) => {
8 changes: 8 additions & 0 deletions docs/source/developers/index.rst
@@ -0,0 +1,8 @@
Developer Guides
=================

.. toctree::
:maxdepth: 1

../developers/CONTRIBUTING
Guide: Adding a New Cloud <https://docs.google.com/document/d/1oWox3qb3Kz3wXXSGg9ZJWwijoa99a3PIQUHBR8UgEGs/edit?usp=sharing>
12 changes: 3 additions & 9 deletions docs/source/docs/index.rst
@@ -129,8 +129,8 @@ Read the research:

../getting-started/installation
../getting-started/quickstart
../getting-started/tutorial
../examples/interactive-development
../getting-started/tutorial


.. toctree::
@@ -143,6 +143,7 @@ Read the research:
../examples/auto-failover
../reference/kubernetes/index
../running-jobs/distributed-jobs
+../running-jobs/many-jobs

.. toctree::
:hidden:
@@ -184,14 +185,6 @@ Read the research:
SkyPilot vs. Other Systems <../reference/comparison>


-.. toctree::
-:hidden:
-:maxdepth: 1
-:caption: Developer Guides
-
-../developers/CONTRIBUTING
-Guide: Adding a New Cloud <https://docs.google.com/document/d/1oWox3qb3Kz3wXXSGg9ZJWwijoa99a3PIQUHBR8UgEGs/edit?usp=sharing>
-
.. toctree::
:hidden:
:maxdepth: 1
@@ -210,4 +203,5 @@ Read the research:
../reference/cli
../reference/api
../reference/config
+../developers/index

2 changes: 1 addition & 1 deletion docs/source/getting-started/quickstart.rst
@@ -219,7 +219,7 @@ Congratulations! In this quickstart, you have launched a cluster, run a task, a

Next steps:

- Adapt :ref:`Tutorial: DNN Training <dnn-training>` to start running your own project on SkyPilot!
- Adapt :ref:`Tutorial: AI Training <ai-training>` to start running your own project on SkyPilot!
- See the :ref:`Task YAML reference <yaml-spec>`, :ref:`CLI reference <cli>`, and `more examples <https://github.com/skypilot-org/skypilot/tree/master/examples>`_
- To learn more, try out `SkyPilot Tutorials <https://github.com/skypilot-org/skypilot-tutorial>`_ in Jupyter notebooks

4 changes: 2 additions & 2 deletions docs/source/getting-started/tutorial.rst
@@ -1,6 +1,6 @@
-.. _dnn-training:
+.. _ai-training:

-Tutorial: DNN Training
+Tutorial: AI Training
======================
This example uses SkyPilot to train a Transformer-based language model from HuggingFace.

2 changes: 1 addition & 1 deletion docs/source/reference/job-queue.rst
@@ -160,7 +160,7 @@ SkyPilot's scheduler serves two goals:
2. **Minimizing resource idleness**: If a resource is idle, SkyPilot will schedule a
queued job that can utilize that resource.

-We illustrate the scheduling behavior by revisiting :ref:`Tutorial: DNN Training <dnn-training>`.
+We illustrate the scheduling behavior by revisiting :ref:`Tutorial: AI Training <ai-training>`.
In that tutorial, we have a task YAML that specifies these resource requirements:

.. code-block:: yaml
2 changes: 1 addition & 1 deletion docs/source/running-jobs/distributed-jobs.rst
@@ -1,6 +1,6 @@
.. _dist-jobs:

-Distributed Jobs on Many Nodes
+Distributed Multi-Node Jobs
================================================

SkyPilot supports multi-node cluster