[Docs][SkyServe] Autoscaling doc for SkyServe #2989
Co-authored-by: Ziming Mao <[email protected]>
LGTM. Left a few nits
Co-authored-by: Ziming Mao <[email protected]>
Looks great! Made some changes, PTAL @cblmemo @MaoZiming.
docs/source/serving/autoscaling.rst
# ...

The service will scale down all replicas when there is no traffic to the system and will save costs on idle replicas. In this case, the scale up will be faster when the system has no replicas: it will **scale up immediately if any traffic detected**.
What does "In this case, the scale up will be faster when the system has no replicas:" mean?
It means that when the service has no replicas, any user traffic will trigger an immediate scale-up.
i.e., the upscale delay is ignored?
Yes. It will immediately change Ntar.
How about rephrasing: When upscaling from zero, the upscale delay will be ignored in order to bring up the service faster.
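To make the proposed wording concrete, here is a minimal Python sketch of the behavior described in this thread: an upscale normally waits out the upscale delay, but the delay is skipped when the service currently has zero replicas. All names here (`AutoscalerState`, `next_target_replicas`, the field names) are hypothetical and chosen for illustration; this is not SkyServe's actual implementation, and scale-down logic is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class AutoscalerState:
    """Hypothetical autoscaler state, for illustration only."""
    current_replicas: int
    upscale_delay_seconds: float
    # How long demand has continuously exceeded current capacity.
    overload_duration_seconds: float = 0.0


def next_target_replicas(state: AutoscalerState, desired_replicas: int) -> int:
    """Return the target replica count (Ntar in the thread above)."""
    if desired_replicas <= state.current_replicas:
        # Scale-down (and its own delay) is out of scope for this sketch.
        return state.current_replicas
    if state.current_replicas == 0:
        # Upscaling from zero: ignore the upscale delay so the
        # service comes up as soon as any traffic is detected.
        return desired_replicas
    if state.overload_duration_seconds >= state.upscale_delay_seconds:
        # Demand has been high for the full delay window: scale up.
        return desired_replicas
    # Otherwise keep the current target until the delay has elapsed.
    return state.current_replicas
```

Under these assumptions, a service at zero replicas jumps straight to the desired count, while a service with running replicas holds its target until the overload has lasted at least the configured delay.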
@concretevitamin Those changes look great to me! Added a line on how to adjust scaling delays.
Thanks, LGTM.
Co-authored-by: Zongheng Yang <[email protected]>
Blocked by #2995.
Tested (run the relevant ones):
bash format.sh
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
bash tests/backward_comaptibility_tests.sh