From fc04be7eea74be18b7d26a853a3ca41059b46c9a Mon Sep 17 00:00:00 2001
From: Tian Xia
Date: Thu, 18 Jan 2024 17:30:07 +0800
Subject: [PATCH] Apply suggestions from code review

Co-authored-by: Ziming Mao
---
 docs/source/serving/autoscaling.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/serving/autoscaling.rst b/docs/source/serving/autoscaling.rst
index 6765010efe5..2539f5caeb5 100644
--- a/docs/source/serving/autoscaling.rst
+++ b/docs/source/serving/autoscaling.rst
@@ -46,7 +46,7 @@ In this example, SkyServe will launch 2 replicas of your service and scale up to
 
 .. tip::
 
-    :code:`target_qps_per_replica` could be any positive floating point number. If process one request takes two seconds in one replica, using :code:`target_qps_per_replica=0.5`.
+    :code:`target_qps_per_replica` could be any positive floating point number. If processing one request takes two seconds in one replica, we can use :code:`target_qps_per_replica=0.5`.
 
 Scaling Delay
 -------------
@@ -70,7 +70,7 @@ SkyServe will not scale up or down immediately. Instead, SkyServe will wait for
 Scale Down to 0
 ===============
 
-If your service has a consecutive time period with no traffic, consider using :code:`min_replicas=0`:
+If your service might experience long period of time with no traffic, consider using :code:`min_replicas=0`:
 
 .. code-block:: yaml
     :emphasize-lines: 4