Commit

Remove experimental hot spare policy
Differential Revision: D61746435

Pull Request resolved: #948
manav-a authored Aug 26, 2024
1 parent bfce4bd commit 69129eb
Showing 2 changed files with 2 additions and 10 deletions.
9 changes: 1 addition & 8 deletions torchx/specs/api.py
@@ -237,17 +237,12 @@ class RetryPolicy(str, Enum):
application to deal with failed replica departures and
replacement replica admittance.
2. APPLICATION: Restarts the entire application.
-3. HOT_SPARE: Restarts the replicas for a role as long as quorum (min_replicas)
-is not violated using extra hosts as spares. It does not really support
-elasticity and just uses the delta between num_replicas and min_replicas
-as spares (EXPERIMENTAL).
-4. ROLE: Restarts the role when any error occurs in that role. This does not
+3. ROLE: Restarts the role when any error occurs in that role. This does not
restart the whole job.
"""

REPLICA = "REPLICA"
APPLICATION = "APPLICATION"
-HOT_SPARE = "HOT_SPARE"
ROLE = "ROLE"
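
For callers that still pass "HOT_SPARE" as a string, the effect of this hunk can be sketched as follows. This is a minimal illustration, not code from the commit: the enum below is a local stand-in that mirrors the three members left after the change (it is not imported from torchx), and `parse_retry_policy` is a hypothetical helper name.

```python
from enum import Enum


class RetryPolicy(str, Enum):
    """Local stand-in mirroring the members that remain after this commit."""

    REPLICA = "REPLICA"
    APPLICATION = "APPLICATION"
    ROLE = "ROLE"


def parse_retry_policy(value: str) -> RetryPolicy:
    """Hypothetical helper: resolve a policy name, listing valid names on failure."""
    try:
        # Enum lookup by value raises ValueError for names that no longer exist.
        return RetryPolicy(value)
    except ValueError:
        valid = ", ".join(p.value for p in RetryPolicy)
        raise ValueError(
            f"unknown retry policy {value!r}; valid values: {valid}"
        ) from None
```

With an enum shaped like this, `parse_retry_policy("ROLE")` resolves normally, while `parse_retry_policy("HOT_SPARE")` raises `ValueError`, so any stored configuration that names the removed policy would need to be updated.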


@@ -347,8 +342,6 @@ class Role:
and num_replicas depending on the cluster resources and
policies. If the scheduler doesn't support auto scaling this
field is ignored and the job size will be num_replicas.
-EXPERIMENTAL: For HOT_SPARE restart policy this field is used to
-indicate the quorum required for the job to run.
max_retries: max number of retries before giving up
retry_policy: retry behavior upon replica failures
resource: Resource requirement for the role. The role should be scheduled
3 changes: 1 addition & 2 deletions torchx/specs/test/api_test.py
@@ -273,7 +273,6 @@ def test_retry_policies(self) -> None:
RetryPolicy.APPLICATION,
RetryPolicy.REPLICA,
RetryPolicy.ROLE,
-RetryPolicy.HOT_SPARE,
},
)

@@ -494,7 +493,7 @@ def test_resolve_from_str(self) -> None:
"foo=bar,test_key=test_value,default_time=42,enable=True,disable=False,complex_list=v1;v2;v3"
)
),
-),
+)

def test_config_from_json_repr(self) -> None:
opts = runopts()
