Open
Description
We're looking to migrate from a custom Step Functions orchestration that spins up SageMaker training jobs to Metaflow, since that leads to code that is much easier to read and change.
We'd like to keep SageMaker training for some of the features it provides (spot interruption handling, artifact handling, metrics, etc.); migrating those would be too costly for now.
AWS now supports triggering SageMaker jobs from an AWS Batch queue. However, these jobs must be submitted through a separate `SubmitServiceJob` API (rather than the `SubmitJob` API used for Fargate/EC2 jobs). The payload that API expects is essentially the usual SageMaker `CreateTrainingJob` input.
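For illustration, here is a rough sketch of what such a submission could look like with boto3, not a proposal for how Metaflow should implement it. It assumes a boto3 version recent enough to expose `submit_service_job` on the Batch client. The job name, queue name, image URI, role ARN, and S3 path are all hypothetical placeholders, and the helper names `build_service_job_request` and `submit` are made up for this example.

```python
import json


def build_service_job_request(job_name, job_queue, training_job_spec):
    """Build keyword arguments for the AWS Batch SubmitServiceJob call."""
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "serviceJobType": "SAGEMAKER_TRAINING",
        # The CreateTrainingJob-style input is passed as a JSON string,
        # not as a nested dict.
        "serviceRequestPayload": json.dumps(training_job_spec),
    }


def submit(request, region_name=None):
    """Send the request to AWS Batch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("batch", region_name=region_name)
    return client.submit_service_job(**request)


# Hypothetical placeholder values; replace with real resources.
training_job_spec = {
    "TrainingJobName": "my-training-job",
    "AlgorithmSpecification": {
        "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/train:latest",
        "TrainingInputMode": "File",
    },
    "RoleArn": "arn:aws:iam::123456789012:role/MySageMakerRole",
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/output/"},
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 50,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}

request = build_service_job_request(
    "my-training-job", "my-sagemaker-queue", training_job_spec
)
# response = submit(request)  # uncomment to actually submit the job
```

Since the payload mirrors the `CreateTrainingJob` input, an existing training job definition could in principle be reused mostly as-is, with only the submission call changing.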