-
Notifications
You must be signed in to change notification settings - Fork 671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core feature] Upgrade Sagemaker integration to use sagemaker-controller #2721
Comments
Thank you for opening your first issue here! 🛠 |
@hajapy thank you for the great find and I think this would be the right path forward. Do you have a sense for the priority of this? how much of a blocker is this? |
It seems like it would open doors in terms of what type of workflows users could author that leverage more of Sagemaker while still using flyte to orchestrate and run parts of the workflows elsewhere. They could more easily create/update endpoints, ensure training jobs are tracked as parts of Experiments, etc. Have a look at all the CRDs it supports: https://github.com/aws-controllers-k8s/sagemaker-controller/tree/main/config/crd/bases |
@hajapy this indeed looks really good. How should we proceed. The community prefers if there are volunteer organizations that will use the plugin so that we can ensure that we are developing for a user and it will remain maintained. Also this is incase you want to develop a backend plugin. Remember it is always possible to write this simply in python and make it a flytekit only plugin, but I do agree a backend plugin for sagemaker makes the most sense. |
By the way, AWS has made the deprecation of the original sagemaker-operator official: End of technical support for sagemaker-operator is slated for Feb 15, 2023 |
Is there any news on this? SageMaker operator for k8s has been deprecated for almost half a year now |
We've officially removed the deprecated sagemaker integration, in its place we're working on a Sagemaker agent. |
Motivation: Why do you think this is important?
Flyte's Sagemaker integration uses the sagemaker-operator-for-k8s which has not seen much activity and feature enhancements since May 25, 2021. The newer sagemaker-controller is more feature rich and appears more actively maintained. Recent comments indicate this is the intended path forward. Upgrading to this newer controller would enable the sagemaker plugin to perform more tasks and provide a more complete API for the existing tasks (eg. associating a TrainingJob with a Sagemaker Experiment).
Goal: What should the final outcome look like, ideally?
Ideally, for an end user, existing workflows that leverage flytekitplugins.awssagemaker should continue to work. For administrators it should be possible to switch from using the existing operator to the new controller. New functionality would require the new controller to work, so it should be somewhat clear as a workflow author if the workflow they are creating is supported by their flyte installation in AWS.
Describe alternatives you've considered
It might be possible to write a more low-level plugin that doesn't rely on an AWS-provided k8s controller/operator, but instead leverages the go aws sdk directly. As that would be a more extensive rewrite and potentially harder to keep up to date, it seems preferable to pick a solution that is less disruptive to the current state. The new sagemaker-controller repo is heavily automated in terms of keeping it up to date with AWS api changes, as part of the AWS Controllers for Kubernetes (ACK) project.
Propose: Link/Inline OR Additional context
No response
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?
The text was updated successfully, but these errors were encountered: