Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core feature] Upgrade Sagemaker integration to use sagemaker-controller #2721

Closed
2 tasks done
hajapy opened this issue Jul 26, 2022 · 8 comments
Closed
2 tasks done
Labels
enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers

Comments

@hajapy
Copy link

hajapy commented Jul 26, 2022

Motivation: Why do you think this is important?

Flyte's Sagemaker integration uses the sagemaker-operator-for-k8s which has not seen much activity and feature enhancements since May 25, 2021. The newer sagemaker-controller is more feature rich and appears more actively maintained. Recent comments indicate this is the intended path forward. Upgrading to this newer controller would enable the sagemaker plugin to perform more tasks and provide a more complete API for the existing tasks (eg. associating a TrainingJob with a Sagemaker Experiment).

Goal: What should the final outcome look like, ideally?

Ideally, for an end user, existing workflows that leverage flytekitplugins.awssagemaker should continue to work. For administrators it should be possible to switch from using the existing operator to the new controller. New functionality would require the new controller to work, so it should be somewhat clear as a workflow author if the workflow they are creating is supported by their flyte installation in AWS.

Describe alternatives you've considered

It might be possible to write a more low-level plugin that doesn't rely on an AWS-provided k8s controller/operator, but instead leverages the go aws sdk directly. As that would be a more extensive rewrite and potentially harder to keep up to date, it seems preferable to pick a solution that is less disruptive to the current state. The new sagemaker-controller repo is heavily automated in terms of keeping it up to date with AWS api changes, as part of the AWS Controllers for Kubernetes (ACK) project.

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@hajapy hajapy added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Jul 26, 2022
@welcome
Copy link

welcome bot commented Jul 26, 2022

Thank you for opening your first issue here! 🛠

@kumare3
Copy link
Contributor

kumare3 commented Jul 26, 2022

@hajapy thank you for the great find and I think this would be the right path forward. Do you have a sense for the priority of this? how much of a blocker is this?

@hajapy
Copy link
Author

hajapy commented Jul 26, 2022

It seems like it would open doors in terms of what type of workflows users could author that leverage more of Sagemaker while still using flyte to orchestrate and run parts of the workflows elsewhere. They could more easily create/update endpoints, ensure training jobs are tracked as parts of Experiments, etc.

Have a look at all the CRDs it supports: https://github.com/aws-controllers-k8s/sagemaker-controller/tree/main/config/crd/bases

@kumare3
Copy link
Contributor

kumare3 commented Aug 1, 2022

@hajapy this indeed looks really good. How should we proceed. The community prefers if there are volunteer organizations that will use the plugin so that we can ensure that we are developing for a user and it will remain maintained. Also this is incase you want to develop a backend plugin. Remember it is always possible to write this simply in python and make it a flytekit only plugin, but I do agree a backend plugin for sagemaker makes the most sense.

@hajapy
Copy link
Author

hajapy commented Jan 17, 2023

By the way, AWS has made the deprecation of the original sagemaker-operator official:
https://docs.aws.amazon.com/sagemaker/latest/dg/kubernetes-sagemaker-operators-eos-announcement.html

End of technical support for sagemaker-operator is slated for Feb 15, 2023

@kumare3
Copy link
Contributor

kumare3 commented Jan 23, 2023

@hajapy thank you for the heads up. I think the new RFC might be a better approach. Cc @pingsutw

@j-adamczyk
Copy link

Is there any news on this? SageMaker operator for k8s has been deprecated for almost half a year now

@eapolinario
Copy link
Contributor

We've officially removed the deprecated sagemaker integration, in its place we're working on a Sagemaker agent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers
Projects
None yet
Development

No branches or pull requests

4 participants