Cloud native credentials storage #1280
Comments
I have no time for now, and it will likely take weeks before I come up with something intelligible, but this is a topic on which I plan to write a "Universal Kedro Deployment" issue. I think there is some overlap with #770, but credentials do have a lot of specificities. In short, my idea is that:
Thanks for the issue @nicpayne713 and your thoughts @Galileo-Galilei. I don't know anything about these credentials management systems, but what you both said here all sounds very sensible to me. @nicpayne713 Just to understand better, could you outline in a bit more detail how you're solving this now? As far as I know there are two possible ways of injecting environment variables into credentials in 0.17.x:
Although both of the above use environment variables, there's no reason you can't inject arbitrary Python code into both these places to get credentials from elsewhere. So for AWS Secrets, where you don't have access to credentials through environment variables (is that right?), you instead need to inject some code like this. Please let me know if this is correct or if I've missed something here! My immediate concern here is actually that, as it stands in 0.18, method 1 above would no longer be possible since the move of the
... which actually doesn't seem too hacky to me. Does this seem like a reasonable solution to you for the immediate future, or is the whole idea of credentials as a custom dictionary not compatible with AWS Secrets Manager for some reason? @Galileo-Galilei when you do write your thoughts up, I'll be very interested in reading why a new
@AntonyMilneQB, I'm not an AWS Secrets Manager expert, but in my experience you can store any JSON as a secret, and therefore simple key/value pairs too. I don't know how injecting env variables would help, unless each env variable were a DataSet key or something similar that a utility could use to query Secrets Manager and get back what it needs. As for the credentials needed to get into Secrets Manager, the boto SDK checks several places, in an order I don't know, but it looks at env variables for Storing credentials as a custom dictionary is definitely fine with Secrets Manager since a secret is just JSON. My quick solution was to implement basically that exact code snippet via what @datajoely suggested in my older issue #930. I wrote a small library with a function At first glance I like what @Galileo-Galilei and I'd like to help here however I can!
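For readers following along, the idea discussed above (a secret stored as JSON in AWS Secrets Manager and used as the credentials dictionary) could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the snippet or library referred to in the thread: the function name `load_credentials_from_secret` and the secret name are hypothetical, and the only boto3 call assumed is the Secrets Manager client's `get_secret_value(SecretId=...)`, whose response carries the payload under `"SecretString"`.

```python
import json
from typing import Any, Dict, Optional


def load_credentials_from_secret(
    secret_id: str, client: Optional[Any] = None
) -> Dict[str, Any]:
    """Fetch a JSON secret and return it as a credentials-style dictionary.

    `secret_id` is a hypothetical secret name. `client` is any object that
    exposes `get_secret_value(SecretId=...)`; when omitted, a boto3 Secrets
    Manager client is created, and boto3 then resolves its own AWS
    credentials (environment variables, shared config files, and so on).
    """
    if client is None:
        # Imported lazily so the function can be exercised without boto3.
        import boto3

        client = boto3.client("secretsmanager")

    response = client.get_secret_value(SecretId=secret_id)
    credentials = json.loads(response["SecretString"])

    # A credentials mapping is expected, e.g. {"dev_s3": {"key": ..., "secret": ...}}.
    if not isinstance(credentials, dict):
        raise ValueError(f"Secret {secret_id!r} does not contain a JSON object")
    return credentials
```

The returned dictionary has the same shape as the contents of a credentials file, so it could be passed wherever the project currently passes its credentials dictionary. Accepting the client as a parameter keeps the function easy to test with a stub.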
Closing this because we're starting to think about how to make it a reality now! Just to keep all the discussion in one place, let's continue in #1646.
Description
It is frustrating to try to incorporate cloud-native secure credentials storage mechanisms, such as AWS Secrets Manager, with Kedro because of the hard reliance on a credentials.yml file. I have worked around this by hijacking the config registration and forcing a function call to get the secrets, but it's definitely a hack.
Context
I think this change is important from a security standpoint but also for collaboration. I'm on a team of several people and we all share credentials for common data sources. It's hard enough to manage distinct credential files per project (in my space we have dozens of Kedro projects and growing, and I have raised an issue about a global credentials file here once before), but it's even more difficult to adhere to some security standards (like securing credentials via an enterprise-approved storage solution) when the only option is a plain text file in a repo.
Possible Implementation
I think there could be something in the config registration that might make sense, along with an added key for DataSets that already support a credentials key.