Unreasonably big memory usage with DeferredRefreshableCredentials #3366

Open
1 task
Veetaha opened this issue Feb 2, 2025 · 3 comments
Assignees
Labels
bug This issue is a confirmed bug. investigating This issue is being investigated and/or work is in progress to resolve the issue. p2 This is a standard priority issue

Comments


Veetaha commented Feb 2, 2025

Describe the bug

When AssumeRole credentials are configured programmatically via DeferredRefreshableCredentials, the process uses an unreasonable amount of memory. The code in the reproduction steps below demonstrates this.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

There should be no memory leaks.

Current Behavior

Memory is leaked and not cleaned up until the Session object is unreferenced.

Reproduction Steps

Paste this Python code into a file and replace {YOUR_ACCOUNT_ID} with your AWS account ID and {ROLE_NAME} with the name of the role to assume.

Note that in my real-world case this account ID is dynamic, as my script traverses all AWS accounts in an organization (see the additional context at the bottom for details).

from botocore.session import Session
import botocore.session
import botocore.credentials

sts = Session().create_client("sts")

params = {
    "RoleArn": "arn:aws:iam::{YOUR_ACCOUNT_ID}:role/{ROLE_NAME}",
}
refresher = botocore.credentials.create_assume_role_refresher(sts, params)

sessions = []

for i in range(50):
    print(i)

    creds = botocore.credentials.DeferredRefreshableCredentials(
        method="assume-role",
        refresh_using=refresher,
    )

    sess = Session()
    sess._credentials = creds

    sess.create_client("ec2").describe_regions()
    sessions.append(sess)

Run this script and you'll find that the memory usage grows rapidly. Once the script accumulates ~30 sessions, it is using about half a gigabyte of memory.
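To put numbers on the growth rather than watching a process monitor, here is a minimal sketch using the standard-library tracemalloc module. The names `measure_growth` and `factory` are hypothetical and not part of botocore. tracemalloc counts only memory allocated through Python's allocators, so the figures will likely differ from the resident memory reported by the OS.

```python
import tracemalloc
from typing import Callable, List


def measure_growth(factory: Callable[[], object], count: int) -> List[int]:
    """Return the traced bytes held after each of `count` objects is created.

    Every object returned by `factory` is kept alive, as the reproduction
    script does by appending each session to a list.
    """
    tracemalloc.start()
    try:
        baseline, _ = tracemalloc.get_traced_memory()
        kept = []     # holds references so nothing is garbage collected
        samples = []  # bytes above the baseline after each creation
        for _ in range(count):
            kept.append(factory())
            current, _ = tracemalloc.get_traced_memory()
            samples.append(current - baseline)
        return samples
    finally:
        tracemalloc.stop()


if __name__ == "__main__":
    # Stand-in factory: each call allocates roughly 1 MB.
    for i, size in enumerate(measure_growth(lambda: bytearray(1_000_000), 5)):
        print(i, size)
```

For this issue, `factory` would be a function that builds a Session, assigns the credentials, and calls create_client("ec2"), mirroring the loop body in the reproduction script.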

Demo (with sound 😄):

botocore-mem-leak-demo.mp4

Possible Solution

No response

Additional Information/Context

I'm using botocore to write a script that lists all accounts in an organization and then traverses all accounts and their regions to discover all resources present in them. For this, the script uses DeferredRefreshableCredentials to configure AssumeRole credentials dynamically for every discovered account. I haven't found any documentation on how DeferredRefreshableCredentials must be used. There is no official way to configure the credentials provider other than by setting it directly in the Session's _credentials field. It looks like the official way of doing that via Session.set_credentials requires static credentials, which sucks, so I have to resort to the method described in this issue.

Maybe there is a better way to configure an AssumeRole credential provider dynamically in-memory? I'm quite inexperienced with Python and Botocore, but doing such a thing in the Rust AWS SDK is embarrassingly easy, and I'm surprised it's such a problem in botocore.
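For reference, the per-account setup described above can be wrapped in a small helper. This is only a sketch that repeats the approach from the reproduction script, not an officially supported pattern: `role_arn` and `make_account_session` are hypothetical names, the `RoleSessionName` value is a placeholder added on the assumption that STS AssumeRole requires one, and the helper still relies on the private `_credentials` attribute.

```python
def role_arn(account_id: str, role_name: str) -> str:
    """Build a role ARN in the same shape as the reproduction script."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def make_account_session(sts_client, account_id: str, role_name: str):
    """Return a botocore Session whose credentials assume a role lazily."""
    # Imported here so role_arn() stays usable without botocore installed.
    import botocore.credentials
    from botocore.session import Session

    params = {
        "RoleArn": role_arn(account_id, role_name),
        # Assumption: placeholder session name, not taken from the issue.
        "RoleSessionName": f"discovery-{account_id}",
    }
    refresher = botocore.credentials.create_assume_role_refresher(
        sts_client, params
    )
    creds = botocore.credentials.DeferredRefreshableCredentials(
        method="assume-role",
        refresh_using=refresher,
    )

    session = Session()
    # Private attribute, as in the reproduction script; not a public API.
    session._credentials = creds
    return session
```

A caller would create one STS client up front and then call `make_account_session(sts, account_id, role_name)` once per discovered account.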

SDK version used

1.36.3

Environment details (OS name and version, etc.)

22.04.5 LTS (Jammy Jellyfish)

@Veetaha Veetaha added bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Feb 2, 2025
@khushail khushail added investigating This issue is being investigated and/or work is in progress to resolve the issue. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Feb 25, 2025
@khushail khushail self-assigned this Feb 25, 2025

khushail commented Feb 25, 2025

Hi @Veetaha, thanks for reaching out. I might not be an expert in DeferredRefreshableCredentials, but IMO, I see that you are not releasing the session objects which are piling up and consuming the resources.

sessions.append(sess)

However, I am working on reproducing this in my account and will share updates soon.


khushail commented Feb 25, 2025

@Veetaha, I am also not able to find any documentation on the usage of DeferredRefreshableCredentials, but is there any particular reason why you are using it? I noticed RefreshableCredentials in the botocore code, which does the same thing of refreshing credentials dynamically, as I understand it:

class RefreshableCredentials(Credentials):

I also found some useful online articles on using RefreshableCredentials:

  1. https://dev.to/li_chastina/auto-refresh-aws-tokens-using-iam-role-and-boto3-2cjf
  2. https://pritul95.github.io/blogs/boto3/2020/08/01/refreshable-boto3-session/
  3. https://stackoverflow.com/questions/63724485/how-to-refresh-the-boto3-credentials-when-python-script-is-running-indefinitely
  4. https://repost.aws/questions/QU-tAtxo2uQp-10bHXHccyTg/python-boto3-auto-refresh-credentials-when-assuming-role

Let me know if this helps. I will also reach out to the team for insights on the workability of DeferredRefreshableCredentials.

@khushail khushail added the response-requested Waiting on additional info and feedback. label Feb 25, 2025

Veetaha commented Feb 25, 2025

Hi, the DeferredRefreshableCredentials class is actually a subclass of RefreshableCredentials, so I'm using it since it provides shorter syntax to create a refreshable assume-role session that is lazily initialized.

I see that you are not releasing the session objects which are piling up and consuming the resources.

Yes, this is the intention. I'm using it in a script that does a lot of parallel requests to different accounts. Namely, here is my script. So I create a separate session object for each AWS account my script needs to traverse.

My main complaint here is the amount of memory every such session with DeferredRefreshableCredentials allocates. If you hold 30 sessions at once, it already eats up 0.5 GB of memory. I don't understand how in the world 30 Python objects that are supposed to hold merely an aws_access_key_id, aws_secret_access_key, aws_session_token, and an expiry timestamp can take up half a gig!
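One way to check where the memory actually goes (the credentials object, the session, or the client created from it) is to group allocations by source file. The sketch below uses tracemalloc snapshots for that. `top_allocators` and `build` are hypothetical names, and the approach only sees allocations made through Python's allocators.

```python
import tracemalloc
from typing import Callable, List, Tuple


def top_allocators(
    build: Callable[[], object], limit: int = 5
) -> List[Tuple[str, int]]:
    """Return (filename, bytes_added) pairs for the largest changes
    in traced memory caused by calling `build`."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        held = build()  # kept alive until the second snapshot is taken
        after = tracemalloc.take_snapshot()
        # compare_to() sorts by the absolute size difference, largest first.
        diff = after.compare_to(before, "filename")
        return [(stat.traceback[0].filename, stat.size_diff) for stat in diff[:limit]]
    finally:
        tracemalloc.stop()


if __name__ == "__main__":
    # Stand-in workload: ten buffers of about 100 kB each.
    for filename, size in top_allocators(
        lambda: [bytearray(100_000) for _ in range(10)]
    ):
        print(f"{size:>12} {filename}")
```

Passing a `build` function that creates one session and its ec2 client, as in the reproduction script, would show which modules account for the per-session cost.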

@Veetaha Veetaha changed the title Memory leak related to DeferredRefreshableCredentials Unreasonably big memory usage with DeferredRefreshableCredentials Feb 25, 2025
@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. label Feb 26, 2025
@khushail khushail added needs-review This issue or pull request needs review from a core team member. and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. labels Feb 26, 2025
@RyanFitzSimmonsAK RyanFitzSimmonsAK added investigating This issue is being investigated and/or work is in progress to resolve the issue. and removed needs-review This issue or pull request needs review from a core team member. labels Mar 11, 2025
3 participants