More thread to use? #52

Open
Badb0yBadb0y opened this issue Feb 10, 2025 · 2 comments

Comments

@Badb0yBadb0y

Hi,

Is there a way to run the scraping across multiple threads?

We have 80k buckets. The query timeout is currently set to 30 minutes, and the HAProxy timeout is also 30 minutes, but I now need to increase both to keep the scrape from timing out.

If multiple threads could be used, that would be good, since the scrape would finish faster.

Thank you

@blemmenes
Owner

Hi @Badb0yBadb0y,

Thanks for the request. I don't think this is possible, however.

While we could use multiple threads on the request side, the only way I can think of to batch that data would be if the Ceph Admin Ops API supported sending a request with an offset (e.g. a bucket offset), so that you could divvy up the GET operations for your 80k buckets.

Looking at the Ceph docs, there doesn't appear to be the needed capability to do that. I'll think about this more and do some additional digging, though.
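
To illustrate the batching idea only, here is a minimal Python sketch of how the GET operations could be divided across threads if an offset-style request existed. Everything in it is an assumption: `offset_ranges`, `scrape_in_batches`, and the `fetch_batch(offset, limit)` callable are hypothetical names, and `fetch_batch` stands in for a paginated bucket-stats request that the Admin Ops API does not appear to provide. The stand-in data source lets the sketch run without a Ceph cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def offset_ranges(total, batch_size):
    """Split `total` items into (offset, limit) pairs of at most `batch_size`."""
    return [(start, min(batch_size, total - start))
            for start in range(0, total, batch_size)]


def scrape_in_batches(fetch_batch, total, batch_size, workers=8):
    """Call fetch_batch(offset, limit) for every range on a thread pool.

    fetch_batch is hypothetical: it stands in for a paginated bucket-stats
    request, which the Admin Ops API does not appear to offer.
    """
    ranges = offset_ranges(total, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the input ranges.
        batches = pool.map(lambda r: fetch_batch(*r), ranges)
        return [item for batch in batches for item in batch]


if __name__ == "__main__":
    # Stand-in data source (made-up bucket names), not a real API call.
    fake_buckets = [f"bucket-{i}" for i in range(80_000)]

    def fake_fetch(offset, limit):
        return fake_buckets[offset:offset + limit]

    result = scrape_in_batches(fake_fetch, len(fake_buckets), 5_000)
    print(len(result))
```

Threads fit here because the work would be I/O-bound HTTP requests. Without server-side offset support, though, there is nothing real to plug in for `fetch_batch`, which is the limitation described above.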

Thanks,
Berant

@Badb0yBadb0y
Author

Thank you very much. I actually have one user with 64k buckets, and that is most likely what blocks and slows down the query.
That user's buckets all share the same prefix, like picture-00[1-64k], and some other users have a couple of thousand buckets each.
Or maybe the bucket check could be skipped somehow, so that only the users are checked? I'm not sure what the best solution would be.
