Leaking threads (and low performance) in docker image #91
Hello @karatekaneen! This is a problem with the Docker image: it requires an init to reap zombies (e.g. adding …). The project has a Dockerfile, but it is not updated or supported, nor is it documented (I never actually tested it!). I really recommend not using it for the moment and using a normal build instead, to avoid these issues and get good performance. I will try to set up a working Docker build in the next version.
@kermitt2 Thank you for your response. What changes are needed to the Dockerfile to get it up to date again? I'm happy to contribute, but unfortunately I'm not a (good enough) Java dev to figure out what's out of sync. With some guidance I can probably figure it out.
Well, I have not worked on this Docker image, but apparently no init is included (like tini), which is necessary to close the process properly. So either you can start the container by passing … The good news: it's only about Docker settings, no Java dev needed.
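To illustrate what "reaping zombies" means in the comment above, here is a minimal Python sketch, assuming a Linux host with /proc. It forks a child that exits at once, shows that the child lingers in state Z until its parent calls waitpid, and then confirms it is gone. The function names process_state and demo_reap are made up for this illustration; the sketch shows only the general mechanism an init such as tini takes care of, and it does not reproduce biblio-glutton's behavior.

```python
import os
import signal
import time
from typing import Tuple


def process_state(pid: int) -> str:
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state is the first field after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def demo_reap() -> Tuple[str, bool]:
    """Fork a child, observe it as a zombie, then reap it.

    Returns (state_before_reaping, still_listed_after_reaping).
    """
    # Make sure exited children are not auto-reaped by an inherited SIG_IGN.
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)

    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running cleanup handlers

    # Parent: wait (up to 5 s) until the exited child shows up as a zombie.
    deadline = time.monotonic() + 5
    state = process_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = process_state(pid)

    os.waitpid(pid, 0)  # reap: this removes the zombie entry
    return state, os.path.exists(f"/proc/{pid}")


if __name__ == "__main__":
    before, still_there = demo_reap()
    print("state before waitpid:", before)
    print("still listed after waitpid:", still_there)
```

In a container without an init, PID 1 is the application itself, and exited processes that end up re-parented to it stay in state Z unless it happens to wait on them.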
@kermitt2 Added tini in #90, which is the version we are running in production at the moment. It still has 144 threads running, so I'm not sure I did everything right, but it's way better than the >2500 that the first version had running after a month of uptime. I'm going to check in on it in a couple of days to see how it looks.
Unfortunately, the fix did not help. I just had a look at the service and it had 2000+ threads running, with the maximum set to 128 in the config.
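For anyone who wants to track numbers like the ones above over time, here is a small Python sketch, assuming Linux and /proc. It reads the Threads: field from /proc/<pid>/status. The helper name thread_count is hypothetical, and the PID passed in has to be the server's PID as seen from where the script runs (inside the container or on the host).

```python
import os
import threading


def thread_count(pid: int) -> int:
    """Return the number of threads of a process, read from /proc/<pid>/status."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("Threads:"):
                return int(line.split()[1])
    raise RuntimeError(f"no Threads field found for pid {pid}")


if __name__ == "__main__":
    # Self-check on the current process: start three idle threads and compare.
    me = os.getpid()
    base = thread_count(me)

    stop = threading.Event()
    workers = [threading.Thread(target=stop.wait) for _ in range(3)]
    for w in workers:
        w.start()

    print("threads before:", base)
    print("threads with 3 extra workers:", thread_count(me))

    stop.set()
    for w in workers:
        w.join()
```

Logging this value every few minutes while a batch runs would show whether the count really grows with each processed citation or levels off.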
The thread setting in the config manages the server's parallel requests, so it's likely that the remaining threads are zombies. I would suggest again not using the Docker image at this point for biblio-glutton :) If it helps, this is how we used tini in Grobid (before using the --init parameter): https://github.com/kermitt2/grobid/blob/0.7.1/Dockerfile.crf#L72
Hi,
I'm running the latest (master) version in Docker. We've allocated a VM with 4 cores and 32 GB of RAM for Glutton.
When running a batch of lookups by their bibliographic strings (started with about 200k), we can only process about 70-80 strings per minute.
The Elasticsearch cluster is running almost idle, so when I looked closer at the container I saw that the thread count kept increasing. It looks like one thread is added for every citation processed and none is ever removed.
Edit:
Forgot to mention that this does not respect maxThreads in the config.yml. I've tried running it with the default value of 2048, as well as with higher and lower values. The performance is still low and the thread count keeps ticking up.