CPU Leak #247
Running into the same issue. It can be observed in any long-running service, such as a microservice, that uses oci-sdk. In my case CPU usage was increasing by about 0.1% every 10 minutes, as measured by "ps -p <pid> -o %cpu,%mem" on a 2-CPU host, seemingly indefinitely. The growth happens even though the service is not making any OCI calls; the SDK presumably starts some timer that keeps running in the background. Before this I was calling the OCI APIs directly, and the server's CPU usage stayed near 0%. After switching to oci-sdk, CPU always reaches 100% after two days or so. I have just removed oci-sdk from the service, and the CPU usage pattern returned to normal, staying at a fraction of a percent and not increasing. oci-sdk version used: 2.7.0.3
Thanks for reporting this, @aaronkvanmeerten. We are working on the fix internally and will update here once it's fixed.
Setting the environment variable OCI_SDK_DEFAULT_CIRCUITBREAKER_ENABLED=false is a workaround to avoid this issue for now.
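For a long-running service, the same workaround can also be applied from code. Below is a minimal sketch; it assumes the flag only needs to be visible before oci-sdk is first loaded (that load-order detail is an assumption, not something stated in this thread). Exporting OCI_SDK_DEFAULT_CIRCUITBREAKER_ENABLED=false in the service's environment before starting Node achieves the same thing.

```typescript
// Minimal workaround sketch: disable the SDK's default circuit breaker.
// Assumption: the variable has to be set before the oci-sdk module is
// loaded, which is why require() is used instead of a hoisted ES import.
process.env.OCI_SDK_DEFAULT_CIRCUITBREAKER_ENABLED = "false";

const oci = require("oci-sdk"); // loaded only after the flag is in place
// ... construct clients and make calls as usual ...
```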
Hi @vpeltola, this issue seems to be caused by circuit breakers not shutting down after they are no longer needed. The most recent release of the SDK includes a method on each client that the user can call to shut down these circuit breakers as needed. Please see this example. Let us know if this seems to fix your issue, thanks!
Hmm, the solution shouldn't be to shut down something (the circuit breakers) that I never started in the first place. If they are started automatically without the user's knowledge, they should also shut down automatically. And while they are running, they should not leak memory or use progressively more CPU. I think there is still a bug that needs fixing.
I agree completely with the sentiment. No other library I have ever used has required me to run extra code to shut down its internals in order not to leak CPU. Something is clearly wrong in this library. Especially because this did not happen before a certain version, I believe it must be some kind of bug that needs fixing.
Hi @vpeltola @aaronkvanmeerten, thank you for your feedback.
As part of the latest TypeScript release, .close() has been added to each client to further address this issue and to more closely resemble the behavior of the clients in the Java SDK. In addition, its use is now shown in each of the TypeScript examples that use clients.
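A rough sketch of that pattern is shown below, assuming the umbrella oci-sdk package; IdentityClient and the config-file authentication provider are stand-ins, and the point is only that close() is called once the client is no longer needed so its background resources are released.

```typescript
import * as oci from "oci-sdk";

async function main(): Promise<void> {
  // Stand-in client; per the comment above, every client now exposes close().
  const provider = new oci.common.ConfigFileAuthenticationDetailsProvider();
  const client = new oci.identity.IdentityClient({
    authenticationDetailsProvider: provider
  });
  try {
    // ... make service calls here as usual ...
  } finally {
    // Release the client's background resources (including circuit-breaker
    // timers) so a long-running process does not keep accumulating them.
    client.close();
  }
}

main().catch(console.error);
```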
When using oci-sdk versions greater than ^1.5.2, including the latest, we see a slow but steady increase in CPU utilization, which eventually grows out of bounds and uses all available CPU on the instance. Reverting to ^1.5.2 fixes the issue for us. This occurs across multiple projects that use oci-sdk and can be attributed directly to the oci-sdk version, as an otherwise identical build on the older ^1.5.2 does not exhibit the leak. A user of the jitsi-autoscaler (which uses oci-sdk) reported the leak to us and ran a profile; we have confirmed the behavior ourselves but have not done the profiling.
The user's profile shows that the system is overwhelmed by timers, in case that helps you debug:
Self time           | Total time          | Function (call tree)
46492.8 ms (43.13%) | 97726.0 ms (90.67%) | (anonymous) status.js:82
46492.8 ms (43.13%) | 97726.0 ms (90.67%) | .. listOnTimeout internal/timers.js:502
46492.8 ms (43.13%) | 97726.0 ms (90.67%) | .. .. processTimers internal/timers.js:482
43874.9 ms (40.71%) | 43874.9 ms (40.71%) | (anonymous) status.js:96
43874.9 ms (40.71%) | 43874.9 ms (40.71%) | .. (anonymous) status.js:94
43874.9 ms (40.71%) | 43874.9 ms (40.71%) | .. .. get stats status.js:93
43874.9 ms (40.71%) | 43874.9 ms (40.71%) | .. .. .. (anonymous) status.js:82
43874.9 ms (40.71%) | 43874.9 ms (40.71%) | .. .. .. .. listOnTimeout internal/timers.js:502
43874.9 ms (40.71%) | 43874.9 ms (40.71%) | .. .. .. .. .. processTimers internal/timers.js:482
4877.3 ms (4.53%)   | 5978.8 ms (5.55%)   | (anonymous) status.js:124
4877.3 ms (4.53%)   | 5978.8 ms (5.55%)   | .. get stats status.js:93
4877.3 ms (4.53%)   | 5978.8 ms (5.55%)   | .. .. (anonymous) status.js:82
4877.3 ms (4.53%)   | 5978.8 ms (5.55%)   | .. .. .. listOnTimeout internal/timers.js:502
4877.3 ms (4.53%)   | 5978.8 ms (5.55%)   | .. .. .. .. processTimers internal/timers.js:502