Sentry Integration causes OOM kill #540
We decided to focus on the Sentry part first (postponing InfluxDB). Maybe the OOM kill is related to the buffer of Sentry events awaiting transmission to the relay. Thus, it could well be connected to the high memory usage experienced during network outages.
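A minimal sketch of what bounding that buffer could look like, assuming Poseidon uses the sentry-go SDK's default `HTTPTransport` (the buffer size, timeout, and DSN below are placeholder values, not our actual configuration):

```go
package main

import (
	"log"
	"time"

	"github.com/getsentry/sentry-go"
)

func main() {
	// Sketch: cap the in-memory event buffer so that an outage towards the
	// relay cannot grow the process memory unboundedly while events queue up.
	transport := sentry.NewHTTPTransport()
	transport.BufferSize = 64           // max events held in memory (assumed value, default is 30)
	transport.Timeout = 5 * time.Second // give up on slow deliveries quickly (assumed value)

	if err := sentry.Init(sentry.ClientOptions{
		Dsn:       "https://examplePublicKey@o0.ingest.sentry.io/0", // placeholder DSN
		Transport: transport,
	}); err != nil {
		log.Fatalf("sentry.Init: %v", err)
	}
	// Try to deliver remaining buffered events on shutdown, but not forever.
	defer sentry.Flush(2 * time.Second)
}
```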
WIP Questions to investigate:
We agreed to postpone this issue until the next occurrence.
According to POSEIDON-7A, the issue occurred again.
You're right, the issue occurred once again. I checked the Sentry Relay and noticed that it was complaining about too little RAM (
During the remainder of the recent Python course, we only noticed one other occurrence of the issue. For this second occurrence on October 16th, I was able to gather additional insights: Sentry had a (major) outage, causing issues with the delivery of events upstream. Hence, our Sentry Relay collected all events that couldn't be sent. After some time, when the Sentry Relay was hitting its memory limit once again, Poseidon also reported too high memory usage (but no OOM kill). My suspicion: we only see high memory usage of the Sentry integration when events aren't forwarded correctly, causing a backlog on either the Poseidon host or the Sentry Relay host.
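A minimal sketch of how such a backlog could be made visible before it turns into an OOM kill, again assuming the sentry-go SDK; the helper name, interval, and timeout are hypothetical, not part of Poseidon:

```go
package main

import (
	"log"
	"time"

	"github.com/getsentry/sentry-go"
)

// startSentryWatchdog is a hypothetical helper that periodically tries to
// flush the SDK's event buffer. sentry.Flush returns false when not all
// queued events could be delivered within the timeout, which is an early
// indicator that events are backing up towards the relay.
func startSentryWatchdog(interval, flushTimeout time.Duration) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for range ticker.C {
			if ok := sentry.Flush(flushTimeout); !ok {
				log.Println("sentry: flush timed out, events are backing up in memory")
			}
		}
	}()
}

func main() {
	// Usage sketch: check once a minute, allowing ten seconds per flush.
	startSentryWatchdog(time.Minute, 10*time.Second)
	select {} // keep the example process alive
}
```

Logging (or exporting a metric) on failed flushes would let us correlate memory growth with upstream delivery problems instead of only seeing the OOM kill after the fact.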
Today, we had another occurrence of Poseidon being OOM killed.
Stack Trace
At the same time, Influx also showed erroneous behaviour:
Stack Trace