🐛 Bot stops working for seemingly no reason (socket connection memory leaks?) #384
Comments
Would love some help in debugging this and figuring out how we can properly disconnect the old instances of needle when pm2 restarts. A first step of figuring out if this is happening could be to implement #299.
@nchristopher mentioned that we should listen to Could be because my development machine was Windows, which falls under other flags (https://pm2.keymetrics.io/docs/usage/signals-clean-restart/#windows-graceful-stop)
It may also be rate limit related? It seems to be happening more frequently now. It can also happen on specific shards even though other shards are fine, which seems to indicate that it's not related to rate limits.
Another idea I had was that it may be Discord just being fed up with the amount of API errors we are getting, probably around 1 every 5 seconds - most are related to #308
It should definitely not be rate limit, because I seem to be creating less than 1 thread every 5 seconds (seems very low for the server count!). I just removed some of the Discord errors, so we'll see if that improves uptime. I also have more logs now that hopefully reveal more information when Needle crashes.
This has not happened again after https://github.com/MarcusOtter/discord-needle/releases/tag/v3.3.0 which is 2 months of "uptime". The bot still crashes every week or so, but fully (which means pm2 restarts it automatically and all is well). I don't know exactly what happened but I think Discord.js changed some implementation in their sockets so maybe that could be it. Either way, I will close this issue and re-open if it happens again. (Can't reproduce anymore)
Describe the bug
Every so often (anywhere from 1 day to 1 month) the bot will randomly stop working. It is running in https://github.com/Unitech/pm2, so it's not the entire process that crashes (that would automatically restart the bot); the bot just isn't connected to Discord anymore and goes offline. The process is still running and there is nothing in the logs, which could be because we're not listening to the right events (see #299).
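To make the "nothing in the logs" part concrete, here is a rough sketch of what listening to more connection events could look like. It is not Needle's actual code: `ListenerTarget`, `CONNECTION_EVENTS` and `logConnectionEvents` are hypothetical names, and the event names are what I believe discord.js emits, so they should be checked against the installed version.

```typescript
// Minimal stand-in for the part of an event emitter (such as a discord.js Client) used here.
interface ListenerTarget {
  on(event: string, listener: (...args: unknown[]) => void): unknown;
}

// Assumption: these are the connection-related event names discord.js emits.
// Verify them against the version the bot actually runs.
const CONNECTION_EVENTS = [
  "shardDisconnect",
  "shardError",
  "shardReconnecting",
  "shardResume",
  "shardReady",
  "invalidated",
  "error",
  "warn",
] as const;

// Hypothetical helper: logs every connection-related event with a timestamp,
// so a silent disconnect leaves at least one line behind.
function logConnectionEvents(
  source: ListenerTarget,
  log: (line: string) => void = console.log,
): void {
  for (const event of CONNECTION_EVENTS) {
    source.on(event, (...args: unknown[]) => {
      const details = args
        .map((arg) => (arg instanceof Error ? arg.message : String(arg)))
        .join(" ");
      log(`${new Date().toISOString()} [${event}] ${details}`.trimEnd());
    });
  }
}
```

In the bot this would presumably be called once right after the client is created, e.g. `logConnectionEvents(client)`. If a shard goes quiet without any of these events firing, that would at least rule out the "we just aren't logging it" explanation.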
My theory is that this has to do with pm2 not sending the proper termination signals when restarting the bot, which I guess might leave some socket connections or something open and eventually exhaust all the available connections, so that eventually we cannot connect. When I was setting up pm2 I had issues with trying to see the "Destroyed client" message that we log when the bot shuts down; I just couldn't see it happen ever. I tried to fix it with 4c28360 but I don't think it worked. At least in the logs, I don't see the "Destroyed client" message that I see when running it normally in node, for example.
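For reference, this is roughly the shutdown behaviour I would expect to see, written as a standalone sketch rather than what 4c28360 actually does. As far as I understand the pm2 docs, a stop or restart sends SIGINT first and follows up with SIGKILL if the process has not exited within the kill timeout, so the handler has to finish quickly. `Destroyable`, `SignalTarget` and `registerGracefulShutdown` are hypothetical names, and the Windows case linked in the comments is not covered here.

```typescript
// Minimal stand-ins so the sketch does not depend on discord.js or Node typings.
interface Destroyable {
  destroy(): void | Promise<void>;
}
interface SignalTarget {
  once(signal: string, handler: () => void): unknown;
}

// Hypothetical helper: destroy the client exactly once when a stop signal
// arrives, log the result, then exit so the process manager can start a fresh one.
function registerGracefulShutdown(
  client: Destroyable,
  signals: SignalTarget,
  exit: (code: number) => void,
  log: (line: string) => void = console.log,
): void {
  let shuttingDown = false;

  const shutdown = async (signal: string): Promise<void> => {
    if (shuttingDown) return; // ignore a second signal while already closing
    shuttingDown = true;
    let code = 0;
    try {
      await client.destroy(); // should close the open gateway connection(s)
      log(`Destroyed client (${signal})`);
    } catch (error) {
      log(`Failed to destroy client (${signal}): ${String(error)}`);
      code = 1;
    }
    exit(code);
  };

  // Assumption: SIGINT is what pm2 sends on stop/restart; SIGTERM is included
  // for other process managers.
  for (const signal of ["SIGINT", "SIGTERM"]) {
    signals.once(signal, () => void shutdown(signal));
  }
}
```

In a Node process this would be wired up as something like `registerGracefulShutdown(client, process, (code) => process.exit(code))`. If the "Destroyed client" line still never shows up under pm2 with a handler like this, the signal is probably not reaching the process at all.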
This theory is further supported by the fact that it takes an unpredictable amount of time to happen. But last time it happened, I just ran `pm2 restart needle`, and then it happened again almost immediately, after 24h (i.e., perhaps restarting with pm2 does not clear socket connections like it should?). When I manually ran `pm2 stop needle` and `pm2 start needle` instead of a `restart`, it worked fine again for 14-31 days.
Steps to reproduce the bug
pm2 restart needle
pm2 stop needle
pm2 start needle
Expected behavior
No downtime - we shouldn't have to manually stop and start the bot every so often.