Server does not shutdown fully if net connection is lost. #52

Open
GunnyWaffle opened this issue Oct 27, 2024 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@GunnyWaffle

Our server had a power outage and lost its internet connection.
While it was cleanly shutting down on UPS power, it hung indefinitely waiting for the discarpet bot to terminate (or whatever term is apt here).

I suspect that this is a narrow edge case in the shutdown code of the mod, where it doesn't account for not having API access.

Ultimately, no harm was done. The world was saved and database queues emptied securely. I was able to send a SIGINT to kill the server without consequence. Addressing this issue would just help to simplify a secure shutdown when internet is lost.

@replaceitem replaceitem added the bug Something isn't working label Oct 27, 2024
@replaceitem replaceitem self-assigned this Oct 27, 2024
@replaceitem
Owner

Will look into it

@replaceitem
Owner

So the problem is that Javacord's reconnect mechanism repeatedly schedules reconnect attempts, which keep the JVM running. The thread pool those attempts run in gets shut down when the websocket disconnects. But in this case the websocket had already disconnected, because the network went down before discarpet could disconnect it.

Javacord does account for this by also scheduling a forced shutdown of the thread pool after 1 minute, so in theory it should close normally one minute after server shutdown. But since you mentioned having to kill it manually, it sounds like it hung for longer than a minute?
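
To illustrate the mechanism described above, here is a minimal sketch in plain `java.util.concurrent`. It is not Javacord's actual code: the class `ReconnectLoopDemo`, the method name, and the timings are all hypothetical stand-ins. It shows a repeating task on non-daemon pool threads, plus a delayed forced shutdown acting as a safety net.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reconnect loop; not Javacord's actual code.
class ReconnectLoopDemo {

    /**
     * Starts a repeating "reconnect attempt", schedules a forced shutdown of
     * the pool after forceAfterMillis, and reports whether the pool
     * terminated within waitMillis.
     */
    static boolean runAndAwaitTermination(long forceAfterMillis, long waitMillis) {
        // The default thread factory creates non-daemon threads, so a live
        // pool keeps the JVM from exiting.
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        AtomicInteger attempts = new AtomicInteger();

        // Simulated reconnect attempt that never succeeds and keeps repeating.
        pool.scheduleWithFixedDelay(attempts::incrementAndGet, 0, 50, TimeUnit.MILLISECONDS);

        // Safety net: force the pool down after a fixed delay.
        pool.schedule(() -> { pool.shutdownNow(); }, forceAfterMillis, TimeUnit.MILLISECONDS);

        try {
            return pool.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("terminated: " + runAndAwaitTermination(200, 2000));
    }
}
```

If the forced shutdown never runs, or a different pool is still alive, the JVM will not exit on its own, which would match a hang lasting well past the one-minute mark.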

I could add some code that shuts down the thread pool directly if the websocket is already disconnected, but I'm not certain this wouldn't create issues.
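
The idea above could look roughly like the following sketch. Everything here is an assumption for illustration: `DirectShutdownSketch`, `shutdownIfDisconnected`, and the `websocketConnected` flag are hypothetical names, and how discarpet or Javacord would actually expose the connection state or the pool is not covered in this thread.

```java
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;

// Hypothetical sketch; class, method, and parameter names are illustrative only.
class DirectShutdownSketch {

    /**
     * If the connection is already gone, no disconnect event will arrive to
     * trigger the usual cleanup, so cancel the pending work directly.
     * Returns the number of queued tasks that were discarded.
     */
    static int shutdownIfDisconnected(boolean websocketConnected,
                                      ScheduledExecutorService pool) {
        if (websocketConnected) {
            // Normal path: a regular disconnect is expected to clean up.
            return 0;
        }
        // shutdownNow() interrupts running tasks and returns those that never started.
        List<Runnable> dropped = pool.shutdownNow();
        return dropped.size();
    }
}
```

Returning the number of dropped tasks makes it possible to log what was discarded, which helps judge whether skipping that work was actually safe.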

@GunnyWaffle
Author

I waited 5 minutes, yeah.

This does sound like a risky situation to address. Shutting down "active" thread pools is always a scare. I suppose if the thread pool can be asserted to only contain reconnect attempts which have no side effect otherwise, it should be safe in theory to terminate the pool and leave its work forfeit.

Though I'd sooner want to figure out why the 1 minute back-off timer didn't work. That smells of an issue elsewhere.
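
One way to narrow this down is to list which non-daemon threads are still alive while the server hangs, since those are what keep the JVM running. The helper below is a generic sketch using only the standard `Thread` API; the class and method names are hypothetical, and it would need to be called from inside the hung process (running `jstack` against the process gives similar information from outside).

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical diagnostic helper; not part of discarpet or Javacord.
class NonDaemonThreads {

    /** Returns the sorted names of all live threads that are not daemons. */
    static List<String> liveNonDaemonThreadNames() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && !t.isDaemon())
                .map(Thread::getName)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        liveNonDaemonThreadNames().forEach(System.out::println);
    }
}
```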

In the end do as you see wisest. I'm just musing aloud haha.
Thank you for investigating this trivial problem!

2 participants