-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue #800 - Invalid Command Response on Database Outage #1009
Issue #800 - Invalid Command Response on Database Outage #1009
Conversation
Awesome, thanks for the PR! Sorry if it takes me a little while to get to reviewing it properly, but at first glance it looks great! |
No worries at all! Just wanted to give you a heads up. I'll try to look at some of the other open issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, thanks for you patience. Sorry it took me a while to get to this. I finally replaced the battery on my laptop so now I can use it without having to be plugged in all the time which makes me significantly more likely actually open it up haha.
@@ -216,6 +216,15 @@ async def on_message_received(self, message): | |||
self.get_user_identifier(), | |||
cmd | |||
) | |||
except OperationalError: | |||
# When the database goes down, SqlAlchemy will throw an OperationalError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we still want to log the exception here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can go ahead and add logging here. I was avoiding it only because if the DB is truly down, it could cause a lot of logs to the logging service to be ingested due to failed commands. But if we feel like the logging service can handle that volume without issue, I have no qualms about adding it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good question. I'm not sure it's every happened before, but I think it's safer to have it log something even if it ends up spamming a lot of messages than to have it error silently and require users to start complaining before we can see that there's an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too easy. I've added that logging in the most recent version.
"style": "error", | ||
"text": "Unable to connect to database. Please try again later." | ||
}) | ||
# Make sure to abort here to avoid a thundering herd problem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to use the kick
style to prevent the client from trying to automatically reconnect. @Sheikah45?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that's what https://github.com/FAForever/server/pull/1009/files#diff-de94292bdcec0d8f98b8eb76040f97830866bed0c1fb5b55165a9e54f175ea0bR227 would do for the client, but if that is not the case, we for sure need to send the client a "Stop and don't retry" message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, there's currently no good system for telling the client that it should not attempt to reconnect. I believe the kick
style message is all we have. It might be cleaner to add a dedicated message for that but that would be for another PR and also require client changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think https://github.com/FAForever/downlords-faf-client/blob/ca8554d8767e748498dee37caa39cf3f657284a6/src/main/java/com/faforever/client/remote/FafServerAccessor.java#L115 could be an issue since the logic is if we were connected and suddenly were disconnected, retry the login and connect. Net new connections should be okay since it will be going from connecting to disconnected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main problem with the kick style is that it will also close the client. If this is desired behavior then sure but otherwise it shouldn't be kick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Sheikah45 @Askaholic I think this should work for what we want then. This follows the same pattern as if someone were banned where it sends a message and aborts the connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, we'll leave it like this then.
Closes #800.