Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix WS disconnection handling #30

Merged
merged 7 commits into from
May 29, 2024
Merged

Fix WS disconnection handling #30

merged 7 commits into from
May 29, 2024

Conversation

ziopio
Copy link
Member

@ziopio ziopio commented May 22, 2024

The following reconnections scenarios are handled in this PR

  • The server closes the connection, gun_down is handled correctly
  • The gun process crashes, the monitor message is used to react to it
  • Ping timeout, if the server is not pinging, the client throws away the connections and re-attempts it.

This PR requires stritzinger/kraft#3 to go in production

- Demonitor the gun process before shutting it down through gun.
- Handle disconnections detected by monitor ('DOWN' monitor message)
@ziopio ziopio requested a review from maehjam May 22, 2024 15:50
@ziopio ziopio marked this pull request as ready for review May 22, 2024 15:50
@ziopio ziopio requested a review from GwendalLaurent May 24, 2024 07:52
ziopio added 3 commits May 24, 2024 18:07
A new option ws_ping_timeout enables to set the time to wait for the ws pings coming from the server.
jsx:is_term/1 appears to crash when it tries to check upon tuples
@ziopio ziopio marked this pull request as draft May 24, 2024 16:15
@ziopio ziopio force-pushed the SEA-189-fix-disconnections branch from 31ae2ec to 340acc0 Compare May 27, 2024 08:08
@ziopio ziopio marked this pull request as ready for review May 27, 2024 11:03
Copy link
Member

@maehjam maehjam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. What would also be important to test, but maybe only manually is how grisp_connect handles a longer disconnect or a random sequence of different kinds of disconnects. Do we get some queue up of gun instances for example? I can't think of a test case, but manually we should just unplug and replug the ethernet cable a lot to test the handling of this kind of disconnect as well. (Is it the same as crashing gun? I don't think so.)

@maehjam
Copy link
Member

maehjam commented May 28, 2024

Oh the previous comment only refers to the tests. I missed the commits before. Wait for another approval!

Copy link
Member

@maehjam maehjam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments, rest looks good.

src/grisp_connect_log_server.erl Outdated Show resolved Hide resolved
src/grisp_connect_ws.erl Outdated Show resolved Hide resolved
@ziopio
Copy link
Member Author

ziopio commented May 28, 2024

Looks good. What would also be important to test, but maybe only manually is how grisp_connect handles a longer disconnect or a random sequence of different kinds of disconnects. Do we get some queue up of gun instances for example? I can't think of a test case, but manually we should just unplug and replug the ethernet cable a lot to test the handling of this kind of disconnect as well. (Is it the same as crashing gun? I don't think so.)

Quick or Long network interrruptions are unhandled (undetected) by gun. So an API call will just timeout but gun will not generate errors. That is why the ws_ping_timeout I am introducing in this PR is so important. It forces gun to reconnect.

@ziopio ziopio merged commit e97e4dd into main May 29, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants