don't pong when receiving ping #73

Closed · ro0NL opened this issue Mar 20, 2024 · 7 comments
Labels: enhancement (New feature or request)

Comments

ro0NL commented Mar 20, 2024

After #68, the client logic still seems off:

14:13:40 DEBUG     [messenger] send HPUB test _INBOX.7748dd3c7af163d014b9648cd8389412 91 351
14:13:42 DEBUG     [messenger] receive PING
14:13:42 DEBUG     [messenger] send PONG
14:13:42 DEBUG     [messenger] sleep
14:13:45 DEBUG     [messenger] send PING
14:13:45 ERROR     [messenger] Socket read timeout

I'm not able to reproduce it locally,

but generally I wonder why there's a PONG side effect.

This was referenced Mar 20, 2024
@paolobarbolini

I'm not familiar with this codebase, but here's how it works in the official Rust client nats.rs/async-nats, in case it helps (a rough sketch follows the list):

  1. If a PING is received, reply immediately with PONG
  2. Every 10 seconds, send a PING
  3. Every time we send or receive a command, reset the ping interval from step 2
  4. When trying to send a new PING, if more than 2 PINGs have already been sent without first receiving a PONG reply, consider the connection dead
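
In PHP terms, steps 1-3 could look roughly like this (a sketch with a hypothetical $sendLine helper; not this library's actual API):

<?php

// A sketch of steps 1-3 above; $sendLine is a hypothetical stand-in for
// whatever writes a protocol line to the socket.

const PING_INTERVAL = 10; // step 2: ping after 10 seconds of inactivity

$sendLine = function (string $cmd): void {
    echo "send {$cmd}\n"; // stub; a real client would write "{$cmd}\r\n" to the socket
};

$lastActivity = microtime(true);

// steps 1 and 3: call this for every command read from the socket
$onLine = function (string $line) use (&$lastActivity, $sendLine): void {
    $lastActivity = microtime(true);   // step 3: any traffic resets the ping timer

    if (str_starts_with($line, 'PING')) {
        $sendLine('PONG');             // step 1: reply immediately with PONG
    }
};

// step 2: call this periodically from the read loop while idle
$maybePing = function () use (&$lastActivity, $sendLine): void {
    if (microtime(true) - $lastActivity >= PING_INTERVAL) {
        $sendLine('PING');
        $lastActivity = microtime(true); // sending a command also resets the timer
    }
};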

@ro0NL
Copy link
Contributor Author

ro0NL commented Mar 21, 2024

We have the same issue on a staging server where all queues are empty and there is zero traffic.

Given a 2m timeout and pingInterval=PHP_INT_MAX:

11:36:55 DEBUG     [messenger] send HPUB test _INBOX.49fa8b6b4f94fb86bb8eee365382611b 91 351
11:36:57 DEBUG     [messenger] receive PING
11:36:57 DEBUG     [messenger] send PONG
11:36:57 DEBUG     [messenger] sleep
[
  "max" => 1711017535.2638,
  "now" => 1711017417.6272
]
11:38:57 DEBUG     [messenger] receive PING
11:38:57 DEBUG     [messenger] send PONG
11:38:57 ERROR     [messenger] Processing timeout

Every 2nd or 3rd message or so produces a processing timeout.

At this point I'm not even sure whether it's a server or a client issue.

ro0NL commented Mar 21, 2024

I tried enabling nats-server debug mode on staging, but after doing so I couldn't reproduce it anymore.

I then explicitly disabled debug mode in prod, which also solved the matter there.

[image omitted] (for Processing timeout)

So it seems a fresh nats-server rollout did the trick 🙏

Sorry for the noise :')

@ro0NL ro0NL closed this as completed Mar 21, 2024
nekufa commented Mar 25, 2024

  1. When trying to send a new PING, if more than 2 PINGs have already been sent without first receiving a PONG reply, consider the connection dead

@ro0NL maybe we need to implement the same logic? Right now a single PING without a response means the connection is dead.
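
Roughly something like this, with a hypothetical $outstandingPings counter that a PONG handler would reset to zero (an illustration, not the current code):

<?php

$sendLine = fn (string $cmd) => print("send {$cmd}\n"); // hypothetical stub, as in the earlier sketch

// More than 2 unanswered PINGs -> treat the connection as dead,
// instead of failing on the very first PING without a PONG.
const MAX_OUTSTANDING_PINGS = 2;

$outstandingPings = 0; // reset to 0 whenever a PONG is received

$sendPing = function () use (&$outstandingPings, $sendLine): void {
    if ($outstandingPings > MAX_OUTSTANDING_PINGS) {
        throw new RuntimeException('No PONG received for the last PINGs; connection considered dead');
    }

    $sendLine('PING');
    $outstandingPings++;
};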

ro0NL commented Mar 25, 2024

@nekufa sounds reasonable, yes 👍 In our case we are generally safe now:

fetch happens in an external while-true loop, so it keeps retrying anyway;
for send/ack/nak I already added a single retry 😅 (roughly the sketch below)
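
A hypothetical version of that single-retry helper (withSingleRetry() and the $client->publish() call are placeholders, not this library's API):

<?php

// Run the operation once; on failure (e.g. the "Socket read timeout" above)
// retry exactly once and let a second failure propagate.
function withSingleRetry(callable $op)
{
    try {
        return $op();
    } catch (RuntimeException $e) {
        return $op(); // one retry only
    }
}

// usage, e.g. around publish/ack/nak:
// withSingleRetry(fn () => $client->publish('test', $payload));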

@nekufa nekufa added the enhancement New feature or request label Mar 25, 2024
@nekufa nekufa reopened this Mar 28, 2024
ro0NL commented Apr 10, 2024

Still seeing processing timeouts.

We run a replicated NATS server in k8s, and it's far from stable :')

I'm beginning to think this is more related to #22 in some way.

@ro0NL ro0NL closed this as completed Apr 10, 2024
ro0NL commented Apr 10, 2024

@nekufa the issue occurs when 1 of 3 replicas crashes. I'm not sure what state the server recovers to, but it then starts producing a significant number of Processing timeout exceptions, even though we can still connect to the other 2 replicas.

For now we've scaled down to 2 replicas and connect through the headless service, to see if that improves the matter.

I'm still not sure whether we should connect to nats:4222, nats-headless:4222, or a pool of replica servers.

Perhaps a host definition like replica-{0,1,2}.nats:4222 could be expanded into a server pool for the socket to connect to (rough sketch below).
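
Purely as an illustration of that idea (expandHosts() is hypothetical, not an existing configuration option):

<?php

// Expand a brace pattern such as "replica-{0,1,2}.nats:4222" into a list
// of servers the client could try in turn.
function expandHosts(string $pattern): array
{
    if (!preg_match('/^(.*)\{([^}]+)\}(.*)$/', $pattern, $m)) {
        return [$pattern]; // no braces: a single host
    }

    return array_map(
        fn (string $part) => $m[1] . trim($part) . $m[3],
        explode(',', $m[2])
    );
}

print_r(expandHosts('replica-{0,1,2}.nats:4222'));
// ["replica-0.nats:4222", "replica-1.nats:4222", "replica-2.nats:4222"]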

This issue was closed.