Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve broker initiated client disconnects #816

Merged

Conversation

spuun
Copy link
Member

@spuun spuun commented Oct 25, 2024

WHAT is this pull request doing?

Writing to client sockets may block a fiber because the buffers are full. When this happens this fiber will hold the write lock in Client until it manage to write the data. Because if this it's not possible to force close a connection from the management UI, and the http request will timeout. It will also affect lavinmq shutdown because VHost#close will be blocked from closing all connections effectively.

This PR will:

  • add a timeout argument to #close that will set a write_timeout on the socket.
  • spawn fibers to perform "async" disconnects in parallel on shutdown

HOW can this pull request be tested?

Manually with a "bad" client, e.g. unlimited prefetch and no acking. Let it consume until LavinMQ stops because of tcp back pressue. Try kill the connection.

@spuun spuun force-pushed the bugfix-force-close-socket-on-broker-initiated-connection-close branch from cffea6d to f8a0fca Compare October 25, 2024 13:33
src/lavinmq/amqp/client.cr Outdated Show resolved Hide resolved
src/lavinmq/amqp/client.cr Outdated Show resolved Hide resolved
@spuun spuun marked this pull request as ready for review October 28, 2024 08:56
@spuun spuun requested a review from a team as a code owner October 28, 2024 08:56
src/lavinmq/amqp/client.cr Outdated Show resolved Hide resolved
src/lavinmq/amqp/client.cr Outdated Show resolved Hide resolved
src/lavinmq/amqp/client.cr Outdated Show resolved Hide resolved
@spuun
Copy link
Member Author

spuun commented Oct 29, 2024

One thing we don't get now is the "async" close of ALL connections on shutdown. If there are multiple "zombies" on shutdown we'll have to wait n * timeout (n number of zombies).

@carlhoerberg
Copy link
Member

Yes, but im thinking we can have that logic elsewhere, like push all connections to a Channel and then let a (limited) amount of fibers to work it off until it's empty.

@spuun spuun force-pushed the bugfix-force-close-socket-on-broker-initiated-connection-close branch from 57efbec to 7961ff6 Compare November 6, 2024 19:40
@spuun spuun changed the title Add timeout to Client#close to force close socket Improve broker initiated client disconnects Nov 7, 2024
@spuun
Copy link
Member Author

spuun commented Nov 7, 2024

Updated the title and description. I've added the solution with fibers and a channel to disconnects clients "async" on shutdown.

src/lavinmq/vhost.cr Outdated Show resolved Hide resolved
src/lavinmq/vhost.cr Outdated Show resolved Hide resolved
src/lavinmq/vhost.cr Outdated Show resolved Hide resolved
src/lavinmq/vhost.cr Outdated Show resolved Hide resolved
src/lavinmq/http/controller/connections.cr Outdated Show resolved Hide resolved
src/lavinmq/vhost.cr Outdated Show resolved Hide resolved
src/lavinmq/vhost.cr Outdated Show resolved Hide resolved
src/lavinmq/vhost.cr Outdated Show resolved Hide resolved
src/lavinmq/vhost.cr Outdated Show resolved Hide resolved
src/lavinmq/vhost.cr Outdated Show resolved Hide resolved
else
fiber_id = fiber_count &+= 1
@log.trace { "spawning close conn fiber #{fiber_id} " }
wg.spawn do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make a PR to the stdlib so that we can name fibers spawned by WaitGroup#spawn.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR sent.

end
@log.trace { "exiting close conn fiber #{fiber_id} " }
end
Fiber.yield
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Required? Doesn't Channel#send on a unbuffered channel automatically yield the current fiber?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why, but it did spawn one extra fiber. Not a big deal, but...

src/lavinmq/vhost.cr Outdated Show resolved Hide resolved
@spuun spuun merged commit 719c3e6 into main Nov 13, 2024
21 of 25 checks passed
@spuun spuun deleted the bugfix-force-close-socket-on-broker-initiated-connection-close branch November 13, 2024 08:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants