feat(NODE-6882): close outstanding connections #4499
I'm confused by the ordering of events here. Why are we explicitly closing in-use connections before we do other cleanup logic? I'd expect that this would just be handled automatically by `topology.close()`. Using `topology.close()` also means that we don't need to add new methods to the topology + server classes just for this use case.
Could this also lead to connection establishments during close? Hypothetically, if we kill all connections in a pool, `killCursors` or `endSessions` will require establishing new connections for these commands.
To avoid a deadlock: since we're about to run operations (`killCursors`, `abortTransaction`, `endSessions`), we can deadlock if the pool is at maximum use (e.g. 100 checked-out connections). Closing in-use connections first means the pool has free capacity, so we can proceed with the cleanup operations.

Maybe this was answered by the previous paragraph, but to be clear, we still need an open and operating topology/servers/pool/connections to be able to perform best-effort clean-up.

Yes, but this is not new: a MongoClient can already have empty pools when we call `MongoClient.close()`, and running clean-up operations will create new connections then. Even though we're closing in-use connections, that doesn't necessarily mean we will need to make new ones; the pool can still have idle connections that won't be closed by this.
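A minimal sketch of the circular wait being described, in illustrative TypeScript rather than the driver's actual pool code: once every connection up to `maxPoolSize` is checked out, a check-out made on behalf of `killCursors` parks in the wait queue, while `close()` in turn waits on `killCursors`.

```ts
// Toy model only -- not the driver's ConnectionPool.
class TinyPool {
  private waiters: Array<(conn: string) => void> = [];
  constructor(private available: string[]) {}

  checkOut(): Promise<string> {
    const conn = this.available.pop();
    if (conn !== undefined) return Promise.resolve(conn);
    // Pool is saturated: park in the wait queue until a connection is checked in.
    return new Promise<string>(resolve => this.waiters.push(resolve));
  }

  checkIn(conn: string): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter(conn);
    else this.available.push(conn);
  }
}

// With all connections checked out and never checked back in, the
// `checkOut()` issued for the cleanup command never resolves. Closing the
// checked-out connections first restores capacity before cleanup runs.
```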
Alternate approach: since cleanup is best-effort, could we instead just attempt the commands we need to send with a low `waitQueueTimeoutMS`, to put an upper bound on how long we wait in case of a deadlock?
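A hedged sketch of what "bounded best-effort cleanup" could look like; `withUpperBound`, `sendKillCursors`, and `CLEANUP_TIMEOUT_MS` are illustrative names, not driver APIs, and the real `waitQueueTimeoutMS` is a pool-level option, so actually plumbing a per-command bound would differ.

```ts
const CLEANUP_TIMEOUT_MS = 1_000;

function withUpperBound<T>(work: Promise<T>, ms: number): Promise<T | undefined> {
  let timer: NodeJS.Timeout | undefined;
  const giveUp = new Promise<undefined>(resolve => {
    // Resolve rather than reject: cleanup is best-effort, so timing out is not an error.
    timer = setTimeout(() => resolve(undefined), ms);
  });
  return Promise.race([work, giveUp]).finally(() => {
    if (timer) clearTimeout(timer);
  });
}

// Usage (hypothetical): await withUpperBound(sendKillCursors(), CLEANUP_TIMEOUT_MS);
```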
The same proposal could be made for the `killCursors` run inside `cursor.close()` (not in here, `client.close()`, but normally by the user). But I think the reason we do "risk" the full `waitQueueTimeoutMS` and `serverSelectionTimeoutMS` is that I don't think we can predict good timeout values that fit some common denominator: the driver is general purpose and deployed in many ways, so there is no common case to target (e.g. the timing for mongosh is vastly different from an edge server app).

Closing in-use connections separately is specifically so our connection count isn't at `maxPoolSize` when we reach the next steps.
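A rough sketch of that ordering, under assumed pool internals (the `PoolLike` shape and `destroy()` here are illustrative, not the driver's real types): destroy checked-out connections first so the pool is no longer at `maxPoolSize`, then let the cleanup commands check out idle or fresh connections.

```ts
// Illustrative only: the real ConnectionPool tracks checked-out connections
// differently and emits CMAP events when connections close.
interface ConnectionLike {
  destroy(): void;
}

interface PoolLike {
  checkedOut: Set<ConnectionLike>;
}

function closeCheckedOutConnections(pool: PoolLike): void {
  for (const conn of pool.checkedOut) {
    // Each destroyed connection frees a slot toward maxPoolSize immediately,
    // so the subsequent killCursors / abortTransaction / endSessions commands
    // are not forced to wait on a saturated pool.
    conn.destroy();
  }
}
```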
Can we add a test demonstrating that closing connections before `killCursors` prevents the deadlock you mentioned? That behavior seems worth capturing in a test.
Added a test, and I verified that if I comment out `closeCheckedOutConnections`, it fails.
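For reference, a hypothetical shape such a test could take (the PR's actual test and assertions may differ); it assumes a server reachable at `MONGODB_URI` and uses `maxPoolSize: 1` plus an open cursor to stand in for the saturated-pool case.

```ts
import { expect } from 'chai';
import { MongoClient } from 'mongodb';

describe('MongoClient.close()', () => {
  it('does not hang when all connections are checked out by open cursors', async () => {
    // maxPoolSize: 1 so a single open cursor can represent a saturated pool.
    const client = new MongoClient(process.env.MONGODB_URI!, { maxPoolSize: 1 });
    await client.connect();

    const cursor = client.db('test').collection('docs').find({}, { batchSize: 1 });
    await cursor.next(); // leave the server-side cursor open so close() must kill it

    // If checked-out connections were not closed first, close() could block
    // waiting for a free connection to send killCursors on.
    await client.close();
    expect(cursor.closed).to.be.true;
  });
});
```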