Avert the race between sending data and sending EOF #222

Merged · 3 commits · Jan 18, 2024

Conversation

@mmirate (Contributor) commented on Dec 18, 2023

This fixes #213. While verifying that, I noticed that an unrelated modification to client_exec_simple was also required for correct operation, and have included it here: the receiving event loop should not terminate early upon receiving the exit status, since SSH servers can and will send data after sending the exit status.

Also, I did something similar for the close operation.

Also, my initial attempt at this involved generalizing the pending_data queue so that it could also hold EOF packets. That approach doesn't work at all, but I left in place a refactor that should stop outgoing extended data from being transformed into regular data when placed into the pending deque.

In theory, an extra flag to restore the previous behavior might be worth adding for users who depend on it; but that would be a not-strictly-necessary breaking change to the public API.

I see one downside. Consider the use case, admittedly a very odd one IMO, where a channel has multiple owners that all constantly send a large volume of data (so that the pending deque never empties while the sending continues) and do not coordinate among themselves on when the channel is done being written to.

In that scenario, the status quo means that the first owner who wants an EOF will indeed cause the EOF to be sent promptly, but imprecisely. With this PR's change, the EOF might not be sent until the owners unanimously agree to stop sending data. The only way I can see to tolerate this would be to add some way to notify the user that a given channel's pending data buffer has been completely flushed: owners of singly-owned channels that have scheduled all of their outgoing data could await that notification before sending EOF, while owners of multiply-owned channels would remain free to send EOF as soon as they please.
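The deferred-EOF behavior this PR describes can be modeled with a minimal sketch. All names here (ChannelState, Packet, etc.) are hypothetical illustrations, not russh's actual types: the point is only that EOF becomes a pending flag that trails the data deque onto the wire, instead of being sent immediately and racing ahead of queued data.

```rust
use std::collections::VecDeque;

// What actually went out "on the wire", in order.
#[derive(Debug, PartialEq)]
enum Packet {
    Data(Vec<u8>),
    Eof,
}

struct ChannelState {
    pending_data: VecDeque<Vec<u8>>,
    pending_eof: bool,
    sent: Vec<Packet>, // stands in for the transport
}

impl ChannelState {
    fn new() -> Self {
        Self { pending_data: VecDeque::new(), pending_eof: false, sent: Vec::new() }
    }

    // Data that cannot be sent immediately is queued.
    fn queue_data(&mut self, buf: Vec<u8>) {
        self.pending_data.push_back(buf);
    }

    // The fix: EOF is recorded as pending rather than sent right away,
    // so it can never overtake data still sitting in the deque.
    fn request_eof(&mut self) {
        self.pending_eof = true;
        self.flush();
    }

    // Called whenever the transport can accept more packets.
    fn flush(&mut self) {
        while let Some(buf) = self.pending_data.pop_front() {
            self.sent.push(Packet::Data(buf));
        }
        // Only once the data deque has fully drained does the EOF go out.
        if self.pending_eof {
            self.sent.push(Packet::Eof);
            self.pending_eof = false;
        }
    }
}
```

With this model, queuing data and then requesting EOF always yields the data first and the EOF last, which is the ordering the PR title promises.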

@Eugeny (Owner) commented on Dec 22, 2023

Do you think it would make sense in this context to simply refuse to send more data once pending_eof has been set? This is in line with the protocol spec and avoids the case you're describing. It would be a breaking change, since .data() would have to return a Result, but that's fine IMO.

@mmirate (Contributor, author) commented on Dec 27, 2023

> Do you think it would make sense in this context to simply refuse sending more data once pending_eof has been set? This is in line with the protocol spec and avoids the case you're describing. There'll be a breaking change since .data() would have to return a Result but that's fine IMO.

That would be ideal if there were a way to raise that error at Channel::data, e.g. by giving the Channel an eventually-consistent view of the pending_eof flag of the Session to which it is connected, so that the Channel alone would be responsible for checking pending_eof and raising the error before writing anything to the mpsc channel. Since the underlying communication from the Channel to the Session is an mpsc channel, a call to Channel::data that ends up triggering an error at {client,server}::Session::data cannot actually receive that error, and may have already returned to its caller by the time the error occurs. So what would happen instead is: Channel::data's signature would remain unchanged, {client,server}::Session::data would become fallible, and if data were sent via Channel::data while EOF is pending, the error would be raised in the main loop, cutting the entire connection. I don't think that behavior would make sense.
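The propagation problem described above is inherent to fire-and-forget mpsc messaging, and can be shown with a small sketch using std::sync::mpsc. The names (Msg, channel_data, session_loop) are hypothetical stand-ins, not russh's API: the point is that send() returns Ok as soon as the message is queued, so the sender cannot observe an error the receiver raises later.

```rust
use std::sync::mpsc;
use std::thread;

enum Msg {
    Data(Vec<u8>),
    Eof,
}

// The "Channel" side: Ok(()) here only means the message was queued,
// not that the "Session" accepted it.
fn channel_data(tx: &mpsc::Sender<Msg>, buf: Vec<u8>) -> Result<(), mpsc::SendError<Msg>> {
    tx.send(Msg::Data(buf))
}

// The "Session" side: it discovers the protocol violation only after
// the fact, when the original caller has long since returned.
fn session_loop(rx: mpsc::Receiver<Msg>) -> Result<usize, &'static str> {
    let mut eof_seen = false;
    let mut accepted = 0;
    for msg in rx {
        match msg {
            Msg::Eof => eof_seen = true,
            Msg::Data(_) if eof_seen => return Err("data sent after EOF"),
            Msg::Data(_) => accepted += 1,
        }
    }
    Ok(accepted)
}
```

Both sends on the Channel side succeed, yet the error surfaces only inside session_loop; the only places left to report it are the main loop or (as discussed below) a shared flag checked before the send.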

@Eugeny (Owner) commented on Dec 28, 2023

What if we let ChannelParams/Channel share pending_eof through a Mutex<Arc> similarly to how ChannelTx instances already share their window_sizes?

There's still one edge case left where the user calls .eof() and then .data() again while the EOF message is still queued in the mpsc channel, but this behaviour is against the spec anyway and that data could be simply discarded by the Session IMO.

@mmirate (Contributor, author) commented on Jan 8, 2024

> What if we let ChannelParams/Channel share pending_eof through a Mutex<Arc> similarly to how ChannelTx instances already share their window_sizes?

Since this is a boolean and the edge-case in question would still be handled properly if a new value of the flag arrived a little late, I believe that an Arc<AtomicBool>, loaded and stored with Relaxed ordering, would probably suffice.
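A minimal sketch of the Arc<AtomicBool> idea, assuming hypothetical types (ChannelHandle here is an illustration, not russh's Channel): the handle refuses new data once the flag is set, and Relaxed ordering suffices because a slightly stale read merely delays the error rather than corrupting any state.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

struct ChannelHandle {
    // Shared with the Session side, which could also set it when it
    // actually queues the EOF packet.
    pending_eof: Arc<AtomicBool>,
}

impl ChannelHandle {
    // Check the flag before touching the mpsc channel, so the error is
    // raised here, at the caller, instead of later in the main loop.
    fn data(&self, _buf: &[u8]) -> Result<(), &'static str> {
        if self.pending_eof.load(Ordering::Relaxed) {
            return Err("EOF already requested on this channel");
        }
        // ...would forward the data to the Session via the mpsc channel...
        Ok(())
    }

    fn eof(&self) {
        self.pending_eof.store(true, Ordering::Relaxed);
    }
}
```

Any clone of the Arc sees the flag eventually; the remaining edge case (data already queued in the mpsc channel behind an EOF) would still have to be discarded on the Session side, as suggested above.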

The tricky part is where to do the stores.

> similarly to how ChannelTx instances already share their window_sizes?

If I'm understanding the code architecture correctly, the ChannelParams maintains its own accounting of the window size; it is not connected to the ChannelRef's mutex, but is resynchronized whenever the peer (or a ChannelRef owner) requests an adjustment. The ChannelParams are stored on the Encrypted, but the ChannelRef holding the mutex they are resynchronized with is available only on the Session, not on the Encrypted, whose public methods are the ones that ought to set pending_eof when pending_data is nonempty. This makes it unclear how to proceed.

@Eugeny Eugeny merged commit 54595f3 into Eugeny:main Jan 18, 2024
4 checks passed
@mmirate mmirate deleted the avert-the-datarace branch January 8, 2025 16:21
Successfully merging this pull request may close these issues: channel.data drops end of data