Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

webtransport: add message framing to allow graceful stream closing #606

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

achingbrain
Copy link
Member

When closing streams some implementations do not guarantee all data has been sent before closing the stream.

If this sounds familiar it's because we observed exactly the same behaviour with WebRTC.

This PR adds opt-in message framing to WebTransport streams that lets us introduce a similar FIN/FIN_ACK mechanism that guarentees the remote has received all data sent on a stream before we close it.

It uses a Noise extension to signal to the remote that we will be framing all outgoing messages, with a recommendation that the framing is omitted if the remote does not signal the same in response.

This is an attempt to make this a backwards-compatible change which will be a lot less disruptive, even if the status quo is a lot more unsafe.

We're still waiting a definitive answer from the Chromium team as to whether the data loss on closing is a bug or a feature but I thought I'd open this early so we can move quickly if they confirm it's an oversight.

When closing streams some implementations [do not guarantee all data has been sent](https://issues.chromium.org/issues/326887753) before closing the stream.

If this sounds familiar it's because we observed [exactly the same behaviour with WebRTC](https://issues.chromium.org/issues/40072842#comment5).

This PR adds opt-in message framing to WebTransport streams that
lets us introduce a similar `FIN`/`FIN_ACK` mechanism that guarentees
the remote has received all data sent on a stream before we close it.

It uses a Noise extension to signal to the remote that we will be
framing all outgoing messages, with a recommendation that the framing
is omitted if the remote does not signal the same in response.

This is an attempt to make this a backwards-compatible change which
will be a lot less disruptive, even if the status quo is a lot more
unsafe.

We're still waiting a definitive answer from the Chromium team as to
whether the data loss on closing is a bug or a feature but I thought
I'd open this early so we can move quickly if they confirm it's an
oversight.
@achingbrain achingbrain requested review from MarcoPolo and sukunrt March 5, 2024 14:58
@achingbrain
Copy link
Member Author

Opened as a draft pending a response from the Chromium team.

message Message {
enum Flag {
// The sender will no longer send messages on the stream.
FIN = 0;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is an issue, but the flag field is optional and the 0 value starts closing the stream.

If you were to write a data message following proto3 semantics which let you omit the flag field, then read the message using proto2 semantics, the flag field would get initialised to it's default value of 0 and the receiver would think the sender wanted to close the stream.

We can work around this by having a NIL flag as the 0 value, which would be ignored by any receiver when read by a proto2 or proto3 decoder.

Or we expect implementations to apply proto3 semantics because that's what the definition here says.

@MarcoPolo
Copy link
Contributor

MarcoPolo commented Mar 5, 2024

This must be a bug in the Chromium implementation. The W3C spec says that the WebTransportSendStream implements WritableStream which flushes all queued data on .close(). If the writer intends to discard queued data they should use .abort() instead.

I would prefer this fixed in Chromium instead of changing the spec.

What does the Firefox implementation do?

@mxinden
Copy link
Member

mxinden commented Mar 5, 2024

What does the Firefox implementation do?

Maybe helpful as an entrypoint:

https://github.com/mozilla/neqo/blob/a56e092cedad14039a8ff1e72155f099dfa489c0/neqo-http3/src/connection.rs#L1095-L1112

@achingbrain
Copy link
Member Author

I would prefer this fixed in Chromium instead of changing the spec.

I completely agree, and it's why this PR is in draft pending the outcome of the related Chromium issue.

Unfortunately, what we've learnt from the similar issue with Chromium's WebRTC implementation is that even if they agree that it's a bug, there's no guarantee it'll be fixed in a timely fashion or at all, so the pragmatic thing to do is to work around it and move on.

The nice thing here (I hope) is that the proposed solution is opt-in and implementations will not frame messages if the remote doesn't agree to do it too, so if this ever gets fixed in Chromium, once enough people have upgraded their browsers we can just stop sending the webtransport_message_framing flag and deprecate it in the spec.

What does the Firefox implementation do?

A good question and one that's been hard to answer observationally.

Support for the serverCertificateHashes option has only just made it into a nightly build and doesn't appear to work with either go or js WebTransport servers just yet so some more investigation is needed there.

@marten-seemann
Copy link
Contributor

I don’t share your pessimism regarding getting this bug fixed in Chrome. Other than WebRTC, WebTransport is still under active development. Also keep in mind that there will be WebTransport use cases (e.g. MoQ) that will probably surface this bug.

@achingbrain
Copy link
Member Author

Well, we'll see. It usually pays to plan for the worst and hope for the best.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Triage
Development

Successfully merging this pull request may close these issues.

4 participants