EncodedChunks confusingly provide access to constructed ".data" -- despite the decoder making an internal copy. #80

dalecurtis · 2020-09-22T20:03:52Z

As folks are starting to play with the API, at least one person has hit a footgun where they modified the encoded chunk after sending it to the decoder. We should just copy before control returns to JS to avoid this.

Obviously this prevents a zero copy on encoded data path, but generally copying the input data isn't expensive.

koush · 2020-09-22T20:08:34Z

I am that footgun user.
In my head, EncodedVideoChunk seemed like a simple dictionary object (which is why I modified post creation), but I guess more is happening under the hood there.

This was before I had thought the data had been passed to the decoder (via the decode) call: simply constructing the EncodedVideoChunk object with values has side effects.

sandersdan · 2020-09-22T21:22:24Z

There is potential confusion either way. One solution is to be more verbose; instead of an EncodedChunk constructor we could have a static EncodedChunk::CopyFrom(). We can also remove the .data member and provide a copyInto() method (like Plane has), and that would remove a copy in the Chrome implementation.

It does mean that JS codec implementations would have to copy their chunks at least once, but that is likely worth the cost.

tidoust · 2020-09-23T07:48:39Z

Regarding memory copies, I note that ongoing discussions in the workshop on Machine Learning on the Web suggest that, in the case of ML inference applied to media frames, memory copies throughout the processing pipeline (fetch, decoding, processing, rendering) may take as much time as the ML inference itself. See the Memory copies issue for details.

I'm proposing a dedicated virtual breakout session during TPAC breakout week (26-30 October) on Memory copies & zero-copy operations on the Web, so that we can better understand when copies are needed, expensive/inexpensive, etc.

dalecurtis · 2020-09-23T16:35:18Z

Thanks. The one API that would help us avoid this is an immutable ArrayBuffer.

Note: We're only discussing copies of encoded data here, not the raw frames. So in real-time cases I wouldn't expect these copies to go over ~15MBit/s. The copies may slow down faster-than-realtime encoding, but I would still expect them to be smaller than the raw frames used by ML.

koush · 2020-09-23T17:20:54Z

This was part of the reason I managed to run into this bug: I assumed the underlying pipeline would be zero copy (as that's something I strive towards for performance reasons). Ie, the ArrayBuffer I pass in belongs to the decoder until the call/promise returns.

Similar to Android's MediaCodec's management of input/output buffers that are queued/dequeued.

chcunningham · 2020-12-04T23:23:22Z

See related issue #104.

One solution is to be more verbose; instead of an EncodedChunk constructor we could have a static EncodedChunk::CopyFrom()

Right now we have 2 copies.

at creation of EncodedChunk
sometime after decode() is called (when the decode is actually processed)

#104 is about eliminating copy number 2. But what if we eliminate both. Rather than EncodedChunk::CopyFrom(), we could define EncodedChunk::TransferTo() (naming needs work), which would transfer the underlying bytes from input buffer, leaving neutering the callers reference.

sandersdan · 2020-12-04T23:40:13Z

General note: all of our APIs that take ArrayBuffers should probably have a transfer flag or transferList.

Either way we do it, one of the two operations will be eager:

Encoder implementation outputs EncodedChunk: ideally we would not yet create an ArrayBuffer for the data.
Demuxer creates EncodedChunk: we should transfer an ArrayBuffer to avoid copying. (Note that there may be cases where copying is actually better, EncodedChunks are often small.)

If we can avoid exposing any implementation details of EncodedChunk then it could be generic over different backings. (That would be ArrayBuffer and DecoderBuffer in Chrome.)

Provides a readInto() method for copying the data out. Fixes #80. Also, see #127.

padenot · 2021-05-05T11:39:20Z

General note: all of our APIs that take ArrayBuffers should probably have a transfer flag or transferList.

#104, but also #212.

chcunningham · 2021-05-12T06:57:15Z

Triage note: 'breaking', as we are removing an attribute (replaced by a method).

dalecurtis changed the title ~~Decoder implementations should always copy the encoded chunks before returning control to JS.~~ EncodedChunks confusingly provide access to constructed ".data" -- despite the decoder making an internal copy. Sep 22, 2020

chcunningham added this to the Q4 2020 milestone Dec 4, 2020

chcunningham self-assigned this Jan 21, 2021

chcunningham modified the milestones: Q4 2020, Q1 2021 Jan 21, 2021

chcunningham mentioned this issue Feb 18, 2021

Should the spec require copying init.data when constructing EncodedAudioChunk and EncodedVideoChunk? #127

Closed

chcunningham added a commit that referenced this issue Apr 9, 2021

Make Encoded*Chunk interfaces immutable.

5b1fb6e

Provides a readInto() method for copying the data out. Fixes #80. Also, see #127.

chcunningham mentioned this issue Apr 9, 2021

Make Encoded*Chunk interfaces immutable. #174

Merged

chcunningham mentioned this issue Apr 20, 2021

Encoded frame IDLs share definitions with WebCodecs w3c/webrtc-encoded-transform#99

Open

chcunningham added the breaking Interface changes that would break current usage (producing errors or undesired behavior). label May 12, 2021

chcunningham closed this as completed in #174 May 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

EncodedChunks confusingly provide access to constructed ".data" -- despite the decoder making an internal copy. #80

EncodedChunks confusingly provide access to constructed ".data" -- despite the decoder making an internal copy. #80

dalecurtis commented Sep 22, 2020

koush commented Sep 22, 2020 •

edited

Loading

sandersdan commented Sep 22, 2020 •

edited

Loading

tidoust commented Sep 23, 2020

dalecurtis commented Sep 23, 2020

koush commented Sep 23, 2020 •

edited

Loading

chcunningham commented Dec 4, 2020

sandersdan commented Dec 4, 2020

padenot commented May 5, 2021

chcunningham commented May 12, 2021

EncodedChunks confusingly provide access to constructed ".data" -- despite the decoder making an internal copy. #80

EncodedChunks confusingly provide access to constructed ".data" -- despite the decoder making an internal copy. #80

Comments

dalecurtis commented Sep 22, 2020

koush commented Sep 22, 2020 • edited Loading

sandersdan commented Sep 22, 2020 • edited Loading

tidoust commented Sep 23, 2020

dalecurtis commented Sep 23, 2020

koush commented Sep 23, 2020 • edited Loading

chcunningham commented Dec 4, 2020

sandersdan commented Dec 4, 2020

padenot commented May 5, 2021

chcunningham commented May 12, 2021

koush commented Sep 22, 2020 •

edited

Loading

sandersdan commented Sep 22, 2020 •

edited

Loading

koush commented Sep 23, 2020 •

edited

Loading