
Expose RTCEncodedAudioFrame interface in AudioWorklets #226

Open

tonyherre opened this issue Mar 8, 2024 · 13 comments

Comments

@tonyherre
Contributor

Working with encoded frames from worklets, particularly RTCEncodedAudioFrames from AudioWorklets, would be very useful for apps, allowing them to choose the best execution environment for encoded media processing, beyond just Window and DedicatedWorker.

ReadableStream and WritableStream already have Exposed=Worklet, so transferring the streams of encoded frames would make sense and would allow more performant implementations than e.g. requiring apps to copy data & metadata in a DedicatedWorker before going to the worklet / after returning from it.
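
For illustration, a minimal sketch of the wiring this would enable, assuming the proposed Exposed=Worklet change lands; the file names and processor name are placeholders, and `receiver` is an RTCRtpReceiver from an existing RTCPeerConnection:

```js
// main.js: hook an encoded-transform worker up to an AudioWorklet.
const audioCtx = new AudioContext();
await audioCtx.audioWorklet.addModule('frame-processor.js');
const node = new AudioWorkletNode(audioCtx, 'frame-processor');

// Bridge the worker and the worklet with a MessageChannel, so the worker
// can transfer its stream of encoded frames straight to the worklet.
const { port1, port2 } = new MessageChannel();
const worker = new Worker('transform-worker.js');
worker.postMessage({ port: port1 }, [port1]);
node.port.postMessage({ port: port2 }, [port2]);

receiver.transform = new RTCRtpScriptTransform(worker, {});
```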

I propose we add Worklet to the Exposed lists for RTCEncodedVideoFrame and RTCEncodedAudioFrame, and likely follow up with similar changes for the interfaces in the webcodecs spec.

CC @alvestrand @aboba @guidou @youennf

@youennf
Collaborator

youennf commented Mar 8, 2024

In typical WebRTC applications, there is a thread for audio capture/rendering, and there are threads dedicated to networking and media handling.
The former (which maps to AudioWorklet) usually runs at a higher priority than the latter (which map to DedicatedWorkers).

I am not sure allowing encoding/networking in an AudioWorklet is a good idea.
For instance, WebCodecs constructs are not supported in worklets.
WebRTC encoded transform streams are transferable but RTCRtpScriptTransformer is not.

CC @padenot and @jan-ivar.

@padenot

padenot commented Mar 8, 2024

For video, just no.

Decoding audio on real-time threads has been seen in very specific scenarios, and can be done safely if the decoder is real-time safe, etc., all the usual stuff. Encoding, I've never seen it, and I don't really know how useful it would be or what it would bring.

The Web Codecs API, however, won't work well (or at all) in an AudioWorklet, because the AudioWorklet is inherently and by necessity a synchronous environment, while the Web Codecs API is asynchronous. I had proposed a synchronous API for Web Codecs, and explained why (in w3c/webcodecs#19), but we haven't done it.
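
To make the mismatch concrete, a sketch (the decoder calls in the comments are the regular async WebCodecs API, which isn't exposed in worklets today):

```js
// processor.js: why an async decode API doesn't fit the real-time callback.
class AsyncMismatchProcessor extends AudioWorkletProcessor {
  process(inputs, outputs) {
    // process() must fill `outputs` before it returns, synchronously.
    // An AudioDecoder, by contrast, returns from decode() immediately and
    // delivers AudioData later through its output callback:
    //   decoder.decode(chunk); // no decoded samples available yet
    // so the samples for *this* 128-frame quantum cannot come from an
    // async decoder invoked here.
    return true;
  }
}
registerProcessor('async-mismatch-processor', AsyncMismatchProcessor);
```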

I side with @youennf on this. Communicating with an AudioWorkletProcessor is not hard and can easily be done extremely efficiently. Any claim of being able to do a "more performant" implementation needs to be backed by something.

Once we have apps for which the limiting factor is the packetization latency or something in that area, we can revisit.

@tonyherre
Contributor Author

tonyherre commented Mar 8, 2024

WebRTC encoded transform streams are transferable but RTCRtpScriptTransformer is not.

The RTCRtpScriptTransformer.readable is transferable, so it could be posted to a worklet within the current shape.
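
A sketch of that worker-side hand-off, reusing the MessageChannel wiring from the earlier sketch (workletPort is the port connected to the AudioWorkletNode):

```js
// transform-worker.js: forward the stream of encoded frames to the worklet.
let workletPort;
onmessage = ({ data }) => { workletPort = data.port; };

onrtctransform = ({ transformer }) => {
  // transformer.readable is transferable, so the frames can be delivered
  // to the worklet without any further hop through this thread.
  workletPort.postMessage({ frames: transformer.readable },
                          [transformer.readable]);
};
```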

The former use case @youennf mentioned - decoding + rendering audio - is indeed the one I'm interested in getting onto a worklet. IIUC libwebrtc does its audio decoding in the realtime thread, just before rendering, so that concept isn't all that wild.

Any claim of being able to do a "more performant" implementation needs to be backed by something.

Transferring the ReadableStream to the worklet would mean frames could be delivered directly there. Requiring JS work to be done elsewhere first would mean visiting another JS thread, scheduling an event there, etc., so roughly double the overhead, plus the cost of allocating the intermediate objects to be re-transferred. I can see if I can get some more concrete numbers, but there's clearly additional work required of the app and UA which could be skipped with this.

@jan-ivar
Member

jan-ivar commented Mar 8, 2024

The RTCRtpScriptTransformer.readable is transferable, so it could be posted to a worklet within the current shape.

This produces a "readable side in another realm", where the original realm feeding that readable is the dedicated worker provided by the webpage, at least in current implementations of that surface.

requiring apps to copy data & metadata in a DedicatedWorker before going to the worklet / after returning from it.

Can't the app just transfer the frame.data ArrayBuffer instead? i.e. not a copy. It'd be interesting to see the numbers.
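
For instance, a sketch of that zero-copy hand-off in the transform worker (workletPort is again assumed to be a MessagePort wired to the worklet):

```js
// transform-worker.js: transfer each frame's payload instead of copying it.
onrtctransform = async ({ transformer }) => {
  const reader = transformer.readable.getReader();
  for (;;) {
    const { value: frame, done } = await reader.read();
    if (done) break;
    const buf = frame.data; // ArrayBuffer holding the encoded payload
    // Transferring detaches buf, so the frame is consumed here rather
    // than written back to transformer.writable.
    workletPort.postMessage(buf, [buf]);
  }
};
```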

Transferring the readablestream to the worklet would mean frames could be delivered directly there.

This sounds like whatwg/streams#1124.

@jan-ivar
Member

jan-ivar commented Mar 8, 2024

IIUC libwebrtc does its audio decoding in the realtime thread, just before rendering, so that concept isn't all that wild.

I wasn't aware. Is this a special-case for element.srcObject = new MediaStream([transceiver.receiver.track]);?

@padenot

padenot commented Mar 11, 2024

I wasn't aware. Is this a special-case for element.srcObject = new MediaStream([transceiver.receiver.track]);?

It's an implementation concern, script doesn't know about it.

@Orphis

Orphis commented May 2, 2024

Would you be open to just reframing the issue as exposing RTCEncodedAudioFrame in an AudioWorklet context?

@padenot

padenot commented May 2, 2024

If we want to do decoding in real-time threads, in the AudioWorkletGlobalScope, here are some rough steps (a hypothetical sketch follows the list):

  • Adding a way to get the encoded audio packet from an RTCEncodedAudioFrame, without allocations, and in a real-time-safe manner
  • Adding a sync decoding interface to Web Codecs that doesn't use JS objects (just buffer in, buffer out, no allocs, etc.), so it's real-time safe
  • Exposing this to the AudioWorkletGlobalScope
  • Adding some transferring capabilities to Web Codecs objects so that one could set up a decoder outside the real-time thread and send it over
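
Purely as an illustration of what those steps might add up to; nothing below exists today, and decodeSync, the transferred decoder, and nextPacketInto are all hypothetical names:

```js
// processor.js: hypothetical real-time-safe decode path (no such API exists).
class SyncDecodeProcessor extends AudioWorkletProcessor {
  constructor(options) {
    super();
    // Hypothetical: a decoder configured on another thread and sent over.
    this.decoder = options.processorOptions.decoder;
    // Pre-allocated scratch buffers, so process() never allocates.
    this.packet = new Uint8Array(4096);
    this.pcm = new Float32Array(4096);
  }
  // Stub: an app-defined, allocation-free packet source, e.g. a
  // SharedArrayBuffer ring buffer filled by another thread.
  nextPacketInto(dest) { return 0; }
  process(inputs, outputs) {
    const len = this.nextPacketInto(this.packet);
    if (len > 0) {
      // Hypothetical synchronous call: buffer in, buffer out, no JS objects.
      const samples = this.decoder.decodeSync(this.packet.subarray(0, len), this.pcm);
      const out = outputs[0][0]; // 128-frame render quantum, channel 0
      out.set(this.pcm.subarray(0, Math.min(samples, out.length)));
    }
    return true;
  }
}
registerProcessor('sync-decode-processor', SyncDecodeProcessor);
```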

@Orphis

Orphis commented May 2, 2024

I don't think this should necessarily be tied to WebCodecs. While they can be useful in some cases, they aren't going to cover all use cases, such as experimental codec work that is inherently implemented in JS / WASM.

@padenot

padenot commented May 2, 2024

If you're not using Web Codecs, there's no benefit to exposing RTCEncodedAudioFrame. Just extract the data into a buffer and communicate it to the AudioWorkletGlobalScope. This can be done today using postMessage or a SharedArrayBuffer.
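
A minimal sketch of the SharedArrayBuffer route, for example (assumes cross-origin isolation; the ring layout is ad hoc, with length framing and overflow handling elided):

```js
// transform-worker.js: copy each encoded packet into a shared ring buffer.
const sab = new SharedArrayBuffer(64 * 1024);
const writeIdx = new Int32Array(sab, 0, 1); // total bytes written (mod size)
const ring = new Uint8Array(sab, 4);
// `sab` is handed to the AudioWorkletNode once, e.g. via processorOptions.

onrtctransform = async ({ transformer }) => {
  const reader = transformer.readable.getReader();
  for (;;) {
    const { value: frame, done } = await reader.read();
    if (done) break;
    const src = new Uint8Array(frame.data);
    const w = Atomics.load(writeIdx, 0);
    for (let i = 0; i < src.length; i++) ring[(w + i) % ring.length] = src[i];
    Atomics.store(writeIdx, 0, (w + src.length) % ring.length);
    // The AudioWorkletProcessor polls writeIdx in process() and consumes
    // the new bytes with no postMessage round trip.
  }
};
```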

@youennf
Collaborator

youennf commented May 2, 2024

Instead of transferring streams to worklets, the alternative would be to let the script transform take a worklet instead of a worker.
That is probably the most straightforward approach, but it still needs discussion in terms of scenarios and pros/cons.

@Orphis

Orphis commented May 15, 2024

@aboba Can we add this to the agenda for next week's interim?

tonyherre changed the title from "Expose RTCEncoded*Frame interfaces in Worklets" to "Expose RTCEncodedAudioFrame interface in AudioWorklets" on Jun 24, 2024