
Add API to control encode complexity #191

Open
ssilkin opened this issue Dec 11, 2023 · 10 comments
Assignees
Labels
enhancement New feature or request

Comments

ssilkin commented Dec 11, 2023

Background

Encode complexity settings are hardcoded for WebRTC's built-in encoders (libvpx VP8/VP9, libaom AV1 and OpenH264). The settings depend on platform, number of CPU cores and video resolution and are optimized to provide acceptable performance on a wide range of devices. In some scenarios these default settings are suboptimal. Access to encode complexity settings would allow applications to optimize the trade-off between device resource usage and compression efficiency for their use cases. For example, a higher encode complexity mode can be used to achieve better video quality and/or to reduce video bitrate.

Proposed API

Add encodeComplexityMode to RTCRtpEncodingParameters:

enum RTCEncodeComplexityMode {
  "low",
  "normal",
  "high"
};

partial dictionary RTCRtpEncodingParameters {
  RTCEncodeComplexityMode encodeComplexityMode = "normal";
};

encodeComplexityMode specifies the encoding complexity mode. "normal" is the default mode that provides a balance between device resource usage and compression efficiency suitable for most use cases. Relative to "normal" mode:

  • "low" mode results in lower device resource usage and worse compression efficiency
  • "high" mode results in higher device resource usage and better compression efficiency

The user agent SHOULD configure the encoder according to the encoding complexity mode specified. Changes in encoding performance are codec specific and are not guaranteed.
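A hypothetical usage sketch of the proposed API, assuming the `encodeComplexityMode` field is available on the sender's encoding parameters (no user agent implements it yet):

```javascript
// Sketch only: applies the proposed encodeComplexityMode to every encoding
// of an RTCRtpSender. encodeComplexityMode is the field proposed above and
// is not implemented by any user agent today.
async function setEncodeComplexity(sender, mode) {
  const params = sender.getParameters();
  for (const encoding of params.encodings) {
    encoding.encodeComplexityMode = mode; // "low" | "normal" | "high"
  }
  await sender.setParameters(params);
}

// Example: spend more CPU on an important stream for better compression:
//   await setEncodeComplexity(videoSender, "high");
```

Because the field lives on RTCRtpEncodingParameters, it can be set per encoding (e.g. per simulcast layer) rather than per page.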

Details

A hardcoded mapping will be used to convert the complexity mode to encoder settings (CPUUSED in the case of the SW libaom/libvpx encoders, KEY_COMPLEXITY in the case of Android MediaCodec, etc.).
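As an illustration only, such a mapping could look like the sketch below. The concrete numbers are made-up placeholders, not values any implementation would ship:

```javascript
// Hypothetical complexity-mode -> encoder-setting table. For libvpx/libaom,
// a *lower* cpu_used value means the encoder spends *more* CPU for better
// compression; Android MediaCodec's KEY_COMPLEXITY grows with effort.
// All numbers here are illustrative placeholders.
const COMPLEXITY_SETTINGS = {
  low:    { cpuUsed: 10, keyComplexity: 0 },
  normal: { cpuUsed: 8,  keyComplexity: 1 },
  high:   { cpuUsed: 6,  keyComplexity: 2 },
};

function encoderSettingsFor(mode) {
  // Unknown modes fall back to the "normal" defaults.
  return COMPLEXITY_SETTINGS[mode] ?? COMPLEXITY_SETTINGS.normal;
}
```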

Relative differences in encoding performance between encode complexity modes are not fixed and may change between binary releases due to changes in the underlying encoders and/or compilation settings.

Orphis self-assigned this Dec 11, 2023
Orphis added the "enhancement" (New feature or request) label Dec 11, 2023
Orphis (Contributor) commented Dec 14, 2023

@aboba Can we add this to the grab bag in the January interim please?

fippo (Contributor) commented Jan 16, 2024

This also maps nicely to the Opus "complexity" concept (0-10, set via OPUS_SET_COMPLEXITY(x) / OPUS_SET_COMPLEXITY_REQUEST), where the "normal" value chosen by most applications is 9, with 5 being the "low" default on mobile platforms.
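Those Opus numbers could be expressed as the same kind of mode-to-setting mapping. The values 9 and 5 come from the comment above; the "high" value of 10 is an assumption added for symmetry:

```javascript
// Maps the proposed complexity modes onto the Opus 0-10 complexity scale
// (applied natively via OPUS_SET_COMPLEXITY). 9 ("normal") and 5 ("low")
// are the values cited above; 10 for "high" is an assumption.
function opusComplexityFor(mode) {
  switch (mode) {
    case "low":  return 5;  // mobile default
    case "high": return 10; // maximum effort (assumed)
    default:     return 9;  // "normal", chosen by most applications
  }
}
```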

youennf (Contributor) commented Jan 16, 2024

From the discussion, it seems we want to do two things:

  1. Be able to say that one stream we encode is less important than another. This would allow UAs to fine-tune their degradation heuristics.
  2. Tell the UA that we prefer high quality or long battery life, and the UA might then tweak its encoder accordingly.

For 1, we already have https://w3c.github.io/webrtc-priority/#rtc-priority-type, so maybe we do not need anything?
For 2, I wonder whether this should not be a global setting at the scope of the peer connection.
Maybe a preference with values like "quality" and "powerEfficiency" (plus "") would be good enough.

Note also that, if CPU is what is being controlled, this could also be applied to the receiving side, which could somewhat reduce its complexity by changing the rendering (longer audio chunks, dropping the frame rate to 30 fps, ...).

Orphis (Contributor) commented Jan 17, 2024

While it could be used as a way to prioritize resources when you have multiple streams, it's not just that: you may decide that a single stream still needs the setting.

And applying it to the whole page doesn't work either, as the setting may need to change dynamically during the application's lifetime, and I don't see how a global setting would be an effective API for that.

Another use case: when the application detects bad network quality while resources are available, it may ask for more resources to be spent on encoding an important media stream. While the user agent can try to do some of that automatically, it is unreasonable to expect it to go past a threshold that could negatively impact other metrics, as this is usually a trade-off.

The application may decide that in some circumstances this is OK, and this would be the setting to allow that. But you need local knowledge of the application and its usage to turn it on, and I believe the user agent sits at too high a level to infer all of that.

Also, FYI, I'd expect this setting to map to VideoToolbox's PrioritizeEncodingSpeedOverQuality setting, and maybe to QP settings like MinAllowedFrameQP and MaxAllowedFrameQP.

youennf (Contributor) commented Jan 18, 2024

> QP settings like MinAllowedFrameQP and MaxAllowedFrameQP.

These settings will have an impact on bitrate; I am not sure they will have any effect on CPU.

> PrioritizeEncodingSpeedOverQuality setting

I am not sure this one kicks in in the low-latency code path.
Overall, when hardware encoders are in use, CPU usage will probably not vary much.
I do not know how KEY_COMPLEXITY is mapped to HW realtime Android encoders, or what its impact is.
The OP mentions SW encoders; are those the target?

> the application detects a bad network quality and that resources are available

I am not exactly sure I understand how the application is supposed to know that resources are available, and I question how we can get interop between UAs.
For instance, CPU adaptation could be done either with the encoder complexity mode or with media adaptation, and ditto for bitrate.

The following might somehow work for video:

  • The web site sets the encoding complexity; the UA translates it to a fixed value.
  • When the UA hits bitrate or CPU contention, it is not expected to change the video encoder's complexity mode. Instead, it relies on the current degradation preference, reducing either frame rate or resolution.
  • The web site may further tweak the encoding complexity based on the observed sent frame rate / resolution / bitrate.

Is this how this is supposed to be implemented and used?

As for audio, it cannot really do media adaptation, and I am unclear how the application would know that CPU adaptation is needed.
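The web-site side of that loop could be sketched as follows, assuming the proposed `encodeComplexityMode` field exists. The stat field names follow webrtc-stats; the 24 fps threshold is an arbitrary example value:

```javascript
// Sketch: poll outbound-rtp stats and back off from "high" complexity to
// "normal" when the sent frame rate drops, per the loop described above.
// encodeComplexityMode is the proposed (unimplemented) field; minFps is an
// arbitrary example threshold.
async function adjustComplexity(sender, minFps = 24) {
  const report = await sender.getStats();
  for (const stat of report.values()) {
    if (stat.type !== "outbound-rtp" || stat.framesPerSecond === undefined) {
      continue;
    }
    const mode = stat.framesPerSecond < minFps ? "normal" : "high";
    const params = sender.getParameters();
    for (const encoding of params.encodings) {
      encoding.encodeComplexityMode = mode;
    }
    await sender.setParameters(params);
  }
}
```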

ssilkin (Author) commented Sep 17, 2024

@aboba, Bernard, would it be possible to reserve a time slot to discuss this API at the next WebRTC WG meeting?

aboba (Contributor) commented Sep 17, 2024

@ssilkin If you are attending TPAC 2024, perhaps we can add discussion of this issue to Erik Sprang's slot in the joint MEDIA/WEBRTC WG meeting. If he agrees, you can add the slide(s) here.

sprangerik commented
I can talk about this issue. I took the liberty of moving it from the joint webrtc/media wg session to the webrtc wg session on the 24th (taking some minutes from the stats I'll be talking about there). Hope that's ok!

aboba (Contributor) commented Sep 19, 2024

No problem.

ssilkin (Author) commented Sep 23, 2024

Thanks, @sprangerik!
