What is the granularity of a source? #940

youennf · 2023-03-09T16:17:12Z

It might make the spec easier to understand if we could get consensus on what the granularity of a source is.

We know that:

at the minimum, the source is shared between cloned tracks: a getUserMedia call is the minimum granularity for a source
a UA-wide source per device is not OK; device IDs are not shared between origins, for instance.
A source can also be at the global object level, or at the top-level global object level.

The spec in some places may assume 2 or 3.
The most flexible option is probably option 1; this was used like this in the past in some Safari cases (where multiple cameras cannot run at the same time for instance).

youennf · 2023-03-14T14:18:36Z

If we define an API to mute tracks, we will probably define it as muting the source.
We probably need to define the granularity well enough that the mute API has cross-browser interop.

jan-ivar · 2023-03-16T20:36:16Z

I see benefit in discussing granularity of a source irrespective of mute.

Muted is a UA-controlled property of the track, not the source.¹ Keeping it that way may be important to solve w3c/mediacapture-extensions#39, where one application with transient activation calling track.unmute() might not translate to it being OK to unmute all tracks in all circumstances.

There are privacy issues in the User Agent space here, which we've traditionally handled in the "Best Practice" section.

I think UAs are handling muting well (but differently) today, so I see no reason to further restrict them here.

¹ The spec's mention of "state of the source" seems to exist solely to ensure consistency between clones created by getUserMedia instead of track.clone().

jan-ivar · 2023-03-17T14:10:43Z

> We know that:
>
> at the minimum, the source is shared between cloned tracks: a getUserMedia call is the minimum granularity for a source

This would mean that track1 and track3 here would have the same source, but track2 would not?

const [track1] = (await navigator.mediaDevices.getUserMedia({video: {deviceId: {exact: deviceId}}})).getVideoTracks();
const [track2] = (await navigator.mediaDevices.getUserMedia({video: {deviceId: {exact: deviceId}}})).getVideoTracks();
const track3 = track1.clone();

What benefit would this distinction have? I don't think it exists in Firefox and Chrome. Does it exist in Safari?

This interpretation is NOT supported by the spec IMHO, because track2.stop() would "set [[devicesLiveMap]][deviceId] to false", turning off the privacy indicator in the URL bar prematurely while the other two tracks are still live. That fails on privacy and is not how any browser operates today.

The minimum granularity has to be at least the JS realm for the spec to work, i.e. the mediaDevices object.
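The argument above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the spec algorithm or any browser's implementation: `ToySource`, `makeToyRealm`, and `demo` are hypothetical names, tracks are plain objects, and the "privacy indicator" is modeled as nothing more than the value stored in a `devicesLiveMap`-like `Map`.

```javascript
// Toy model only: every name here is hypothetical, not a real browser API.
class ToySource {
  constructor(deviceId, devicesLiveMap) {
    this.deviceId = deviceId;
    this.devicesLiveMap = devicesLiveMap;
    this.liveTracks = new Set();
  }

  createTrack() {
    const track = {
      readyState: 'live',
      // Clones always share the source of the track they were cloned from.
      clone: () => this.createTrack(),
      stop: () => {
        if (track.readyState === 'ended') return;
        track.readyState = 'ended';
        this.liveTracks.delete(track);
        // When the last track of this source ends, the source stops
        // and the device is marked as no longer live.
        if (this.liveTracks.size === 0) {
          this.devicesLiveMap.set(this.deviceId, false);
        }
      },
    };
    this.liveTracks.add(track);
    this.devicesLiveMap.set(this.deviceId, true);
    return track;
  }
}

// granularity 'call': a new source for every getUserMedia call.
// granularity 'realm': one source per device within the realm.
function makeToyRealm(granularity) {
  const devicesLiveMap = new Map();
  const sources = new Map();
  return {
    devicesLiveMap,
    getUserMedia(deviceId) {
      if (granularity === 'call') {
        return new ToySource(deviceId, devicesLiveMap).createTrack();
      }
      if (!sources.has(deviceId)) {
        sources.set(deviceId, new ToySource(deviceId, devicesLiveMap));
      }
      return sources.get(deviceId).createTrack();
    },
  };
}

// Replays the three-track scenario, then stops track2.
function demo(granularity) {
  const realm = makeToyRealm(granularity);
  const track1 = realm.getUserMedia('cam0');
  const track2 = realm.getUserMedia('cam0');
  const track3 = track1.clone();
  track2.stop();
  const liveTracks = [track1, track2, track3]
    .filter((t) => t.readyState === 'live').length;
  return { indicatorOn: realm.devicesLiveMap.get('cam0'), liveTracks };
}
```

In this model, `demo('call')` leaves two live tracks with the device marked not live, while `demo('realm')` keeps the device marked live until the last of the three tracks ends.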

> this was used like this in the past in some Safari cases (where multiple cameras cannot run at the same time for instance).

Uses of multiple cameras seem different from multiple uses of the same camera. I don't see the connection.

youennf · 2023-03-20T10:34:39Z

> What benefit would this distinction have? I don't think it exists in Firefox and Chrome. Does it exist in Safari?

It used to be like this in Safari. It could happen in the future.

> Uses of multiple cameras seem different from multiple uses of the same camera. I don't see the connection.

Say you do the following:

const [track1] = (await navigator.mediaDevices.getUserMedia({video: {facingMode: 'environment', width: 640}})).getVideoTracks();
const [track2] = (await navigator.mediaDevices.getUserMedia({video: {facingMode: 'environment', width: 1280}})).getVideoTracks();

Everything works fine. Now you start another capture:

const [track3] = (await navigator.mediaDevices.getUserMedia({video: {facingMode: 'user', width: 640}})).getVideoTracks();

In that case, you might still be able to capture the environment at 640 but not at 1280.
If track1 and track2 share the same source, WebKit would probably fail both of them, but only track2 would be ended if they did not share the same source.
I guess the alternative would be to fire a configurationchange event on track2.

> The minimum granularity has to be at least the JS realm for the spec to work, i.e. the mediaDevices object.

Right, I guess we could change this if there was a use case for it but it does not seem worth it.
If the source granularity was the top page, [[devicesLiveMap]] would also start to be out of sync as well.

Can we settle on source granularity being the JS realm then?
If so, it might be worth clarifying this in the spec.

jan-ivar · 2023-03-20T17:38:22Z

> If track1 and track2 share the same source, WebKit would probably fail both of them,

When you say "fail" here, you mean WebKit would end active tracks because another call to getUserMedia was made?

Even so, track ended doesn't rely on source stopped. It's the other way around.

Firefox would instead fail creation of track3 FWIW. But these seem like UA decisions independent of where source is defined. IOW, I'd expect the same behavior from the following code (because constraints are per-track, not per-source):

const [track1] = (await navigator.mediaDevices.getUserMedia({video: {facingMode: 'environment', width: 640}})).getVideoTracks();
const track2 = track1.clone();
await track2.applyConstraints({width: 1280});

const [track3] = (await navigator.mediaDevices.getUserMedia({video: {facingMode: 'user', width: 640}})).getVideoTracks();

This seems like POLA, an invariant worth clarifying in the spec, so implementations don't implement side effects that could affect API usage.

> If the source granularity was the top page, [[devicesLiveMap]] would also start to be out of sync as well.

True, but this seems trivially avoided by using a rawDeviceId instead of the awkward reference to MediaDeviceInfo's deviceId, which is what I believe implementations do in practice already.

> Can we settle on source granularity being the JS realm then?

Can we eliminate option 1 first? I'd like to discuss the remaining options, but wanted to simplify first.

jan-ivar · 2023-03-20T20:34:45Z

While a source being UA-wide may have worked in the past, it breaks #804 so I think we're down to:

1. Source is per-document, or
2. Source is per top-level document

As the stop all sources algorithm is currently written, it only works with 1. So the archeological answer is 1.

But I'd like to propose 2.

If we look at what source/device intersects with in the spec, it's: permission, privacy indicators, and track lifetime.

Permissions integration ties device permission to the top-level document, and privacy indicators are implemented at that level too.

Regarding lifetime, I think our main concern in w3c/mediacapture-extensions#30 was also around privacy and permission. E.g. that a user revoking permission in the original web page ends transferred tracks as well, and that privacy indicators remain on the original web page's tab for as long as transferred tracks are alive.

In the old days, we didn't have to consider track lifetime outside of the document that spawned a track, but now we do. And it seems to me that tying lifetime to the top-level document might be fine.
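The difference between the two remaining options can be sketched as a choice of key under which a source is looked up. This is a toy model under stated assumptions: the document objects, `topLevel`, `sourceKey`, and `sharesSource` are hypothetical, and `rawDeviceId` only stands in for whatever UA-internal device identifier an implementation uses.

```javascript
// Toy model only: documents are plain objects linked by a `parent` field.
function topLevel(doc) {
  // Walk up the (hypothetical) parent chain to the top-level document.
  while (doc.parent) doc = doc.parent;
  return doc;
}

// Option 1 ('document'): the source is keyed by the document that asked for it.
// Option 2 ('top-level-document'): the source is keyed by the top-level document.
function sourceKey(doc, rawDeviceId, granularity) {
  const owner = granularity === 'top-level-document' ? topLevel(doc) : doc;
  return `${owner.name}|${rawDeviceId}`;
}

const page = { name: 'page', parent: null };
const frameA = { name: 'frameA', parent: page };
const frameB = { name: 'frameB', parent: page };

// Do two iframes of the same page end up with the same source for one camera?
function sharesSource(granularity) {
  return sourceKey(frameA, 'cam0', granularity) ===
         sourceKey(frameB, 'cam0', granularity);
}
```

Under option 1 the two frames get distinct sources for the same camera; under option 2 they share one, which lines up with permission and privacy indicators being tied to the top-level document.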

jan-ivar mentioned this issue Apr 18, 2023

Solve user agent camera/microphone double-mute w3c/mediacapture-extensions#39

Closed

jan-ivar added this to the Revise Candidate Recommendation milestone Feb 14, 2024
