Permission API for receive-only media and data use cases #2175

Closed

lgrahl
Contributor

@lgrahl lgrahl commented Apr 17, 2019

Background

There are use cases that do not use getUserMedia, namely unidirectional audio/video or data-only use cases. For example, security cameras, baby monitors, town-hall sessions, drones, MOOCs, remote device access, easy file transfer, multiplayer games, etc. all of which can greatly benefit from mode 1. In some cases, these may be connected via a separate interface that is not the default interface. Furthermore, for data-only use cases, a relayed connection can be much more restrictive than for real-time audio/video because of the potential impact on throughput.

Direct Connection API

This PR proposes adding a Direct Connection API to the spec. If accepted, user agents could safely ship stricter default modes without penalising the aforementioned use cases.

With help from @jan-ivar, this API has been tailored in a way to encourage permission being granted without having to rely on a prompt. However, it does also enable implementations to use a permission prompt if desired.

Intended Usage

The intent is that an application should always provide context when using the permission request as outlined in the examples that have been added.

Implementation Suggestions

Switch in the site's information panel

To avoid bothering the user with a permission prompt, browsers could integrate the permission request in the form of a switch in the site's information panel, next to the URL, analogous to the autoplay feature in Firefox:

direct-connection-permission-switch

Depending on the user's preferences, selectable options could for example be Yes and Hide VPN addresses.

Prompt

Browsers could also integrate the changes proposed in this PR by triggering the following permission request once registerDirectConnectionInterest is called:

Will you allow example.org to establish a direct connection?
[Learn more]

[Yes |v]
:Yes   :
:No    :

[x] Remember the decision for this page

[Don't allow] [Allow]

If the user clicks on Learn more, it could open a separate page that explains the privacy impact of exposing the addresses of all interfaces.

However, the user should probably not be able to demote the mode in the permission request below the one that is already active.

Why not as an extension spec?

This PR clarifies when IP handling modes are being applied, when they are effective and how a mode change is being signalled to the application. Furthermore, media receive-only use cases and data channels are part of WebRTC 1.0. As such, they should have an equal opportunity to leverage the full potential of WebRTC, just like send/receive audio/video use cases. This makes it possible to solve the privilege escalation of getUserMedia.

W3C Permissions Spec Update

Spec: https://lgrahl.github.io/permissions/#direct-connection
Diff: w3c/permissions@master...lgrahl:direct-connection

Notes

The use of the Permissions API allows the "direct-connection" permission to be re-used in other specs (and potentially in other ways).

As a nice side-effect, browsers that implicitly grant the "direct-connection" permission when granting "camera" or "microphone" permissions would now also trigger the connectionupgradable event in case the mode has been upgraded.
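
As an illustration of the re-use described in these notes, here is a minimal sketch of how a page might read the proposed permission through the existing Permissions API. It assumes the "direct-connection" name from this PR is accepted; `directConnectionState` and its `permissions` parameter (standing in for `navigator.permissions`) are hypothetical names used only for this example.

```javascript
// Sketch: read the state of the proposed "direct-connection" permission.
// `permissions` is expected to behave like navigator.permissions.
// A user agent that does not know the permission name rejects query()
// with a TypeError, which is reported here as "unsupported".
async function directConnectionState(permissions) {
  try {
    const status = await permissions.query({ name: "direct-connection" });
    return status.state; // "granted", "denied" or "prompt"
  } catch (err) {
    if (err instanceof TypeError) return "unsupported";
    throw err; // anything else is unexpected and left to the caller
  }
}
```

In a browser this would be called as `directConnectionState(navigator.permissions)`.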

To do

  • Go through the spec and update parts that currently refer to which candidates may be exposed under which circumstances.
  • Line breaks.

Resolves #2012


@lgrahl lgrahl force-pushed the direct-connection-permission-4th branch from d2af6c3 to f324d75 Compare April 17, 2019 12:41
@lgrahl
Contributor Author

lgrahl commented Apr 17, 2019

Looks like there is an issue with Travis CI. Could you investigate, @dontcallmedom?

@dontcallmedom
Member

@lgrahl the problem with Travis should be fixed (but the pull request would need to be rebased on the latest master to show that)

@lgrahl lgrahl force-pushed the direct-connection-permission-4th branch from 4c7c0d5 to f844f00 Compare April 25, 2019 17:31
@lgrahl lgrahl force-pushed the direct-connection-permission-4th branch from f844f00 to 4a36bd6 Compare April 25, 2019 17:33
@lgrahl
Contributor Author

lgrahl commented Apr 25, 2019

Cheers! Rebased against master.

@jan-ivar
Member

jan-ivar commented May 9, 2019

I think this proposal addresses a problem we've had for a long time: the unfortunate tight coupling of initial connection with getUserMedia.

Right now, WebRTC connections are effectively behind permission prompts. Every major WebRTC site gets camera and mic first, when there's really no (other) reason connection couldn't have been established ahead of this, for an overall speedier connection experience. I even see code blocking createAnswer on getUserMedia, which design-wise is hacky, actually slows down initial connection, and leads people down the wrong path, away from healthy patterns like negotiationneeded.
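
To make the contrast concrete, the following is a rough sketch of the offering side of a connection that starts negotiating immediately and adds media whenever getUserMedia resolves, rather than blocking on it. The function name, the `signaling.send` callback and the `getMedia` parameter (standing in for `navigator.mediaDevices.getUserMedia`) are assumptions made for this example; answer handling and glare resolution are left out.

```javascript
// Sketch: start connecting right away and add media whenever it arrives.
// `pc` is expected to behave like an RTCPeerConnection, `signaling.send` is a
// hypothetical app-defined function, and `getMedia` stands in for
// navigator.mediaDevices.getUserMedia.
function connectWithoutBlockingOnMedia(pc, signaling, getMedia) {
  // Renegotiate whenever the connection's needs change (data channel, tracks).
  pc.onnegotiationneeded = async () => {
    await pc.setLocalDescription(); // creates and applies an offer
    signaling.send({ description: pc.localDescription });
  };

  // A data channel is enough to trigger the first negotiation.
  pc.createDataChannel("control");

  // Media is requested in parallel; tracks are added once the user responds,
  // which fires negotiationneeded again.
  return getMedia({ audio: true, video: true }).then((stream) => {
    for (const track of stream.getTracks()) pc.addTrack(track, stream);
  });
}
```

With this shape, a slow or denied camera prompt delays only the media tracks, not the initial connection.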

Now, UX is hard. This is a second-tier category of permission, not good for a modal prompt.

But defining the "direct-connection" permission here is the key and the right step IMHO. It opens up ideas like e.g. have gamer web extensions control this, or—for browsers that don't implicitly persist gum permission after first use—implementations might want to automatically enable this permission for sites that have been granted getUserMedia once in the past.

I also like the API design, which purposely encourages apps to go ahead and not block on permission, and lets browsers "upgrade" the connection later.

Ironically, adding this permission might actually help free initial connection from being behind a permission prompt.

@youennf
Contributor

youennf commented Jun 7, 2019

There are use cases that do not use getUserMedia, namely unidirectional audio/video or data-only use cases. For example, security cameras, baby monitors, town-hall sessions, drones, MOOCs, remote device access, easy file transfer, multiplayer games, etc. all of which can greatly benefit from mode 1.

So far, the only configuration I saw that would benefit from mode 2 is two browsers doing data channel.
There are indeed some valid use cases there.
Whenever one of the two peers is not a browser, mode 3 on the browser is usually good enough.

As for mode 1, Safari does not support it even when getUserMedia access is granted. I doubt that a specific permission would actually help move this forward.
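
For readers unfamiliar with the mode numbers used throughout this thread, the following lookup is a rough paraphrase of the four IP handling modes in the IETF rtcweb-ip-handling document (published as RFC 8828). The constant, the field names and the helper function are illustrative only and not part of any spec; the RFC remains the authoritative description.

```javascript
// IP handling modes, paraphrased from rtcweb-ip-handling (RFC 8828).
// Field names are illustrative assumptions, not spec terminology.
const IP_HANDLING_MODES = Object.freeze({
  1: { name: "Enumerate all addresses", allInterfaces: true, privateHostAddresses: true },
  2: { name: "Default route + associated local addresses", allInterfaces: false, privateHostAddresses: true },
  3: { name: "Default route only", allInterfaces: false, privateHostAddresses: false },
  4: { name: "Force proxy", allInterfaces: false, privateHostAddresses: false },
});

// Returns true when the given mode lets the page see private (host) addresses.
function exposesPrivateHostAddresses(mode) {
  const entry = IP_HANDLING_MODES[mode];
  if (!entry) throw new RangeError(`Unknown IP handling mode: ${mode}`);
  return entry.privateHostAddresses;
}
```

Read this way, the difference between mode 2 and mode 3 is whether the local addresses of the default-route interface are handed to the page, which is why two browsers behind the same NAT may need mode 2 to avoid hairpinning or relaying.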

Now, UX is hard. This is a second-tier category of permission, not good for a modal prompt.

Non-blocking UI has some advantages, but it also has some drawbacks.
As a user, in most cases, I will not notice or care about this new permission since it will not block my workflow and will often have no impact on my user experience.

In the cases where granting this permission might actually help, I will have difficulty relating it to the issues I am facing. The only chance is for the website to guide me through accepting this permission. Should they do that upfront (bad for privacy) or just when I have an issue (bad for UX)? If I grant permission, and the connection does not happen, is it some kind of a trap?

If I discover this permission, it might be hard for me to understand the implications behind that choice.
What is 'direct-connection'? What am I revealing? What is mode 1, mode 2? If I am disallowing the 'direct-connection', does it mean I only allow TURN candidates?
As a UX, it looks to me like an option for power users.

UX is hard, and there may be other ways to handle that.
Having this permission surfacing as a web concept might be a good idea.
It might allow describing what browsers are doing and what browsers might do in the future.

adding this permission might actually help free initial connection from being behind a permission prompt.

I also somehow like this idea.
At the same time, web developers can already do it now.
If they do not do it, maybe that is not such a big user experience issue.

@jan-ivar
Member

jan-ivar commented Jun 7, 2019

UX is hard, and there may be other ways to handle that.
Having this permission surfacing as a web concept might be a good idea.

Exactly. UX is best left to user agents to explore; whether modal or something like Firefox's block-first config-like autoplay permission—or even ignore it—works best for them. There's likely not one right answer here. But this PR works with all of these models, so we don't block on UX.

I think what blocks experimentation today, is no-one wants prompts on all mode 3 WebRTC calls ahead of gUM.

The two critical parts I see addressed by this PR are:

  1. User agents today have no way to distinguish an app that is content with mode 3 from one that would suffer a prompt.
  2. User agents have no mechanism to inform an app that its trust level has increased after it has connected.
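
The two parts listed above could look roughly like this from the application's side. This is only a sketch using the names proposed in this PR: it assumes `registerDirectConnectionInterest()` is a static method on the RTCPeerConnection constructor (as in the IDL under review) and that the `connectionupgradable` event fires on each RTCPeerConnection instance, which this thread does not state explicitly. The function name and its parameters are hypothetical.

```javascript
// Sketch of both parts using the names proposed in this PR.
// Assumptions: registerDirectConnectionInterest() is a static method on the
// RTCPeerConnection constructor, and "connectionupgradable" fires on each
// RTCPeerConnection instance.
function signalInterestAndAwaitUpgrade(PeerConnection, pc, onUpgradable) {
  // Part 2: learn about a trust-level increase after connecting.
  pc.addEventListener("connectionupgradable", () => onUpgradable(pc));

  // Part 1: tell the user agent this app wants more than the default mode.
  if (typeof PeerConnection.registerDirectConnectionInterest !== "function") {
    return false; // proposal not implemented; continue with the default mode
  }
  PeerConnection.registerDirectConnectionInterest();
  return true;
}
```

An `onUpgradable` handler might, for example, call `pc.restartIce()` to gather candidates again; whether that is the reaction the proposal expects is not settled in this discussion.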

adding this permission might actually help free initial connection from being behind a permission prompt.

I also somehow like this idea.
At the same time, web developers can already do it now.
If they do not do it, maybe that is not such a big user experience issue.

True, though given the complexity of just getting negotiation right in all browsers today, I wouldn't read too much into the lack of experimentation in the negotiation realm just yet.

I do hear people find the tight coupling between getUserMedia and peer connections troubling.

@youennf
Contributor

youennf commented Jun 7, 2019

The two critical parts I see addressed by this PR are:

  1. User agents today have no way to distinguish an app that is content with mode 3 from one that would suffer a prompt.

The question might be:
Is there a user agent that would like to prompt the user to go from mode 3 to mode 2 or 1 if the web page is asking specifically for that?

  2. User agents have no mechanism to inform an app that its trust level has increased after it has connected.

This seems unnecessary right now given getUserMedia is the only switch.
Heuristics that would apply at page load time do not need that either, the app will take whatever candidates it can.

This mechanism becomes useful if there are heuristics to change the trust level during the lifetime of a page (say, the browser sees that a TURN-based connection is actively used by a page and grants access after some time).
Are there plans to design and implement such kind of heuristics?

@youennf
Contributor

youennf commented Jun 7, 2019

I do hear people find the tight coupling between getUserMedia and peer connections troubling.

I agree, it is great if we can find something better.

@lgrahl
Contributor Author

lgrahl commented Jun 8, 2019

The only chance is for the website to guide me through accepting this permission. Should they do that upfront (bad for privacy) or just when I have an issue (bad for UX)?

That depends on the use case. There are three examples in the proposal.
Guiding the user may be necessary, yes. (It would be interesting to have a web API that helps in guiding, for example by allowing specific events - a button press - to trigger opening the permission panel... but that is out of scope for this proposal.)

What is 'direct-connection'? What am I revealing?

What a direct connection means and reveals probably needs to be explained by the UI. There could be a help icon next to Direct Connection in the screenshot presented above. But that is just an idea.

As a reminder: We already have an implicit direct connection permission mechanism due to the privilege escalation of getUserMedia. We have just avoided explaining it to the user so far by hiding it behind that mechanism... which has always been controversial.

If I am disallowing the 'direct-connection', does it mean I only allow TURN candidates?

There is no rejection/downgrade mechanism, so the question whether denial would mean only allowing TURN candidates does not arise.
I'm open to a more fitting name if there is one that would not cause such confusion and still explains what it is.

Are there plans to design and implement such kind of heuristics?

To whom is this question directed? User agents or applications?
User agents: No idea but I dislike such kind of implicit magic if not controlled by the application.
Applications: Well, I would use it. Not if on relay candidates but if on anything worse than or equal to server reflexive candidates. In fact, this is the first example of the API.

@alvestrand
Contributor

alvestrand commented Jun 12, 2019

Conclusions from the VI meeting where this was discussed (my summary):

  • The idea of modeling permission for direct connection as a separate permission was thought a Good Idea. This part (including doing the formalities with the permissions spec, and possibly with the features spec) should be done ASAP.
  • The specific proposed API was more controversial; in particular, whether the notification mechanisms proposed make sense given that there are already notification mechanisms for permission changes in the permissions API.
  • The basic requirements that this mechanism is intended to fulfil need to go into the webrtc-nv requirements specification. The API can be an extension spec.

More work needed.
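
For comparison with a dedicated event, this is a minimal sketch of the existing notification path mentioned in the second point: a `PermissionStatus` object fires a `change` event when its state changes. The "direct-connection" name is the one proposed in this PR; `watchDirectConnection`, `permissions` (standing in for `navigator.permissions`) and `onGranted` are hypothetical names for this example.

```javascript
// Sketch: observe "direct-connection" through the Permissions API change
// event instead of a dedicated RTCPeerConnection event.
async function watchDirectConnection(permissions, onGranted) {
  const status = await permissions.query({ name: "direct-connection" });
  if (status.state === "granted") onGranted(); // already granted at load time
  status.addEventListener("change", () => {
    if (status.state === "granted") onGranted(); // granted later on
  });
  return status;
}
```

What this path cannot express on its own is the application's interest in the permission, which is the part the proposed method is meant to cover.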

Contributor

@alvestrand alvestrand left a comment

Seems this needs splitting into 3 CLs:

  • Permission name and values (permissions spec)
  • Use cases and justification (webrtc-nv-use-cases)
  • API for requesting the permission (either here or an extension spec).

webrtc.html Outdated
<p>The Direct Connection Permission API extends the <code><a>RTCPeerConnection</a></code> interface as described below.</p>
<div>
<pre class="idl">partial interface RTCPeerConnection {
static void registerDirectConnectionInterest ();
Contributor

Using "static" forces all PeerConnections to have the same interest. Is this required? Wise?

Contributor Author

The rationale is that:

  1. A peer connection is invisible in terms of UI, so there's no way to distinguish them for the user.
  2. Granting a permission would have an effect on the same origin, so all will be affected.

Contributor

That's defining the API based on currently imaginable UI.
I don't think that's a wise design principle.
One could imagine other ways to determine permission, for instance that signed apps have different permissions from unsigned apps, or even that JavaScript running inside some future sandbox (other than an iframe) had yet another means of deciding on permission.

We should shape the API so that an app developer can give information to the UA that is relevant to what he's working on; the developer of a conferencing system should not have to worry about what the ad-supported clicktracker on the same page is telling the UA (and vice versa).

Contributor Author

@lgrahl lgrahl Aug 5, 2019

Okay, I think that is reasonable. Any thoughts on that, @jan-ivar?

Contributor Author

If we go with a normal method, I'd like to add a note that the UA may consider treating it as a request towards that particular RTCPeerConnection instance or increasing the scope to the same origin.

@feross

feross commented Jul 7, 2019

Btw, it's worth keeping in mind that some browsers already see fit to show permission prompts that many non-technical users are likely to find very confusing:

MIDI devices:

Screen Shot 2019-07-07 at 1 37 49 PM

Bluetooth devices (exposing MAC addresses directly to the user):

Screen Shot 2019-07-07 at 1 37 57 PM

It seems likely that any user who could understand the permission prompts I have shown above would be quite likely to understand a "direct connection" permission prompt.

@martinthomson
Member

Great examples of terrible prompts. Was there a point to the comment (other than the lulz)?

Seriously, the notion of notice and consent is not a useful one except in very narrow contexts. I think that I can find a path to justifying microphone and camera prompts (maybe), but this is too much for me.

@jan-ivar
Member

jan-ivar commented Jul 8, 2019

I doubt Firefox would ever explore a modal prompt for this. I suspect a "block first" approach is better.

But that doesn't mean we cannot expose the permission. For instance, we could expose what is implicit today, which is that granting camera or microphone access has this IP leaking side effect (mockup):

This might even help Firefox explore persisting this "direct connection" permission after initial gUM, for parity with Chrome (which is why I drew it as "Allowed" instead of "Allowed Temporarily").

We've always been adamant that camera and microphone permission should not be persisted by default, to prevent uninitiated snooping, but I don't think those same concerns extend to IP mode. That part sort of just happened.

@youennf
Contributor

youennf commented Jul 9, 2019

The BT prompt seems especially hard to reason about.
And the more a user faces such prompts, the more the user might disregard all prompts.

This might even help Firefox explore persisting this "direct connection" permission after initial gUM, for parity with Chrome (which is why I drew it as "Allowed" instead of "Allowed Temporarily").

That would further entice developers in calling getUserMedia when they do not have host candidates (for good or bad reasons) since this will be a one-time call.

Ideally, we would be able to identify legit uses of RTCPeerConnection.
Or we would identify cases where leaking the host candidate is not a privacy issue.

@jan-ivar
Member

That would further entice developers in calling getUserMedia when they do not have host candidates (for good or bad reasons) since this will be a one-time call.

It's already a one-time-call in Chrome today, so I think they're sufficiently enticed. 😉

By exposing it, users can at least see what is happening, and even revoke it with the X

Ideally, we would be able to identify legit uses of RTCPeerConnection.

Not possible I think. We'll just end up boosting our successful connection numbers.

Or we would identify cases where leaking the host candidate is not a privacy issue.

That's getUserMedia. Once gUM is granted, its fingerprint surface dwarfs 192.168.1.x.

@youennf
Contributor

youennf commented Jul 10, 2019

That would further entice developers in calling getUserMedia when they do not have host candidates (for good or bad reasons) since this will be a one-time call.

It's already a one-time-call in Chrome today, so I think they're sufficiently enticed. 😉

For the prompt, yes. To get host candidates, I believe they would need to call getUserMedia each time, which can be visible and very intriguing/creepy to users.

By exposing it, users can at least see what is happening, and even revoke it with the X

Right, this is an option I like somehow.
With the particular UI illustration, it would seem weird though to expose the information and then allow the user to no longer expose it.

Ideally, we would be able to identify legit uses of RTCPeerConnection.

Not possible I think. We'll just end up boosting our successful connection numbers.

Not to say this would improve the connection rate much, but a connection that is providing content for a large video element visible in a page seems like a legit case. If the connection is using a TURN candidate, there could be room for optimization.

Or we would identify cases where leaking the host candidate is not a privacy issue.

That's getUserMedia. Once gUM is granted, its fingerprint surface dwarfs 192.168.1.x.

IPv6 temporary addresses are cheaper and pose fewer fingerprinting risks.
If a device could dynamically be assigned several temporary IPv6 addresses, the risk would further decrease.

@alvestrand
Contributor

Given that this has received some added urgency due to the impending mDNS deployment, could we get on with restructuring this proposal according to #2175 (review) ?

@lgrahl
Contributor Author

lgrahl commented Jul 10, 2019

I'll try to come back to it asap.

@jan-ivar
Member

jan-ivar commented Jul 13, 2019

Another lulz (albeit an experimental one behind a resistFingerprinting pref in Firefox):

My 2 cents is that even people sensitive to privacy are not signing up for prompts galore. Just block it.

Sorry for spamming, but it seemed to have the same motivation as this issue (and the same difficulty with expressing the problem of why users should care).

@lgrahl
Contributor Author

lgrahl commented Jul 16, 2019

* Permission name and values (permissions spec)

IIRC Youenn opened the discussion regarding the name of the permission. I have not found a better fitting name than direct-connection even though it obviously does not prevent a direct connection from being established without granting it.
What the permission actually does is switch between the best available IP handling mode and the default IP handling mode. Now, note that the meaning of these two modes is up to the user agent and perhaps even up to the browser settings. Thus, the most fitting name I can think of is best-available-ip-handling-mode... which is not what I'd suggest using. To clarify: it's hard to find a good name for the permission, and if we want something other than direct-connection, I'm open to suggestions.

This is the proposed change towards the permission spec so far, PTAL: w3c/permissions@master...lgrahl:direct-connection
Will make a PR tomorrow.

* Use cases and justification (webrtc-nv-use-cases)

NV Use Cases PR is here: w3c/webrtc-nv-use-cases#14
Anything else I need to do here?

* API for requesting the permission (either here or an extension spec).

Harald, can you clarify again what you think should be changed regarding the API? Jan-Ivar and I tend towards keeping the static method as it seems useful in describing the application's intent more clearly.
I'm not entirely sure about the event, yet.

@lgrahl
Contributor Author

lgrahl commented Jul 25, 2019

Ping @alvestrand.

@alvestrand
Contributor

Replied on the "static" issue inline.

Note on the PR for permission: It presupposes that the "direct connection API" is added to the webrtc-pc spec, which I don't think is advisable to depend on. If you change the text in that PR to say that it "gives the permission to use a higher than default value of the IP addressing mode from [[rtcweb-ip-handling]]", it will be a standalone PR.

@lgrahl
Contributor Author

lgrahl commented Aug 5, 2019

@alvestrand: Incorporated your suggestion in the PR, PTAL.

I'll wait for @jan-ivar's feedback regarding the method and we still need to finish the discussion in w3c/webrtc-nv-use-cases#14.

@lgrahl
Contributor Author

lgrahl commented Aug 14, 2019

@alvestrand: Removed static from the method and updated the examples in 19799a0. Note that the currently described mechanism does still say that, if the permission has been granted, it will upgrade all RTCPeerConnection instances of the same origin.

(Reminder for w3c/webrtc-nv-use-cases#14 unless we want to get this into webrtc-pc first, @henbos @alvestrand)

@youennf
Contributor

youennf commented Aug 14, 2019

#2175 (review) review says:
"API for requesting the permission (either here or an extension spec)."

Looking at the PR, it seems it would work nicely as an extension spec.
It could advance at its own pace in terms of spec and implementation.

@lgrahl
Contributor Author

lgrahl commented Sep 9, 2019

Can't progress here or in w3c/webrtc-nv-use-cases#14 unless I get feedback from the pinged persons.

@alvestrand
Contributor

I've OKed the permissions PR.
We're now even closer to the CR deadline, at which all features must have implementations or die, so I think an extension spec is the way to go.

@alvestrand
Contributor

Closing this PR. Aiming for an extension spec seems reasonable.

Successfully merging this pull request may close these issues.

Obtain user consent for one-way media and data use cases
9 participants