From c627263f177e01fcdc300503b07cc8bf9eea857a Mon Sep 17 00:00:00 2001 From: Jeffrey Yasskin Date: Thu, 5 Oct 2023 17:12:43 -0700 Subject: [PATCH 1/7] Rewrite ancillary uses to focus on 2 kinds of ancillary APIs rather than ancillary data. --- index.html | 165 ++++++++++++++++++++++++++++++++++++----------------- 1 file changed, 114 insertions(+), 51 deletions(-) diff --git a/index.html b/index.html index e5be81a9..813ead16 100644 --- a/index.html +++ b/index.html @@ -1216,37 +1216,91 @@ ### Ancillary uses -In order to uphold the principle of [[[#data-minimization]]], [=sites=] and +[=Sites=] sometimes use data in ways that aren't needed for the user's immediate +goals. For example, they might bill advertisers, measure site performance, or +tell developers about bugs. These uses are known as ancillary uses. + +[=Sites=] can get the data they want for [=ancillary uses=] from a variety of places: + +
+
Non-ancillary APIs
+
+ Web APIs that were designed to support users' immediate goals, like DOM events and element position + observers. +
+ +
Summarizing ancillary APIs
+
+ APIs that filter or summarize information available from the first group of + APIs, like the [[[event-timing]]] and IntersectionObserver. +
+ +
Novel ancillary APIs
+
+ APIs that provide new information that's primarily useful to support the + ancillary uses, like element paint + timing, memory usage + measurements, and deprecation + reports. +
+
+All of these sources of data can reveal [=personal data=] about a person's +configuration, device, environment, or behavior that could be sensitive or be used as part of browser +fingerprinting to recognize people +across contexts. In order to uphold the principle of [[[#data-minimization]]], [=sites=] and [=user agents=] should seek to understand and respect people's goals and preferences about -use of data about them. +use of this data. -[=Sites=] sometimes use data in ways that aren't needed for the user's immediate -goals. These uses are known as ancillary uses, -and data that is primarily useful for [=ancillary uses=] is ancillary data. +[=User agents=] usually can't do much about data collection from the +[=non-ancillary APIs=]. - +There is disagreement about how [=user agents=] should handle [=summarizing +ancillary APIs=]. Advocates of these APIs argue that they're hard to use to +extract [=personal data=], they're more efficient than collecting the same +information through [=non-ancillary APIs=], sites are less likely to adopt these +APIs if a significant number of people turn them off, and that the act of +turning them off can ironically contribute to [=browser fingerprinting=]. +Opponents argue that if data's easier or cheaper to collect, more sites will +collect it, and because there's still some risk and cost, users should be able +to turn off this group of APIs that probably won't directly break a site's +functionality. -Different [=users=] will want to share different kinds and amounts of -[=ancillary data=] with [=sites=]. Some [=people=] will not want to share any -[=ancillary data=] at all. +We do have consensus that [[[#information]]] governs [=summarizing ancillary +APIs=] and that: -Users may be willing to share [=ancillary data=] if it is aggregated with -the data of other users, or [=de-identified=]. 
This can be useful -when [=ancillary data=] contributes to a collective benefit in a way -that reduces privacy threats to individuals (see collective -privacy). +
+Specifications for [=summarizing +ancillary APIs=] and [=novel ancillary APIs=] should identify them as such, so +that [=user agents=] can provide appropriate choices for their users. +
- -[=User agents=] should aggressively minimize [=ancillary -data=] and should avoid burdening the user with additional [=privacy labor=] -when deciding what [=ancillary data=] to expose. To that end, user agents may -employ user research, solicitation of general preferences, and heuristics about -sensitivity of data or trust in a particular [=context=]. - -To help [=sites=] understand user preferences, user agents can provide -browser-configurable signals to directly communicate common user preferences -(such as a [=global opt-out=]). - -Data exposed for the [=ancillary uses=] of telemetry and analytics may reveal -information about user configuration, device, environment, or behavior that -could be used as part of browser fingerprinting to identify users across -sites. Revealing user preferences or other heuristics in providing or disabling -functionality could also contribute to a browser fingerprint. - -Functionality for telemetry and analytics should be explicitly noted by -specification authors, to help [=user agents=] provide configuration options -to their users. +Some [=ancillary uses=] don't require their data to be related to a person, but +the useful aggregations across many people are difficult to design into a web +API, or they might require new technologies to be invented. API designers have a +few choices in this situation: + +* Sometimes an API can [=de-identify=] the data instead, but this is difficult + if a web page has any input into the data that's collected. +* API designers can check carefully that the API doesn't reveal _new_ [=personal + data=], as described by [[[#information]]]. For example, the API might reveal + that a person has a fast graphics card, that they click slowly, or that they + use a certain proxy, but the fact that they click slowly is already revealed + by DOM event timing. +* [=User agents=] can ask their users' permission to enable this class of API. 
+ To reduce [=privacy labor=], a [=user agent=] could use a first-run dialog to + ask the user whether they generally support sharing this data, rather than + asking for each use of the APIs. + +API designers should maintain APIs that had to make one of these choices and +should keep trying to evolve them toward aggregating the data instead. + +Some other [=ancillary uses=] do require that a person be connected to their +data. For example, a person might want to file a bug report that a website +breaks on their particular computer, and be able to get follow-up communication +from the developers while they fix the bug. This is an appropriate time to ask +the person's permission. - +
+User +agents should provide a way to disable [=novel ancillary APIs=]. +
+Some people may want to save processing time or bandwidth that's not necessary +to achieve their immediate goals, or they might know something about their +specific situation that makes the API designers' general decisions inappropriate +for them. Because the information provided by [=novel ancillary APIs=] isn't +available in any other way, [=user agents=] should let people turn them off, +despite the additional risk of [=browser fingerprinting=]. ## Information access {#information} @@ -1508,7 +1571,7 @@ -Data is de-identified when there exists a high level of confidence +Data is de-identified when there exists a high level of confidence that no [=person=] described by the data can be identified, directly or indirectly (e.g. via association with an [=identifier=], user agent, or device), by that data alone or in combination with other available information. Note From 6e3bfbebe0e988821516e3e8c2ea56a18cd69f10 Mon Sep 17 00:00:00 2001 From: Jeffrey Yasskin Date: Thu, 5 Oct 2023 17:26:02 -0700 Subject: [PATCH 2/7] De-duplicate an id. --- index.html | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/index.html b/index.html index 813ead16..4d7f3cb2 100644 --- a/index.html +++ b/index.html @@ -1276,10 +1276,10 @@ APIs=] and that:
-Specifications for [=summarizing -ancillary APIs=] and [=novel ancillary APIs=] should identify them as such, so -that [=user agents=] can provide appropriate choices for their users. +Specifications +for [=summarizing ancillary APIs=] and [=novel ancillary APIs=] should identify +them as such, so that [=user agents=] can provide appropriate choices for their +users.
#### Designing novel ancillary APIs {#designing-novel-ancillary-apis} From 8a9efbb062c78df08120d1b755c7463b3736f56d Mon Sep 17 00:00:00 2001 From: Jeffrey Yasskin Date: Fri, 6 Oct 2023 19:59:28 -0700 Subject: [PATCH 3/7] Avoid "first group", and name it instead. --- index.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/index.html b/index.html index 4d7f3cb2..94f8d69f 100644 --- a/index.html +++ b/index.html @@ -1234,8 +1234,8 @@
Summarizing ancillary APIs
- APIs that filter or summarize information available from the first group of - APIs, like the [[[event-timing]]] and IntersectionObserver.
From f6fac98d6669e3f29449a047e99c7385cbf6936d Mon Sep 17 00:00:00 2001 From: Jeffrey Yasskin Date: Mon, 9 Oct 2023 11:56:39 -0700 Subject: [PATCH 4/7] Add time-shifting. --- index.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/index.html b/index.html index 94f8d69f..d3dafd9e 100644 --- a/index.html +++ b/index.html @@ -1234,8 +1234,8 @@
Summarizing ancillary APIs
- APIs that filter or summarize information available from [=non-ancillary - APIs=], like the [[[event-timing]]] and IntersectionObserver.
From c7ef27fd0b8ec055d1418b7b03574db9f2290179 Mon Sep 17 00:00:00 2001 From: Jeffrey Yasskin Date: Thu, 9 Nov 2023 17:15:06 -0800 Subject: [PATCH 5/7] Address some code review comments. --- index.html | 48 ++++++++++++++++++++++++++++-------------------- 1 file changed, 28 insertions(+), 20 deletions(-) diff --git a/index.html b/index.html index d3dafd9e..1f3b6692 100644 --- a/index.html +++ b/index.html @@ -1232,14 +1232,16 @@ observers. -
Summarizing ancillary APIs
+
Ancillary APIs computed from existing information
APIs that filter, summarize, or time-shift information available from [=non-ancillary APIs=], like the [[[event-timing]]] and IntersectionObserver. + data-cite="intersection-observer#introduction">IntersectionObserver. See + [[#information]] for restrictions on how existing non-ancillary APIs can be + used to justify new ancillary APIs.
-
Novel ancillary APIs
+
Ancillary APIs that provide new information
APIs that provide new information that's primarily useful to support the ancillary uses, like element paint @@ -1261,34 +1263,36 @@ [=User agents=] usually can't do much about data collection from the [=non-ancillary APIs=]. -There is disagreement about how [=user agents=] should handle [=summarizing -ancillary APIs=]. Advocates of these APIs argue that they're hard to use to +There is disagreement about how [=user agents=] should handle [=ancillary APIs +computed from existing information=]. Advocates of these APIs argue that they're +hard to use to extract [=personal data=], they're more efficient than collecting the same information through [=non-ancillary APIs=], sites are less likely to adopt these APIs if a significant number of people turn them off, and that the act of -turning them off can ironically contribute to [=browser fingerprinting=]. +turning them off can contribute to [=browser fingerprinting=]. Opponents argue that if data's easier or cheaper to collect, more sites will -collect it, and because there's still some risk and cost, users should be able +collect it, and because there's still some risk, users should be able to turn off this group of APIs that probably won't directly break a site's functionality. -We do have consensus that [[[#information]]] governs [=summarizing ancillary -APIs=] and that: +We do have consensus that [[[#information]]] governs [=ancillary APIs computed +from existing information=] and that: 
Specifications -for [=summarizing ancillary APIs=] and [=novel ancillary APIs=] should identify -them as such, so that [=user agents=] can provide appropriate choices for their -users. +for [=ancillary APIs computed from existing information=] and [=ancillary APIs +that provide new information=] should identify them as such, so that [=user +agents=] can provide appropriate choices for their users.
-#### Designing novel ancillary APIs {#designing-novel-ancillary-apis} +#### Designing ancillary APIs that provide new information {#designing-ancillary-apis-with-new-information}
[=Novel -ancillary APIs=] should not reveal any [=personal data=] that isn't already -available through other APIs, without permission. +id="principle-ancillary-apis-with-new-information-shouldnt-reveal-personal-data"> +[=Ancillary APIs that provide new information=] should not reveal any [=personal +data=] that isn't already available through other APIs, without permission. +
Most [=ancillary uses=] don't require that a site learn any [=personal data=]. @@ -1319,7 +1323,8 @@ * API designers can check carefully that the API doesn't reveal _new_ [=personal data=], as described by [[[#information]]]. For example, the API might reveal that a person has a fast graphics card, that they click slowly, or that they - use a certain proxy, but the fact that they click slowly is already revealed + use a certain proxy, but the fact that they click slowly is already +
unavoidably revealed by DOM event timing. * [=User agents=] can ask their users' permission to enable this class of API. To reduce [=privacy labor=], a [=user agent=] could use a first-run dialog to @@ -1336,14 +1341,17 @@ the person's permission.
-User -agents should provide a way to disable [=novel ancillary APIs=]. + +User agents should provide a way to disable [=ancillary APIs that provide new +information=]. +
Some people may want to save processing time or bandwidth that's not necessary to achieve their immediate goals, or they might know something about their specific situation that makes the API designers' general decisions inappropriate -for them. Because the information provided by [=novel ancillary APIs=] isn't +for them. Because the information provided by [=ancillary APIs that provide new +information=] isn't available in any other way, [=user agents=] should let people turn them off, despite the additional risk of [=browser fingerprinting=]. From ee991e1061caca02f673385abed63d31751f9f6f Mon Sep 17 00:00:00 2001 From: Jeffrey Yasskin Date: Wed, 15 Nov 2023 09:36:05 -0800 Subject: [PATCH 6/7] Fix a section link. --- index.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/index.html b/index.html index 1f3b6692..23c9cb52 100644 --- a/index.html +++ b/index.html @@ -1237,8 +1237,8 @@ APIs that filter, summarize, or time-shift information available from [=non-ancillary APIs=], like the [[[event-timing]]] and IntersectionObserver. See - [[#information]] for restrictions on how existing non-ancillary APIs can be - used to justify new ancillary APIs. + [[[#information]]] for restrictions on how existing non-ancillary APIs can + be used to justify new ancillary APIs.
Ancillary APIs that provide new information
From b2e5a25daeec97eb16bec76a20c3e25d180af94c Mon Sep 17 00:00:00 2001 From: Jeffrey Yasskin Date: Wed, 15 Nov 2023 10:44:36 -0800 Subject: [PATCH 7/7] Take suggestions from the 2023-11-15 meeting. --- index.html | 20 +++++++++----------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/index.html b/index.html index 23c9cb52..9825cfe6 100644 --- a/index.html +++ b/index.html @@ -1260,12 +1260,9 @@ [=user agents=] should seek to understand and respect people's goals and preferences about use of this data. -[=User agents=] usually can't do much about data collection from the -[=non-ancillary APIs=]. - -There is disagreement about how [=user agents=] should handle [=ancillary APIs -computed from existing information=]. Advocates of these APIs argue that they're -hard to use to +The task force does not have consensus about how [=user agents=] should handle +[=ancillary APIs computed from existing information=]. +Advocates of these APIs argue that they're hard to use to extract [=personal data=], they're more efficient than collecting the same information through [=non-ancillary APIs=], sites are less likely to adopt these APIs if a significant number of people turn them off, and that the act of @@ -1275,8 +1272,7 @@ to turn off this group of APIs that probably won't directly break a site's functionality. -We do have consensus that [[[#information]]] governs [=ancillary APIs computed -from existing information=] and that: +Because different users are likely to have different preferences: 
Specifications @@ -1291,7 +1287,8 @@ [=Ancillary APIs that provide new information=] should not reveal any [=personal -data=] that isn't already available through other APIs, without permission. +data=] that isn't already available through other APIs, without an indication +that doing so aligns with the user's wishes and interests.
@@ -1331,8 +1328,9 @@ ask the user whether they generally support sharing this data, rather than asking for each use of the APIs. -API designers should maintain APIs that had to make one of these choices and -should keep trying to evolve them toward aggregating the data instead. +If an API had to make one of these choices, and then something else about the +API needs to change, designers should consider replacing the whole API with one +that avoids exposing [=personal data=]. Some other [=ancillary uses=] do require that a person be connected to their data. For example, a person might want to file a bug report that a website