<h2>Data Ecology: Understanding and Designing Data Ecosystems</h2>
<h3>Raul Castro Fernandez, The University of Chicago</h3>
<p>
Data shapes our economic, political, and social ecosystems. However, we have
little control over data's effect on those ecosystems, and its influence can
distort, manipulate, and even undermine them, leading to undesirable
consequences. Like rivers, data affects ecosystems through flows. Analyzing
these dataflows reveals how data is used (and misused) and uncovers
opportunities to harness its value. Dataflows may generate value, such as when
hospitals share patient data to improve care. They can also cause harm, such as
when individuals' data is sold to self-serving data brokers. Missing
dataflows are equally significant: the lack of sharing among competing actors,
such as banks and governments, leaves much potential value unrealized. Despite
dataflows' outsized impact on our lives, we have little insight into what drives
them and lack integrated means to control them when they are harmful. The
available controls are regulation (legal instruments), incentives (economic
instruments), and privacy-enhancing technologies (technical instruments), which
today are developed independently and whose effectiveness is poorly understood.
</p>
<p>
My research agenda, which I call Data Ecology, aims to uncover the principles
that cause dataflows and to design interventions (technical, economic, and
legal) that steer them in a beneficial direction. Given a goal (a desirable
outcome) for a data ecosystem (such as a company, city, or government), which
interventions should we engineer so that agents' actions lead to that goal? The
research line on data ecology includes: i) formalizing this question; ii)
designing new interventions (examples below); and iii) evaluating interventions'
ability to steer dataflows in diverse data ecosystems. While some literature has
explored these questions, data ecology offers a new lens that brings existing
work into a common framework and helps advance our understanding.
</p>
<p>
My group has studied many data ecosystems by applying this dataflow lens,
including data sharing and data markets; we use the latter to illustrate some
data ecology interventions. Data marketplaces suffer from Arrow's information
paradox. Sellers will not release data to buyers before payment (there is no
"try-before-you-buy" with non-rival goods such as data), and buyers will not pay
before understanding the data's benefits; consequently, few transactions occur,
even when beneficial. By applying data ecology's dataflow lens to marketplaces,
we identified the uncertainty faced by buyers and sellers as the cause of poor
market performance, which led us to design a technical intervention to address
it: a data escrow. Sellers register their data with the escrow, and buyers delegate
computation that signals the data's value. For example, the escrow can train and
evaluate an ML model on a seller's dataset and tell the buyer about the
performance improvement without revealing the raw data. This escrow intervention
reduces uncertainty for sellers and buyers, causing data to flow when
beneficial. The data escrow is just one technical intervention; we have combined
data escrows with economic incentives to facilitate the formation of
data-sharing consortia (e.g., among banks and government agencies) and create
beneficial dataflows that do not occur naturally. We are also studying
techno-legal interventions in data ecosystems. Looming regulations and a society
growing uneasy with the current data ecosystem may force changes soon. And if
change is coming, we are better off understanding the effect of interventions on
the data ecosystem.
</p>
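<p>
The escrow interaction described above can be sketched in a few lines of Python.
This is a toy illustration only, not the group's actual system: the
<code>DataEscrow</code> class, its method names, the (feature, label) row
format, and the nearest-centroid model are all assumptions made for the example.
The point it shows is that the buyer receives a single number (the accuracy
gain) and never sees the seller's rows.
</p>

```python
from statistics import mean


def train_centroids(rows):
    """Toy model (assumed for illustration): mean feature value per label."""
    by_label = {}
    for x, y in rows:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def accuracy(model, rows):
    """Fraction of rows whose nearest centroid carries the correct label."""
    hits = sum(
        1 for x, y in rows
        if min(model, key=lambda lbl: abs(x - model[lbl])) == y
    )
    return hits / len(rows)


class DataEscrow:
    """Hypothetical escrow: holds seller data and runs delegated computation."""

    def __init__(self):
        self._datasets = {}  # seller rows stay inside the escrow

    def register(self, seller_id, rows):
        self._datasets[seller_id] = list(rows)

    def performance_gain(self, seller_id, buyer_train, buyer_test):
        """Return only the accuracy change from adding the seller's data."""
        buyer_train = list(buyer_train)
        baseline = accuracy(train_centroids(buyer_train), buyer_test)
        combined_rows = buyer_train + self._datasets[seller_id]
        combined = accuracy(train_centroids(combined_rows), buyer_test)
        return combined - baseline


escrow = DataEscrow()
escrow.register("seller-1", [(2.0, 0), (4.0, 0), (8.0, 1), (10.0, 1)])

buyer_train = [(0.0, 0), (4.0, 1)]
buyer_test = [(1.0, 0), (3.0, 0), (7.0, 1), (9.0, 1)]

gain = escrow.performance_gain("seller-1", buyer_train, buyer_test)
print(f"accuracy gain from seller-1: {gain:+.2f}")
```

<p>
A real escrow would also need to limit how many such queries a buyer can issue,
since repeated value signals can themselves leak information about the data;
that concern is outside the scope of this sketch.
</p>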
<p>
While these interventions are valuable in their own right, the ultimate goal of
data ecology is to provide a general theory and mechanisms for understanding and
controlling dataflows. Data shapes our world, but the final form need not be
fixed. Data ecology provides tools to shape it so it is compatible with our
values. These tools are more critical than ever as data's influence on our world
broadens and intensifies.
</p>
Go back to <a href="http://raulcastrofernandez.com">the main page.</a>