# OHI Data Planner {#data-planner}
The purpose of Chapter \@ref(planner-guide) is to help you think about data for your assessment in a structured way. This is a 1-hour hands-on training: you will be following along on your own computer and working with a copy of the demonstration repository that is used throughout this chapter.
## Overview
A successful OHI+ assessment depends on representative data and models. With so much (or sometimes so little) information out there, how do you select the best data to include in your study? How do you develop models that best reflect the conditions of your ocean health? How do you keep track of the ideas you have and the decisions you make? How do you work with your team so that everyone can contribute ideas? Here we have developed an OHI Planner to guide you through this important, yet sometimes daunting, process of data selection and model exploration.
We'll start by exploring the columns in the spreadsheet and what they are for, and then walk through examples of how to fill them out.
### Prerequisites
The [OHI Data Planner](https://docs.google.com/spreadsheets/d/1f-vwUrQHrMs7cjhChuK2YnlbkE_5kQ6AgGwyV2eJ1vA/edit?usp=sharing) is a document that can help you organize your thinking around the data available and how it could be used in your OHI+ assessment.
Here we will walk through the OHI Data Planner together.
## Data Planner spreadsheet
Information from the spreadsheet is not going to be used directly by the Toolbox. Rather, this Data Planner is built to start the collaborative OHI process and help you think about data. The Planner will help your whole team design your plan and contribute ideas. Then, the technical team will be able to translate this plan into calculations in the Toolbox.
The Data Planner serves as a way to track ideas about data and to see what the data look like and whether they can be used in your assessment. It can seem daunting to have to write down what you want the models to be, but here you can simply record what you're thinking and what the ideal would be. It's a starting point that you can share with your colleagues to think about together as you start looking for data.
The data search is not a one-time, final phase of an assessment. You might find later that your initial data ideas are not adequate or feasible to use for calculations, or that they don't fit the spatial boundaries you wanted to use. That's okay. OHI is an iterative process, and you can always come back to this spreadsheet and edit it further. The spreadsheet thus serves as a collaborative notebook where you and your colleagues record ideas for future reference.
There are three tabs to be filled out: Status Data, Pressures Data, and Resilience Data. Now let's take a look at each one of them.
## Status Data tab
Open the "Status Data" tab and you'll see these headings, which are the columns to be evaluated for each goal:
![](https://docs.google.com/drawings/d/e/2PACX-1vQCvSiXZdlEiThyX6_DWryKDOZImQFj_0dtVL4Co6YvupBx-60mx3amZqJ023vxRPhILeHehPxn9MIf/pub?w=960&h=96)
This sheet is colored to help structure your thinking as you consider data. Generally you would want to work through this Planner from left to right. But note that _the process of building models, finding data, and defining spatial boundaries is iterative_. You might work on those components simultaneously in order to strike a balance between an ideal approach and data limitations.
Let's walk through each header first.
### Goals and subgoals
![](https://docs.google.com/drawings/d/e/2PACX-1vQF_jJF0FgtFoZiGf8oMEacVslyQZdO7uoYEvRi5_mEG2kCcR--t34mMj7GYrdBnv4M7RCNY-81Nvgw/pub?w=960&h=144)
In the Global assessment, we identified ten “goals” - universally valued benefits provided by the ocean. However, some of them might not be relevant in your region. You may remove irrelevant goals from the spreadsheet.
For example, the Baltic Sea region doesn't get big storms because of its geography. The Coastal Protection goal is therefore irrelevant there and was removed from the Baltic assessment.
### Model + Data + Spatial Boundaries
![](https://docs.google.com/drawings/d/e/2PACX-1vQwRzxEuSxD09EyNUY5igXS7RdQFnk5XrAr3RXYkdBGzDIEMk73xWfR78urrju4oYTAy1TzRBJzFWFd/pub?w=960&h=144)
Goal Models and Spatial Boundaries need to be considered simultaneously, particularly in the initial data gathering stage. You may choose to set spatial boundaries based on political or geological boundaries (or some combination), and your choice will either be validated or need to be adjusted based on the data that are available for those boundaries. For example, let's say you choose to assess 5 provinces separately. Then, as you gather available data, you learn that data for provinces 1 and 2 are reported together and are difficult to tease apart. In this case, you will need to consider combining regions 1 and 2 into one region in your assessment, or looking for other data sources to represent this goal. It is also important to look across other goals and the data available for them: do most data sets report provinces 1 and 2 together, or is this the only one? This will help you decide whether to explore new spatial boundaries or new data sources.
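The cross-goal check described above can be kept as a simple tally. The following Python sketch is only an illustration: the data set names and province groupings are hypothetical, and the Planner itself does not require any code.

```python
# Hypothetical catalog: for each candidate data set, the groups of provinces
# that are reported together as a single value.
reporting_units = {
    "fish_stock_survey": [{"province_1", "province_2"}, {"province_3"}],
    "poverty_statistics": [{"province_1"}, {"province_2"}, {"province_3"}],
    "port_registry": [{"province_1", "province_2"}, {"province_3"}],
}


def reported_together(groups, first, second):
    """Return True if both provinces fall inside one reporting unit."""
    return any(first in group and second in group for group in groups)


# Names of the data sets that cannot separate provinces 1 and 2.
combined = sorted(
    name
    for name, groups in reporting_units.items()
    if reported_together(groups, "province_1", "province_2")
)
share_combined = len(combined) / len(reporting_units)

print(combined)
print(f"{share_combined:.0%} of data sets combine provinces 1 and 2")
```

If most data sets combine the two provinces, merging them into one region may be the more practical choice. If only one does, looking for another source for that goal may be the better option.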
Let's look at these individually:
**Represent goals with models**
![](https://docs.google.com/drawings/d/e/2PACX-1vTFUwZQ6MKe-1ihCUMDWAv4yuX-D8oQnLYrGS4dEbMYIugvnj8j94sfquN9g09SMeHzg2GzKY4QyvPs/pub?w=960&h=125)
_Model approaches?_ You don't need to decide on the mathematical formula here. Together with the next "variables" column, this column mainly aims to get you to start thinking about model approaches. Is the global OHI model adequate for your needs? What should the goal capture?
_What ideal variables best represent the goal?_ Before even thinking about data, think broadly: in your study area, what variables, or characteristics, best represent the philosophy of the goal, regardless of whether data are available?
Sometimes we tend to constrain our model development with the data we know we have. You can build a more representative model by thinking through what are _the most ideal variables_ (eg. popularity of a region among tourists) to capture information, then find _data that best reflect those variables_ (eg. hotel vacancy rates? international flights to this region? surf shop revenue? cruise ship revenue? etc).
Different OHI+ projects have been creative in how they represent goals as data permit. Check out the OHI-Science Goals page ([ohi-science.org/goals](http://ohi-science.org/goals/)). It can help you understand the philosophy of a goal by illustrating what has been done in other locations. Let that catalyze your discussions. Are any of the past approaches suitable, or are there other obvious variables to better represent your goal in your country/region?
Another important factor that determines OHI scores is the _Reference Point_, i.e. the status of a goal that would be awarded a score of 100. Has the government set a target to achieve? Is there a model region that has become a standard other regions should aspire to? Is there a historical point the regions should return to?
**Find data for your ideal or proxy model variables**
![](https://docs.google.com/drawings/d/e/2PACX-1vSoBJMg2MbiwD7xkQQEOD-v6YqOpajRpIbKeawYSfimytwxCPe1aSTyv0HmLfV6O5RCoqf9HqncZBtf/pub?w=960&h=125)
The choice of model variables and reference points is often limited by data availability, time frame, and resolution. The data columns are a place to catalog the data that could work, whether from your own knowledge, from asking your colleagues, or from digging through online databases. Add as many rows as you need to record your ideas. Having this master sheet of ideas will help you keep track of your thoughts and make collaboration with your colleagues easier.
_Years_: Typically having only one year of data is not enough to objectively reflect the status and trend of a goal. Five years of data are recommended. If that's not available, using fewer years of data is possible.
_Spatial resolution_: Data need to be presented at the _region_ level. Regions are the sub-units, or assessment units, of your study area. Each OHI goal/sub-goal will be analyzed and scored at the region level. For example, your study area may be an entire country, while your regions are its coastal provinces or states. You can add more regions as needed on the spreadsheet.
When you find a data source (e.g. national poverty level), you might need to _disaggregate_ it, that is, separate the data from a large scale (e.g. national) to a smaller scale (e.g. provincial). You could contact the agency that publishes the data and see if they have gathered small-scale data. If that's not possible, there are means (e.g. GIS) to separate the large-scale data to local scales, although that could reduce the accuracy of the data.
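As a rough illustration of disaggregation, the Python sketch below splits a national total across provinces in proportion to a weight. Using population as the weight is an assumption made here for illustration, not a method prescribed by OHI, and all names and numbers are hypothetical.

```python
def disaggregate(national_total, weights):
    """Split a national total across regions in proportion to their weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return {
        region: national_total * weight / total_weight
        for region, weight in weights.items()
    }


# Hypothetical inputs: provincial populations and a national poverty count.
population = {"province_1": 200_000, "province_2": 300_000, "province_3": 500_000}
people_below_poverty_line = 50_000

estimates = disaggregate(people_below_poverty_line, population)
print(estimates)
```

Note that this approach gives every province the same poverty rate, so it cannot reveal real differences between regions. That is the loss of accuracy mentioned above, and it is worth recording in the notes column of the Planner.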
_Pressures & Resilience_ are important yet often overlooked components of the OHI. _Pressures_ are factors that negatively affect the status of each goal/subgoal, while _Resilience_ factors improve the status or negate the impacts of the pressures. They will be used to predict the _Future State_. You will need to quantify each factor later in the assessment. But at this stage, we suggest you brainstorm and jot down as many ideas as you can think of.
### Examples
Now that we have run through the basic ideas behind each column, let's walk through a few examples together. They are on the next tabs of the spreadsheet (Example 1 - 4).
**Example 1 - (Model &) Variables**
![](https://docs.google.com/drawings/d/e/2PACX-1vSY6WtpVw6VzNfSBu_Mafu5JyTuSnIWh2yVdMi5jrC7xT4GJb7b6OmatsZoWSPGFyvSDnqDODVFsF5-/pub?w=960&h=240)
Models are relationships among variables. Before gathering any data, think about ideal variables that could represent the goal. How could each variable be relevant/representative of ocean health?
In this example, [artisanal fishing opportunities](http://ohi-science.org/goals/#artisanal-fishing-opportunities) measure whether people who need to fish on a small, local scale have the opportunity to do so. Thus the variables need to reflect the _need_ for artisanal fishing, how easy or hard it is for fishermen to _access_ ocean resources when they need them, and the _sustainability_ of harvest of all fish stocks that fishermen use.
_The number of artisanal fishermen_ and/or _the number of people below poverty line_ could reflect the need for artisanal fishing.
_Fish stock health_ can indicate the sustainability of fishing in the region.
_Catch_ might be unsuitable, because it reflects the amount fished in the past, not the sustainable capacity for artisanal fishing. The fish stocks might have been over- or under- fished in the past.
_Gas price_ and _number of ports_ could limit the fisherman's ability to access ocean resources.
As you might have noticed, to the right side of the sheet, there is a column for **notes**. In this case for Artisanal Fishing Opportunities (AO), Fish catch is deemed not suitable for the reasons stated above. _It's just as important to document the rationales for excluding a data source as it is to record why you included one._
**Example 2 - Data**
![](https://docs.google.com/drawings/d/e/2PACX-1vTLmLMCMJMWzYW05Zhorwya3a1nB5uIZZ5vhHY0sjv6h-Cs4HSJCCm7tK_EBbv4o-Rygd3FRx21pIkS/pub?w=960&h=240)
These columns detail the data you find for each of the variables you decided were suitable for your model.
Are data available? For fish stock health, there are no readily accessible data sets, so this variable can't be used in your study. But other variables seem to have data.
Are they from a credible source? The quality of your assessment depends on the quality of your data. Government agencies (eg. Energy Information Administration), or research institutes tend to be more credible.
How many years does it cover? As OHI measures _trend_ for each goal, the trend is derived from ocean health status over time, ideally 5 years. In this case, there is sufficient data (1950-2016).
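To see why several years of data matter, here is a minimal Python sketch that fits a straight line through five years of status values and reports the slope. The status values are made up, and the OHI Toolbox's own trend calculation may differ in detail; this only illustrates the idea of deriving a trend from status over time.

```python
def linear_trend(years, values):
    """Least-squares slope of values against years (change per year)."""
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need at least two matching (year, value) pairs")
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    # Sum of squared deviations in x, and of cross-deviations in x and y.
    sxx = sum((year - mean_year) ** 2 for year in years)
    sxy = sum(
        (year - mean_year) * (value - mean_value)
        for year, value in zip(years, values)
    )
    return sxy / sxx


# Hypothetical status scores for the five most recent years.
years = [2012, 2013, 2014, 2015, 2016]
status = [60, 62, 64, 66, 68]

slope = linear_trend(years, status)
print(slope)  # 2.0 status points gained per year
```

With only one year of data no slope can be computed at all, and with two or three years a single unusual value can dominate the result.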
Also, what's the spatial resolution of this source? Spatial resolution should be compatible with the geopolitical boundaries of the regions in your study area. If the source only provides a national average, but the assessment requires provincial-level data, you need to look further.
**Example 3 - Spatial resolution**
![](https://docs.google.com/drawings/d/e/2PACX-1vTlPhVKT-a9rJUkWOof6o2OIGudmQ-IMjeYr16QcIhwmvheZTGwjlxUeO9OszaBXEREUQU2B970JKZd/pub?w=960&h=240)
As mentioned above, data should be disaggregated to the regional level that you determine to be the optimal spatial boundaries of your study.
In the data exploration process, however, you might find that the geopolitical boundaries you initially thought to be ideal are not supported by data. For example, suppose regions 1 and 2 do not have independent data: their data can only be disaggregated from national-level data, so the two regions may end up with indistinguishable values. In that case, you might consider combining regions 1 and 2 into a single region for your assessment.
![](https://docs.google.com/drawings/d/e/2PACX-1vQZjvRbXPAj_VKdJ4Nm5BfFSnivlr-d8K23qHY0HS9CBIcriifSVVIzll1GqpA3QD_lSMKINB_Drsg5/pub?w=960&h=336)
Sometimes there isn't data available, and sometimes data isn't relevant to a region, e.g. if there are no mangroves growing in Region 4. So it's important to have a system for distinguishing these two things. Here, we'll use _NA_ if it's not relevant; that's the syntax that `R` and the OHI Toolbox will use so we can be consistent. When data are not available but they should be, here we'll mark it with an _N_.
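The distinction between _NA_ and _N_ can be checked mechanically once the Planner is filled in. The Python sketch below assumes a hypothetical row of cells for mangrove extent, with "Y" standing in for "data found"; the "Y" marker and the region names are illustrative assumptions, not part of the Planner.

```python
def classify_cell(cell):
    """Interpret a Planner cell using the NA / N convention."""
    text = str(cell).strip()
    if text == "NA":
        return "not relevant"  # e.g. no mangroves grow in this region
    if text == "N":
        return "data gap"  # data should exist but have not been found
    return "data available"


# Hypothetical Planner row: mangrove extent by region.
mangrove_extent = {
    "region_1": "Y",
    "region_2": "N",
    "region_3": "Y",
    "region_4": "NA",
}

status_by_region = {
    region: classify_cell(cell) for region, cell in mangrove_extent.items()
}
data_gaps = [
    region for region, status in status_by_region.items() if status == "data gap"
]
print(data_gaps)  # ['region_2']
```

Only the regions marked _N_ need further data searching; regions marked _NA_ are left out of the goal on purpose.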
**Example 4 - Reference Point**
![](https://docs.google.com/drawings/d/e/2PACX-1vTR8bSMiV-QEovbuUOWOXBjeueRGXqIK1Z1kCa6DVprbWCo2HaPcl1ktjWP_mR8TWf-HHDvuZH64aJg/pub?w=960&h=240)
Reference points, or targets, are essential for rescaling and scoring. For example, suppose the target is to have 100 fishermen fishing and the data show that only 50 are fishing. If the AO model depended only on the number of fishermen, the AO score would be about 50 (pressures and resilience will also act on the score).
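The arithmetic behind this example can be written as a short Python sketch. It assumes the simplest possible model, in which the score is the observed value as a percentage of the reference point, and it caps the score at 100; the cap is an assumption made here, and real goal models are usually more involved.

```python
def rescale_to_reference(value, reference_point):
    """Express a value as a 0-100 score relative to its reference point."""
    if reference_point <= 0:
        raise ValueError("reference point must be positive")
    # Assumption: values above the target are capped at a score of 100.
    return min(value / reference_point, 1.0) * 100


score = rescale_to_reference(50, 100)
print(score)  # 50.0
```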
How should reference point(s) for the model variables be set? They could be set based on information from published policy targets (e.g. government-set targets to achieve), scientific studies (e.g. maximum sustainable yield for a fishery), etc. Sometimes there is no easy-to-find reference point from literature. In those cases, you'll need to discuss with experts and your team to set a reference point that makes the most sense to your area.
Often, reference points are temporal (eg. historical coverage of mangrove) or spatial (eg. the region that has the highest number of tourists). Let's look at a few examples:
_Gas prices_ fluctuate often, so it's difficult to say what an optimal price would be. In this case, we set a spatial reference point: 110% of the lowest gas price across all regions in a given year.
_Number of artisanal fishermen_ might be dwindling over time, and we might want to preserve the number of fishermen in a region. So we set the reference point to be the highest number of fishermen in a region within the past 5 years.
_Number of people below poverty line_ often has a government-set target to achieve, as does _the number of ports_.
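The spatial and temporal reference points described above can be sketched in Python as follows. All prices and counts are hypothetical, and treating "the past 5 years" as the current year plus the four before it is an assumption made for this illustration.

```python
def gas_price_reference(prices_by_region):
    """Spatial reference point: 110% of the lowest regional price in one year."""
    return 1.10 * min(prices_by_region.values())


def fishermen_reference(counts_by_year, current_year, window=5):
    """Temporal reference point: highest count within the last `window` years."""
    # Assumption: the window includes the current year.
    recent = [
        count
        for year, count in counts_by_year.items()
        if current_year - window < year <= current_year
    ]
    if not recent:
        raise ValueError("no data within the reference window")
    return max(recent)


# Hypothetical gas prices for one year, by region.
gas_prices = {"region_1": 2.0, "region_2": 2.5, "region_3": 3.0}

# Hypothetical number of artisanal fishermen in one region, by year.
fishermen = {2011: 150, 2012: 120, 2013: 110, 2014: 105, 2015: 98, 2016: 95}

gas_reference = gas_price_reference(gas_prices)
fishermen_target = fishermen_reference(fishermen, current_year=2016)
print(fishermen_target)  # 120; the 2011 value falls outside the window
```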
<!--- new info --->
**Example 5 - Pressures & Resilience**
As mentioned above, at this stage of your assessment, it's important to start thinking about pressures and resilience - factors that contribute to long-term outcomes, or the future state, of each goal or sub-goal. Think broadly. Here are a few examples:
- What pressures make it harder for small fishermen to get access to the stocks? For example, high gas price, poor fisheries stock health, few ports where fishermen could access the sea, etc.
- What regulations are there to directly increase access and counter the pressure? For example, loan programs for fishermen, stock improvement measures, etc.
- What regional or national policies might indirectly improve the status of this goal? For example, policies to move people out of poverty could decrease the need for artisanal fishing, and clean water laws could improve water quality.
- How does the status of other goals affect this goal? For example, severe water pollution could affect the fisheries.
## Chapter Recap
We have now walked through how to use this planner to orient your team around the goals: thinking about variables and models, gathering data sources within spatial boundaries, and brainstorming pressures and resilience factors. The planner serves as a platform to record and share notes, and to collaborate with your colleagues during the iterative process of an OHI assessment.
You might find that the data for one goal are ready to be processed and calculated with the OHI Toolbox while your team is still exploring options for another goal. It is quite alright to move ahead with some goals before others. Or you might find yourself needing to eliminate some data and add others after you have further explored the data you found earlier. We hope this Planner can provide a home base for your data gathering and exploration. You can come back to this sheet and modify it to suit your own needs at any time during the assessment.