Data Sources
This page lists the main sources we used to complete our project objectives (as described on the front page of this repository) and outlines how we used each one.
We can split our sources into two main sections:
- Calculations
  - What sources we used to perform our calculations
- Shapefiles
  - What sources we used for our shapefiles
To calculate the type of damage to be expected after a specific earthquake magnitude, we used:
- The Fragility Function Viewer (the CLiP Fragility Function Database)
- OHELP (for PGA values)

The feature layers (points and shapes) shown on our map are shapefiles retrieved from:
- OHELP
- Geofabrik (OpenStreetMap data)
- ODOT
The Fragility Function Viewer is a web-based user interface for the Cascadia Lifelines Program (CLiP) Fragility Function Database. This database consists of fragility functions (or fragility curves), which express the conditional probability of failure for a component or a system. In this context, "fragility" can be defined as "the quality of being easily broken or damaged".
Please refer to the Official User Guide for more information about the Fragility Function Viewer.
The data contained in the Fragility Database is the foundation for our project's calculations. To view this data in CSV or Excel format, find it in our Fragility Database Google Drive folder.
Figure 1: The Fragility Database
Note: The Fragility Database contains data from regions around the world. For the scope of this project, we only focused on regions within the USA.
Each row in the Fragility Database is a unique fragility function. The fields and their values describe what kind of fragility function it is, and how it can be used. The following is a screenshot taken from "Defining Appropriate Fragility Functions for Oregon", the official report for the Fragility Database, which highlights the fields that were essential to our calculations:
Figure 2: Fragility Database fields
The fragility probabilistic distribution field indicates the type of probability distribution each fragility function follows. A probability distribution is a mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment[1]. In our case, the experiment is a disaster event, i.e., an earthquake, and the outcomes are the possible damage states an infrastructure will sustain, i.e., its fragility.
There are four different types of fragility probabilistic distribution functions in the Fragility Database:
- Normal
- Lognormal
- Discrete
- Other
These functions return an exceedance probability (EP) value. This value is the probability that a specific damage state will occur in the event of a disaster. Plotting these values against the intensity measure produces a curve, hence the term "fragility curve".
Figure 3: Lognormal fragility curve graph for damage states slight, moderate, extensive, and complete
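For illustration, curves like the ones in Figure 3 can be reproduced by evaluating a lognormal CDF over a range of IM values. This is only a sketch: the median/standard-deviation pairs below are hypothetical placeholders, not values from the Fragility Database.

```python
# Minimal sketch of how a fragility curve like Figure 3 can be drawn: evaluate
# a lognormal CDF over a range of intensity measure (IM) values for each
# damage state. The med/std pairs are hypothetical placeholders only.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import lognorm

IM = np.linspace(0.01, 2.0, 200)  # e.g., PGA values

# Hypothetical (median, standard deviation) pairs for four damage states
damage_states = {
    "Slight":    (0.30, 0.40),
    "Moderate":  (0.55, 0.45),
    "Extensive": (0.80, 0.50),
    "Complete":  (1.20, 0.55),
}

for name, (med, std) in damage_states.items():
    # For a lognormal fragility curve, the exceedance probability (EP) at
    # each IM is the lognormal CDF with scale = median.
    ep = lognorm.cdf(IM, std, scale=med)
    plt.plot(IM, ep, label=name)

plt.xlabel("Intensity Measure (PGA)")
plt.ylabel("Exceedance Probability (EP)")
plt.legend()
plt.show()
```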
For our normal fragility probabilistic distribution functions, the following Python function is used:
```python
import math

# Returns the exceedance probability (EP) as a floating-point number
def normal_cdf(IM, med, std) -> float:
    return (1 + math.erf((IM - float(med)) / math.sqrt(2) / float(std))) / 2
```
This function comes from "Probability Distributions with Python: Discrete & Continuous", a Medium post by Paul Apivat that goes over distribution functions.
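As a quick sanity check of this function (the IM, med, and std values below are chosen purely for illustration):

```python
import math

def normal_cdf(IM, med, std) -> float:
    return (1 + math.erf((IM - float(med)) / math.sqrt(2) / float(std))) / 2

# At the median, the normal CDF is 0.5, so the EP is 0.5.
print(normal_cdf(0.5, 0.5, 0.2))   # 0.5
# One standard deviation below the median gives an EP of roughly 0.159.
print(normal_cdf(0.3, 0.5, 0.2))   # ~0.1587
```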
For our lognormal fragility probabilistic distribution functions, the following Python function is used:
```python
from math import exp, log
from scipy.stats import lognorm

# Returns the exceedance probability (EP) as a floating-point number
def lognormal_cdf(IM, med, std) -> float:
    return lognorm.cdf(IM, float(std), scale = exp(log(float(med))))
```
This function uses the lognorm distribution from SciPy (see the scipy.stats.lognorm SciPy API reference page).
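As a small sanity check (illustrative values only): because SciPy's lognorm uses the median as its scale parameter, lognormal_cdf(IM, med, std) is equivalent to evaluating a standard normal CDF at ln(IM/med)/std.

```python
from math import exp, log
from scipy.stats import lognorm, norm

def lognormal_cdf(IM, med, std) -> float:
    return lognorm.cdf(IM, float(std), scale = exp(log(float(med))))

IM, med, std = 0.4, 0.5, 0.25   # illustrative values only

# Both lines print the same EP, because the lognormal CDF at IM equals the
# standard normal CDF at ln(IM / med) / std.
print(lognormal_cdf(IM, med, std))
print(norm.cdf(log(IM / med) / std))
```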
For our discrete fragility probabilistic distribution functions, the following Python function is used:
```python
from scipy import interpolate

# Returns the exceedance probability (EP) as a floating-point number
def discrete(IM, med, std) -> float:
    # First, convert med and std to individual lists containing floats
    x = [float(num) for num in med.split()]
    y = [float(num) for num in std.split()]
    # If the IM value we choose is less than the first element in the x list
    # (which should be the smallest one by default), then linear interpolation
    # will fail. We can get around this by defaulting the IM value to that
    # first element.
    if IM < x[0]:
        IM = x[0]
    # Use the interpolate function to take in the float lists and build a function.
    # fill_value is set to 'extrapolate', which means we can calculate points
    # outside the data range.
    linearInterpolation = interpolate.interp1d(x, y, fill_value='extrapolate')
    # Call this interpolation function with the IM as an argument to calculate
    # and return the EP value
    return float(linearInterpolation(IM))
```
For discrete fragility probabilistic distribution functions in the Fragility Database, the med and std parameters will be strings of increasing values, e.g.:

```python
med = "0.15 0.25 0.35 0.45 0.55 0.65 0.75 0.85 0.95 1.05 1.25"
std = "0.083 0.125 0.177 0.266 0.369 0.479 0.599 0.698 0.786 0.864 0.937"
```
As opposed to our normal and lognormal functions, the EP values for our discrete fragility functions are already known: they are the values in the std parameter. The med parameter contains the corresponding known IM values. Both parameters contain the same number of values, in increasing order. Plotting these values on a graph with the std EP values on the Y-axis and the med IM values on the X-axis, we get our fragility curve.
Figure 4: Discrete fragility curve graph for damage state minor, and a point selected
However, what if we want to calculate an EP value with an IM value that is not in either of these parameters? In this case, the interpolate.interp1d function (see the scipy.interpolate.interp1d SciPy API reference page) lets us calculate the EP values we don't know using linear interpolation.
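For example, using the med and std strings shown above with an IM value that is not one of the known points (the IM value 0.30 is chosen only for illustration):

```python
from scipy import interpolate

def discrete(IM, med, std) -> float:
    x = [float(num) for num in med.split()]
    y = [float(num) for num in std.split()]
    if IM < x[0]:
        IM = x[0]
    linearInterpolation = interpolate.interp1d(x, y, fill_value='extrapolate')
    return float(linearInterpolation(IM))

med = "0.15 0.25 0.35 0.45 0.55 0.65 0.75 0.85 0.95 1.05 1.25"
std = "0.083 0.125 0.177 0.266 0.369 0.479 0.599 0.698 0.786 0.864 0.937"

# 0.30 sits halfway between the known IM values 0.25 and 0.35, so the EP is
# halfway between the known EP values 0.125 and 0.177, i.e. 0.151.
print(discrete(0.30, med, std))
```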
For our other fragility probabilistic distribution functions, the following Python function is used:
```python
# Returns a repair rate as a floating-point number
def polynomial(IM, polynomial, IMDesc, D = 1) -> float:
    # Cast the D parameter to a string and wrap it in parentheses.
    # Some formulas use log, so we want to make sure the D parameter is
    # wrapped in parentheses.
    D = "(" + str(D) + ")"
    # Substitute the IM and D values into the polynomial string and convert
    # "^" to Python's "**" operator.
    temp = str(polynomial).replace(IMDesc, str(IM)).replace("D", D).replace("^", "**")
    # Evaluate the string as Python code and return the calculated value.
    return eval(temp)
```
In the Fragility Database, any fragility function that is not normal, lognormal, or discrete is a polynomial function. These functions already exist in the Fragility Database as strings under the Fragility_polynomial field.
Figure 5: Fragility_polynomial field
Polynomial fragility functions were unused in the scope of this project because they only apply to buried pipelines, and we didn't have any shapefiles for them. However, the Python function is set up to work in the event they are added to this project.
We take the polynomial value (which is a string) as a parameter. It will contain one or both of the following:
- An IM description (e.g., PGA, PGV, etc.)
- D (buried pipeline diameter)
Using the .replace function, we replace these variables with the actual values we want to plug into the polynomial. Then, we use the eval function to evaluate the resulting string as Python code, which gives us the value we want.
Polynomial functions will not return an EP value. Instead, the value returned will be a repair rate. These repair rates are located in the Fragility_description field and there will be two different kinds:
- Repairs/1000ft
- Repairs/Km
Figure 6: Polynomial line graph showing Repairs/1000ft repair rate
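To illustrate the substitution mechanics described above, here is a minimal usage sketch; the polynomial string, PGV value, and pipe diameter are made up for this example and do not come from the Fragility Database.

```python
def polynomial(IM, polynomial, IMDesc, D = 1) -> float:
    D = "(" + str(D) + ")"
    temp = str(polynomial).replace(IMDesc, str(IM)).replace("D", D).replace("^", "**")
    return eval(temp)

# Hypothetical example only: "PGV" is replaced with 30, "D" with "(6)", and
# "^" with "**", so eval computes 0.0012 * 30 ** (1.5) * (6), the repair rate
# for this made-up pipeline formula.
print(polynomial(30, "0.0012*PGV^(1.5)*D", "PGV", D = 6))
```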
Each fragility probabilistic distribution function takes in some or all of the following three parameters:
- IM (Intensity Measure)
- med (Median)
- std (Standard Deviation)
In the Fragility Database, there are six different intensity measures:
- PGA (Peak Ground Acceleration)
- PGD (Peak Ground Displacement)
- PGV (Peak Ground Velocity)
- Return Period
- Sa (Spectral Acceleration)
- Wind Speed
While our Python functions are set up to accept any intensity measure value, for the scope of this project, we only had access to PGA values.
During an earthquake, the ground shakes and experiences acceleration. This acceleration is not a measure of the total energy (magnitude, or size) of an earthquake, but rather of how hard the earth shakes at a given geographic point. Thus, peak ground acceleration is the maximum ground acceleration at these points[2].
To provide access to PGA values in Oregon, OHELP offers four different soil maps, each one for a specific earthquake magnitude.
Figure 7: OHELP PGA Option
Figure 8: 2D 8.1 earthquake magnitude soil map
As seen in Figure 8, each soil map will have a legend that illustrates what PGA values to expect in a specific region on the map.
In the Fragility Database, the median and standard deviation values are already given. A single fragility function can have up to five different median and standard deviation values depending on the number of damage states it has.
A damage state is the specific state of damage an infrastructure will be in after an earthquake of a specific magnitude. When we calculate an EP value, we are calculating the probability a specific damage state has of occurring. Therefore, given an earthquake magnitude, we can figure out what kind of damage is to be expected on different infrastructures, and the exact probability that kind of damage has of occurring.
In the Fragility Database, a single fragility function can have up to five different damage states. Each damage state is calculated using the appropriate fragility probabilistic distribution function, with the PGA value at the infrastructure's coordinates as the IM parameter and the damage state's median and standard deviation values as the med and std parameters, respectively. This means that if a single fragility function has five different damage states, five different EP values will be calculated, one for each damage state.
To avoid misinterpretation, we can follow an example of how to calculate the damage states for a single bridge.
Let's say we have a bridge in Seaside, Oregon, and we want to calculate the damage states of it in the event of an earthquake magnitude of 9.0.
Figure 9: Satellite image of a bridge in Seaside, OR
- First, we need to know what fragility probabilistic distribution function to use. In the Fragility Database, there is a Fragility_no (fragility number) field that acts as an index for every fragility function in the database. For this specific bridge, we are using fragility function number 639. Looking at this fragility function in the Fragility Function Viewer, we can see that we'll be using the lognormal fragility probabilistic distribution function.
- Next, we prepare our parameters. First is our IM parameter. This value will not be present in the Fragility Database; it will be the PGA value at this bridge's coordinates in the event of an earthquake magnitude of 9.0. Provided by OHELP, this value is 0.392129.
- Then, in the No_of_damage_state field, we see that there are four different damage states for this fragility function. In the Fragility Database, these damage states and their respective median and standard deviation values will be separated into different columns. For this fragility function, Damage_state_1 is the Slight damage state, Damage_state_2 is the Moderate damage state, etc.
- Now we have everything we need to plug into our Python lognormal function. We can visualize our parameters and values in a table (our PGA value does not change because our bridge's coordinates do not change):

  | Damage State | med   | std   | IM (PGA) |
  |--------------|-------|-------|----------|
  | Slight       | 0.286 | 0.132 | 0.392129 |
  | Moderate     | 0.537 | 0.212 | 0.392129 |
  | Extensive    | 0.71  | 0.312 | 0.392129 |
  | Complete     | 1.167 | 0.622 | 0.392129 |

- Lastly, we use our lognormal Python function to calculate and print the EP value for each damage state:
```python
from math import exp, log
from scipy.stats import lognorm

# --------------------------------------------------------------------
# Lognormal Fragility Probabilistic Distribution Function
# --------------------------------------------------------------------
def lognormal_cdf(IM, med, std) -> float:
    return lognorm.cdf(IM, float(std), scale = exp(log(float(med))))

# PGA value at the bridge's coordinates in the event of an earthquake
# magnitude of 9.0.
IM = 0.392129

# Calculate Slight Damage
print(lognormal_cdf(IM, 0.286, 0.132))

# Calculate Moderate Damage
print(lognormal_cdf(IM, 0.537, 0.212))

# Calculate Extensive Damage
print(lognormal_cdf(IM, 0.71, 0.312))

# Calculate Complete Damage
print(lognormal_cdf(IM, 1.167, 0.622))
```
Running these functions, we end up with the following EP values:
| Damage State | EP        |
|--------------|-----------|
| Slight       | 0.9915965 |
| Moderate     | 0.0690302 |
| Extensive    | 0.0285    |
| Complete     | 0.0397686 |

Each EP value is the probability of that specific damage occurring.
Thus, in the event of an earthquake magnitude of 9.0, this specific bridge in Seaside, OR has a:
- 99% chance of being slightly damaged,
- 6% chance of being moderately damaged,
- roughly 3% chance of being extensively damaged,
- 3% chance of being completely damaged.
The Oregon Hazard Explorer for Lifelines Program (OHELP) is a web-GIS tool to aid engineers in acquiring the most recent available seismic data and assessing earthquake hazards in Oregon[3].
For our project, we used OHELP to:
- Estimate PGA values for other infrastructures
- Retrieve shapefiles for bridges in Oregon
OHELP provides a Bridge Damage Assessment tool that lets us know what kind of damage to expect on bridges after an earthquake of a specific magnitude.
Figure 10: OHELP Bridge Damage Assessment Option
Figure 11: Damaged bridges in Seaside, OR after earthquake magnitude 9.0
As shown in Figures 10 and 11, in the event of an earthquake magnitude of 9.0, we can expect some bridges in Seaside, OR to be slightly damaged and extensively damaged.
Furthermore, selecting one of these points, we can view its attributes which include the already calculated EP values for each damage state and the PGA value.
Figure 12: Bridge attributes showing EP and PGA values
However, OHELP does not provide this damage assessment for other types of infrastructures. Therefore, it will not have any PGA (or any other intensity measure) data for other infrastructures. This is a problem because in order to calculate damage states and their corresponding EP values, every infrastructure will need to have some kind of intensity measure value as a part of its attributes.
We solved this by getting access to the bridges from OHELP. Every bridge has its own PGA values specific to its geographic point. To estimate the PGA values for other infrastructures, we calculated the geographic distance from the infrastructure to the nearest bridge and then transferred that bridge's PGA values to the infrastructure's attributes. While these other infrastructures won't have perfectly accurate PGA values, this process can be done programmatically. As a result, we save a lot of time while still getting sufficiently accurate PGA values.
To better understand this problem and its solution, we can follow an overview of the process:
We want to use fragility functions in the Fragility Database to calculate what kind of damage is to be expected on specific infrastructures in the event of an earthquake.
In order to calculate an infrastructure's damage states and their EP values, we need to know the infrastructure's intensity measure at its geographic point. This intensity measure can be any of the six intensity measures in the Fragility Database. However, the Fragility Database does not specify which intensity measure to use, nor its value. Thus, a damage state cannot be calculated.
OHELP provides a User Manual, where towards the end it has a tutorial on how to connect to its data server, thus giving access to all of its shapefiles and their attribute data.
Connecting to this server using ArcGIS Pro and accessing the bridges, we can see that, for every bridge, there is a PGA value for specific earthquake magnitudes.
Figure 13: ArcGIS Pro attribute table with PGA values highlighted
Figure 13 shows the attribute table for the feature layer M90_09_Hazus_Bridges_Fragility, where:
- M90 is an abbreviation for Magnitude 9.0,
- 09 is the index number for this feature layer on the OHELP server,
- Hazus_Bridges_Fragility is the type of Bridge Damage Assessment option we chose, as shown in Figure 10.
The PGA values are highlighted and are specific not only to the geographic point of each bridge, but also to the chosen earthquake magnitude (in this case, 9.0).
From the OHELP server, we can see that there is a feature layer for each earthquake magnitude available.
Figure 14: M81-M90 attribute table tabs
There is only data for four different earthquake magnitudes:
- 8.1
- 8.4
- 8.7
- 9.0
Therefore, there will be four different PGA values for every bridge, one for each earthquake magnitude. We can merge these four different attribute tables to have all four PGA values in only one attribute table.
Figure 15: Merged bridge attribute table with all four PGA fields
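We performed this merge in ArcGIS Pro, but a roughly equivalent merge could be sketched with pandas; the CSV file names and the BridgeID column below are illustrative assumptions, not the actual OHELP layer schema.

```python
from functools import reduce
import pandas as pd

magnitudes = ["81", "84", "87", "90"]
tables = []
for m in magnitudes:
    # Hypothetical per-magnitude export of the bridge attribute table
    df = pd.read_csv(f"M{m}_Hazus_Bridges_Fragility.csv")
    # Rename the PGA column so the four magnitudes stay distinguishable
    df = df.rename(columns={"PGA": f"PGA_M{m}"})
    tables.append(df[["BridgeID", f"PGA_M{m}"]])

# Merge all four tables on the shared bridge ID so every bridge row carries
# PGA_M81, PGA_M84, PGA_M87, and PGA_M90
merged = reduce(lambda left, right: pd.merge(left, right, on="BridgeID"), tables)
print(merged.head())
```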
The last step in this thought process was the realization that we can simply find the shortest geographic distance between any infrastructure and the nearest bridge, then transfer that bridge's PGA values to the infrastructure.
To actually transfer the PGA values from the Bridges attribute table (as shown in Figure 15) to any infrastructure, we created a copy of the merged table, cut out the fields we didn't need, and exported it as an Excel table in our PGA Values folder located in this repository.
Figure 16: BridgePGAs Excel table
Then, using the read_excel function from the pandas library, we can read the BridgePGAs Excel table within a Python script that also allows us to perform the shortest-distance calculation and transfer the right PGA values.
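The exact implementation lives in the PreparationForDisasterImpact tool linked below. Purely as a sketch of the idea, and assuming the BridgePGAs table has Lat, Lon, and PGA_M81 through PGA_M90 columns (these column names and the file name's extension are assumptions), a nearest-bridge lookup could look like this:

```python
# Rough sketch of the nearest-bridge PGA transfer, not the exact script in
# this repository. The column names (Lat, Lon, PGA_M81 ... PGA_M90) and the
# file name are assumptions for illustration.
import math
import pandas as pd

def haversine(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(a))

# Read the exported bridge table (see Figure 16)
bridges = pd.read_excel("BridgePGAs.xlsx")

def nearest_bridge_pgas(lat, lon):
    """Return the PGA columns of the bridge closest to (lat, lon)."""
    distances = bridges.apply(
        lambda row: haversine(lat, lon, row["Lat"], row["Lon"]), axis=1)
    closest = bridges.loc[distances.idxmin()]
    return closest[["PGA_M81", "PGA_M84", "PGA_M87", "PGA_M90"]]

# Example: estimate the PGA values for an infrastructure point near Seaside, OR
print(nearest_bridge_pgas(45.993, -123.922))
```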
For an explanation on how the calculations are performed, visit the PreparationForDisasterImpact tool page.
For this project, the points and shapes seen on the map are from the following sources:
As stated in the thought process of estimating PGA values, we used the OHELP server to get access to the bridges in Oregon.
Figure 17: Bridges in Seaside, OR from OHELP in ArcGIS Pro
Geofabrik is a service that provides geodata from around the world, to the world. This geodata comes from OpenStreetMap (OSM), which is a map of the world, similar to Google Maps, but with data added and edited by the general public.
For our project, we downloaded OSM data for the state of Oregon.
Figure 18: OSM data in ArcGIS Pro
As shown in Figure 18, OSM provides a large variety of data (the icons and colors were added in ArcGIS Pro), ranging from points and shapes of specific buildings like schools and gas stations to land use shapes like parks and industrial areas.
Geofabrik also provides an extremely useful guide on how its data is structured.
The Oregon Department of Transportation (ODOT) provides a variety of maps and GIS products to meet the needs of statewide transportation planning, infrastructure, and engineering.
For our project, we used the or_trans_network_public.gdb file from ODOT's GIS Data Transportation Network section. This dataset provides the roads of Oregon.
Figure 19: ODOT roads as lines with arrows
To view all of this data in a map using ArcGIS Online, visit our team's GADEP Oregon Coast Data page.
One thing to note is that the data on our ArcGIS Online is different from the raw data provided by the sources listed above. For example:
- Some of the OSM data was cut out due to it being too large.
- Our version of the OSM data contains a County column that specifies which Oregon county each asset belongs to.
- Our ODOT roads contain extra fields. A spatial join analysis was performed against OSM's version of Oregon's roads, resulting in our ODOT feature layer having fields like maxspeed (which denotes the maximum speed for a particular road) and oneway (which denotes whether or not a road is one-way); a rough sketch of this kind of join follows this list.
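For reference, the spatial join in the last bullet was performed in ArcGIS Pro; one way to approximate it outside ArcGIS is with GeoPandas. The layer name, the OSM file name, and the distance threshold below are illustrative assumptions.

```python
# Rough GeoPandas equivalent of the ArcGIS Pro spatial join described above.
# The layer name, OSM file name, and distance threshold are assumptions.
import geopandas as gpd

# ODOT roads from the geodatabase named above, plus a hypothetical OSM roads export
odot_roads = gpd.read_file("or_trans_network_public.gdb", layer="roads")
osm_roads = gpd.read_file("oregon_osm_roads.shp")

# Put both layers in the same coordinate system before joining
osm_roads = osm_roads.to_crs(odot_roads.crs)

# Attach OSM attributes such as maxspeed and oneway to the nearest ODOT road;
# max_distance is in the units of the CRS and is an illustrative threshold.
joined = gpd.sjoin_nearest(
    odot_roads,
    osm_roads[["maxspeed", "oneway", "geometry"]],
    how="left",
    max_distance=50,
)
print(joined[["maxspeed", "oneway"]].head())
```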