Data Sources

Jose Huerta edited this page May 30, 2022 · 37 revisions

This page lists the main sources we used to complete our project objectives (as described on the front page of this repository) and outlines how we used each one.

List of Sources

We can split our sources into two main sections:

  • Calculations
    • What sources we used to perform our calculations
  • Shapefiles
    • What sources we used for our shapefiles

Calculations

To calculate the type of damage to be expected after a specific earthquake magnitude, we used:

Shapefiles

The feature layers (points and shapes) shown on our map are shapefiles retrieved from:

The Fragility Function Viewer

The Fragility Function Viewer is a web-based user interface for the Cascadia Lifelines Program (CLiP) Fragility Function Database. This database consists of fragility functions (or fragility curves), which give the conditional probability of failure for a component or a system. Here, "fragility" means "the quality of being easily broken or damaged".

Please refer to the Official User Guide for more information about the Fragility Function Viewer.

Fragility Database

The data contained in the Fragility Database is the foundation for our project's calculations. To view this data in a csv or Excel format, find it in our Fragility Database Google Drive folder.

image
Figure 1: The Fragility Database

Note: The Fragility Database contains data from regions around the world. For this project, we focused only on the regions that include the USA.

Fragility Functions

Each row in the Fragility Database is a unique fragility function. The fields and their values describe what kind of fragility function it is, and how it can be used. The following is a screenshot taken from "Defining Appropriate Fragility Functions for Oregon", the official report for the Fragility Database, which highlights the fields that were essential to our calculations:

image
Figure 2: Fragility Database fields

Fragility Probabilistic Distribution

The fragility probabilistic distribution field identifies the type of probability distribution each fragility function follows. A probability distribution is a mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment[1]. In our case, the experiment is a disaster event, i.e., an earthquake, and the outcomes are the possible damage states an infrastructure will sustain, i.e., its fragility.

There are four different types of fragility probabilistic distribution functions in the Fragility Database:

  • Normal
  • Lognormal
  • Discrete
  • Other

These functions return an exceedance probability (EP) value, which is the probability that a specific damage state occurs in the event of a disaster. Plotting these values across a range of intensity measure values produces a curve, hence the term "fragility curve".

image
Figure 3: Lognormal fragility curve graph for damage states slight, moderate, extensive, and complete

Normal

For our normal fragility probabilistic distribution functions, the following Python function is used:

import math

# Returns a floating-point number
def normal_cdf(IM, med, std) -> float:
    return (1 + math.erf((IM - float(med)) / math.sqrt(2) / float(std))) / 2

This function comes from "Probability Distributions with Python: Discrete & Continuous", a Medium post by Paul Apivat that goes over distribution functions.
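As a reading aid, the expression that normal_cdf evaluates is the closed form of the normal cumulative distribution function, written here with this page's parameter names (IM, med, std):

```latex
F(\mathrm{IM}) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{\mathrm{IM} - \mathrm{med}}{\mathrm{std}\,\sqrt{2}}\right)\right]
```

For a normal distribution the median and the mean coincide, so med plays the role of the mean in this formula.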

Lognormal

For our lognormal fragility probabilistic distribution functions, the following Python function is used:

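from math import exp, log
from scipy.stats import lognorm

# Returns a floating-point number; the scale, exp(log(med)), is simply the median
def lognormal_cdf(IM, med, std) -> float:
    return lognorm.cdf(IM, float(std), scale = exp(log(float(med))))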

This function uses lognorm from SciPy; see the scipy.stats.lognorm page of the SciPy API reference.
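To make the parameterization concrete, the same value can be sketched with the standard library alone. This assumes SciPy's convention, in which the shape argument is the standard deviation of ln(IM) and the scale is the median. The name lognormal_cdf_stdlib is illustrative and not part of the project code.

```python
import math


def lognormal_cdf_stdlib(IM: float, med: float, std: float) -> float:
    """Lognormal CDF written as a normal CDF of ln(IM / med) / std.

    Hypothetical helper, not part of the project code.
    """
    if IM <= 0:
        # A lognormal variable is strictly positive, so the CDF is 0 here.
        return 0.0
    z = math.log(IM / float(med)) / float(std)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# At IM == med the result is 0.5, because med is the median of the distribution.
print(lognormal_cdf_stdlib(0.5, 0.5, 0.3))
```

Writing the function this way shows that a lognormal fragility function is a normal fragility function applied to the logarithm of the intensity measure.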

Discrete

For our discrete fragility probabilistic distribution functions, the following Python function is used:

from scipy import interpolate

# Returns a floating-point number
def discrete(IM, med, std) -> float:
    # First, convert med and std to individual arrays containing floats
    x = [float(num) for num in med.split()]
    y = [float(num) for num in std.split()]

    # If the IM value we choose is less than the first element in the x array (which should be the smallest one by default),
    # then it falls below the known data range.
    # We handle this by clamping the IM value to that first element.
    if IM < x[0]:
        IM = x[0]

    # Use interpolate function to take in float arrays and develop a function.
    # The fill_value is set to “extrapolate”, which means we can calculate points outside the data range.
    linearInterpolation = interpolate.interp1d(x, y, fill_value='extrapolate')

    # Utilize this interpolation function with the IM as an argument to calculate the EP value,
    # then convert the 0-d NumPy array it returns to a float
    return float(linearInterpolation(IM))

For discrete fragility probabilistic distribution functions in the Fragility Database, the med and std parameters are strings of space-separated values in increasing order, e.g.:

med = "0.15 0.25 0.35 0.45 0.55 0.65 0.75 0.85 0.95 1.05 1.25"
std = "0.083 0.125 0.177 0.266 0.369 0.479 0.599 0.698 0.786 0.864 0.937"

Unlike our normal and lognormal functions, our discrete fragility functions already have known EP values: they are the values in the std parameter. The med parameter contains the corresponding known IM values. Both parameters contain the same number of values, listed in increasing order. Plotting the std EP values on the Y-axis against the med IM values on the X-axis gives our fragility curve.

image
Figure 4: Discrete fragility curve graph for damage state minor, and a point selected

However, what if we want to calculate an EP value for an IM value that is not listed in the med parameter? In this case, the interpolate.interp1d function (see the scipy.interpolate.interp1d page of the SciPy API reference) lets us estimate the EP values we don't know using linear interpolation.
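The interpolation step can be illustrated without SciPy. The sketch below is a hypothetical stand-in for interp1d, not the project's code: it clamps IM values below the first known point and extends the last segment for values above the final one, which is assumed to mirror the behavior described above.

```python
def interpolate_ep(IM: float, med: str, std: str) -> float:
    """Linearly interpolate an EP value from space-separated IM/EP strings.

    Hypothetical helper: clamps below the first known IM value and
    extrapolates along the final segment above the last one.
    """
    xs = [float(v) for v in med.split()]
    ys = [float(v) for v in std.split()]
    IM = max(IM, xs[0])
    # Find the segment [xs[i], xs[i + 1]] that contains IM (last segment if beyond).
    i = 0
    while i < len(xs) - 2 and IM > xs[i + 1]:
        i += 1
    slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
    return ys[i] + slope * (IM - xs[i])


med = "0.15 0.25 0.35 0.45 0.55 0.65 0.75 0.85 0.95 1.05 1.25"
std = "0.083 0.125 0.177 0.266 0.369 0.479 0.599 0.698 0.786 0.864 0.937"

# IM = 0.30 lies halfway between the known points (0.25, 0.125) and (0.35, 0.177),
# so the interpolated EP is approximately 0.151.
print(interpolate_ep(0.30, med, std))
```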

Other

For our other fragility probabilistic distribution functions, the following Python function is used:

def polynomial(IM, polynomial, IMDesc, D = 1) -> float:
    # Cast D parameter to string and wrap it in parentheses.
    # Some formulas use log, therefore, we want to make sure the D parameter is wrapped in parentheses.
    D = "(" + str(D) + ")"

    # Store the new string with the updated IM value in a temporary variable
    temp = str(polynomial).replace(IMDesc, str(IM)).replace("D", D).replace("^", "**")
    # Evaluate and return the calculated value.
    return eval(temp) 

In the Fragility Database, any fragility function that is not normal, lognormal, or discrete, is a polynomial function. These functions already exist in the Fragility Database as strings under the Fragility_polynomial field.

image
Figure 5: Fragility_polynomial field

Polynomial fragility functions were unused in this project because they apply only to buried pipelines, and we didn't have any shapefiles for them. However, the Python function is set up to work if they are added to this project.

We take the polynomial value (which is a string) as a parameter. It will contain one or both of the following:

  • IM Description (e.g., PGA, PGV, etc.)
  • D (Buried pipeline diameter)

Using the .replace function, we replace these variables with the actual values we want to plug into the polynomial. Then, we use the eval function to evaluate the resulting string as Python code, which gives us the value we want.
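The substitute-then-evaluate idea can be sketched as follows. The formula string, the function name evaluate_repair_rate, and the whitelist of math helpers are all assumptions for illustration; real formula strings live in the Fragility_polynomial field. Limiting the names eval can see reduces what an expression can reach, although it is not a complete sandbox.

```python
import math


def evaluate_repair_rate(formula: str, im_name: str, im_value: float, diameter: float = 1.0) -> float:
    """Substitute values into a formula string and evaluate it.

    Hypothetical variant of the polynomial function that restricts the
    names available to eval to a small set of math helpers.
    """
    expression = (
        formula.replace(im_name, f"({im_value})")
        .replace("D", f"({diameter})")
        .replace("^", "**")
    )
    allowed_names = {"log": math.log, "exp": math.exp}
    # An empty __builtins__ hides Python's built-in functions from the expression.
    return float(eval(expression, {"__builtins__": {}}, allowed_names))


# Hypothetical formula string, used only to show the substitution.
print(evaluate_repair_rate("0.0001*PGV^2.25", "PGV", 50.0))
```

Wrapping each substituted value in parentheses keeps negative numbers and logarithm arguments from changing the meaning of the expression.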

Polynomial functions do not return an EP value. Instead, they return a repair rate. The kind of repair rate is given in the Fragility_description field, and there are two kinds:

  • Repairs/1000ft
  • Repairs/Km

image
Figure 6: Polynomial line graph showing Repairs/1000ft repair rate

Parameters

Each fragility probabilistic distribution function takes in one or all of the following three parameters:

  • IM (Intensity Measure)
  • med (Median)
  • std (Standard Deviation)

IM (Intensity Measure)

In the Fragility Database, there are six different intensity measures:

  • PGA (Peak Ground Acceleration)
  • PGD (Peak Ground Displacement)
  • PGV (Peak Ground Velocity)
  • Return Period
  • Sa (Spectral Acceleration)
  • Wind Speed

While our Python functions are set up to accept any intensity measure value, for the scope of this project, we only had access to PGA values.

PGA (Peak Ground Acceleration)

During an earthquake, the ground shakes and experiences acceleration. This acceleration is not a measure of the total energy (magnitude, or size) of an earthquake, but rather of how hard the earth shakes at a given geographic point. Thus, peak ground acceleration is the maximum ground acceleration at these points[2].

OHELP provides access to PGA values in Oregon through four different soil maps, each for a specific earthquake magnitude.

image
Figure 7: OHELP PGA Option

image
Figure 8: 2D 8.1 earthquake magnitude soil map

As seen in Figure 8, each soil map will have a legend that illustrates what PGA values to expect in a specific region on the map.

Median & Standard Deviation

In the Fragility Database, the median and standard deviation values are already given. A single fragility function can have up to five different median and standard deviation values depending on the number of damage states it has.

Damage States

A damage state is the specific state of damage an infrastructure will be in after an earthquake of a specific magnitude. When we calculate an EP value, we are calculating the probability a specific damage state has of occurring. Therefore, given an earthquake magnitude, we can figure out what kind of damage is to be expected on different infrastructures, and the exact probability that kind of damage has of occurring.

In the Fragility Database, a single fragility function can have up to five different damage states. The EP value for each damage state is calculated with the appropriate fragility probabilistic distribution function, using the PGA value at the infrastructure's coordinates as the IM parameter and the damage state's median and standard deviation values as the med and std parameters, respectively. This means that if a single fragility function has five different damage states, five different EP values will be calculated, one for each damage state.

To avoid misinterpretation, we can follow an example on how to calculate the damage states for a single bridge.

Calculating Damage States Example

Let's say we have a bridge in Seaside, Oregon, and we want to calculate its damage states in the event of an earthquake of magnitude 9.0.

image
Figure 9: Satellite image of a bridge in Seaside, OR

  1. First, we need to know what fragility probabilistic distribution function to use. In the Fragility Database, there is a Fragility_no (fragility number) field that acts as an index for every fragility function in the database. For this specific bridge, we are using fragility function number 639.
    image

    Looking at this fragility function in the Fragility Function Viewer, we can see that we'll be using the lognormal fragility probabilistic distribution function.
    image

  2. Next, we prepare our parameters. First is our IM parameter. This value will not be present in the Fragility Database. This value will be the PGA value at this bridge's coordinates in the event of an earthquake magnitude of 9.0. Provided by OHELP, this value is 0.392129.

  3. Then, in the No_of_damage_state field, we see that there are four different damage states for this fragility function.
    image

    In the Fragility Database, these damage states and their respective median and standard deviation values will be separated into different columns. For this fragility function, Damage_state_1 is the Slight damage state, Damage_state_2 is the Moderate damage state, etc.
    image

  4. Now we have everything we need to plug into our Python lognormal function. We can visualize our parameters and values in a table (our PGA value does not change because our bridge's coordinates do not change):

    Damage State   med     std     IM (PGA)
    Slight         0.286   0.132   0.392129
    Moderate       0.537   0.212   0.392129
    Extensive      0.71    0.312   0.392129
    Complete       1.167   0.622   0.392129

  5. Lastly, we use our lognormal Python function to calculate and print the EP value for each damage state.

    from math import exp, log
    from scipy.stats import lognorm

    # --------------------------------------------------------------------
    # Lognormal Fragility Probabilistic Distribution Function 
    # --------------------------------------------------------------------
    def lognormal_cdf(IM, med, std) -> float:
        return lognorm.cdf(IM, float(std), scale = exp(log(float(med))))
    
    # PGA value at bridge's coordinates in the event of an earthquake
    # magnitude of 9.0.
    IM = 0.392129
    
    # Calculate Slight Damage
    print(lognormal_cdf(IM, 0.286, 0.132))
    
    # Calculate Moderate Damage
    print(lognormal_cdf(IM, 0.537, 0.212))
    
    # Calculate Extensive Damage
    print(lognormal_cdf(IM, 0.71, 0.312))
    
    # Calculate Complete Damage
    print(lognormal_cdf(IM, 1.167, 0.622))

    Running this code, we end up with the following EP values (rounded to four decimal places):

    Damage State   EP
    Slight         0.9916
    Moderate       0.0690
    Extensive      0.0285
    Complete       0.0398

    Each EP value is the probability of that specific damage occurring.

    Thus, in the event of an earthquake magnitude of 9.0, this specific bridge in Seaside, OR has a:

    • 99% chance of being slightly damaged,
    • 7% chance of being moderately damaged,
    • 3% chance of being extensively damaged,
    • 4% chance of being completely damaged.
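The same calculation can also be written as a loop over the damage states. This is a minimal sketch using the med and std values from the parameter table in step 4. The helper lognormal_ep and the dictionary layout are illustrative assumptions, and the helper uses only the standard library, so it runs without SciPy.

```python
import math


def lognormal_ep(IM: float, med: float, std: float) -> float:
    """Standard-library lognormal CDF (hypothetical helper, no SciPy needed)."""
    z = math.log(IM / med) / std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# (med, std) pairs copied from the parameter table in step 4.
damage_states = {
    "Slight": (0.286, 0.132),
    "Moderate": (0.537, 0.212),
    "Extensive": (0.71, 0.312),
    "Complete": (1.167, 0.622),
}

IM = 0.392129  # PGA at the bridge's coordinates for earthquake magnitude 9.0

ep_values = {
    name: lognormal_ep(IM, med, std) for name, (med, std) in damage_states.items()
}
for name, ep in ep_values.items():
    print(f"{name}: {ep:.4f}")
```

Keeping the parameters in one structure means each (med, std) pair is typed once, which avoids copying a value into the wrong call.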

OHELP

The Oregon Hazard Explorer for Lifelines Program (OHELP) is a web-GIS tool to aid engineers in acquiring most recent available seismic data and assessing earthquake hazards in Oregon[3].

For our project, we used OHELP to:

  • Estimate PGA values for other infrastructures
  • Retrieve shapefiles for bridges in Oregon

Estimating PGA Values

OHELP provides a Bridge Damage Assessment tool that lets us know what kind of damage to expect on bridges after an earthquake of a specific magnitude.

image
Figure 10: OHELP Bridge Damage Assessment Option

image
Figure 11: Damaged bridges in Seaside, OR after earthquake magnitude 9.0

As shown in Figures 10 and 11, in the event of an earthquake magnitude of 9.0, we can expect some bridges in Seaside, OR to be slightly damaged and others to be extensively damaged.

Furthermore, selecting one of these points, we can view its attributes which include the already calculated EP values for each damage state and the PGA value.

image
Figure 12: Bridge attributes showing EP and PGA values

However, OHELP does not provide this damage assessment for other types of infrastructures. Therefore, it will not have any PGA (or any other intensity measure) data for other infrastructures. This is a problem because in order to calculate damage states and their corresponding EP values, every infrastructure will need to have some kind of intensity measure value as a part of its attributes.

We solved this by getting access to the bridges from OHELP. Every bridge has its own PGA values specific to its geographic point. To estimate the PGA values for other infrastructures, we found the bridge nearest to each infrastructure by geographic distance and then transferred that bridge's PGA values to the infrastructure's attributes. While these other infrastructures won't have 100% accurate PGA values, this process can be done programmatically. As a result, we save a lot of time and still have sufficiently accurate PGA values.

To better understand this problem and its solution, we can follow an overview of the process:

Objective

We want to use fragility functions in the Fragility Database to calculate what kind of damage is to be expected on specific infrastructures in the event of an earthquake.

Problem

In order to calculate an infrastructure's damage states and their EP values, we need to know the infrastructure's intensity measure at its geographic point. This intensity measure can be any of the six intensity measures in the Fragility Database. However, the Fragility Database does not specify which intensity measure to use, nor its value. Thus, a damage state cannot be calculated.

Thought Process

OHELP provides a User Manual that, towards the end, includes a tutorial on how to connect to its data server, which gives access to all of its shapefiles and their attribute data.

Connecting to this server using ArcGIS Pro and accessing the bridges, we can see that, for every bridge, there is a PGA value for specific earthquake magnitudes.

image
Figure 13: ArcGIS Pro attribute table with PGA values highlighted

Figure 13 shows the attribute table for feature layer M90_09_Hazus_Bridges_Fragility, where:

  • M90 is an abbreviation for Magnitude 9.0,
  • 09 is the index number for this feature layer on the OHELP server,
  • Hazus_Bridges_Fragility is the type of Bridge Damage Assessment option we chose as shown in Figure 10.

The highlighted PGA values are specific not only to each bridge's geographic point but also to the chosen earthquake magnitude (in this case, 9.0).

From the OHELP server, we can see that there is a feature layer for each earthquake magnitude available.

image
Figure 14: M81-M90 attribute table tabs

There is only data for four different earthquake magnitudes:

  • 8.1
  • 8.4
  • 8.7
  • 9.0

Therefore, there will be four different PGA values for every bridge, one for each earthquake magnitude. We can merge these four different attribute tables to have all four PGA values in only one attribute table.

image
Figure 15: Merged bridge attribute table with all four PGA fields
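The page does not say which tool performed this merge. As a rough illustration of the idea only, the sketch below combines four per-magnitude tables into one row per bridge in plain Python. The bridge IDs, PGA values, and PGA_ field names are made up.

```python
def merge_pga_tables(tables: dict) -> dict:
    """Combine per-magnitude {bridge_id: pga} tables into one row per bridge."""
    merged = {}
    for magnitude, rows in tables.items():
        for bridge_id, pga in rows.items():
            merged.setdefault(bridge_id, {})[f"PGA_{magnitude}"] = pga
    return merged


# Made-up bridge IDs and PGA values, for illustration only.
tables = {
    "M81": {"B1": 0.21, "B2": 0.18},
    "M84": {"B1": 0.27, "B2": 0.23},
    "M87": {"B1": 0.33, "B2": 0.28},
    "M90": {"B1": 0.39, "B2": 0.33},
}
merged = merge_pga_tables(tables)
print(merged["B1"])
```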

The last step in this thought process was the realization that we can simply find the bridge nearest to any infrastructure by geographic distance and then transfer that bridge's PGA values to the infrastructure.
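The nearest-bridge idea can be sketched as follows. This is not the project's script: the haversine formula is an assumed choice of distance measure, and the coordinates, PGA values, and field names (lat, lon, PGA_M81, PGA_M90) are made up for illustration.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_bridge_pga(asset: dict, bridges: list) -> dict:
    """Return the PGA fields of the bridge closest to the asset."""
    nearest = min(
        bridges,
        key=lambda b: haversine_km(asset["lat"], asset["lon"], b["lat"], b["lon"]),
    )
    return {k: v for k, v in nearest.items() if k.startswith("PGA_")}


# Made-up coordinates and PGA values, for illustration only.
bridges = [
    {"lat": 45.99, "lon": -123.92, "PGA_M81": 0.21, "PGA_M90": 0.39},
    {"lat": 46.19, "lon": -123.83, "PGA_M81": 0.18, "PGA_M90": 0.33},
]
school = {"lat": 46.00, "lon": -123.91}
print(nearest_bridge_pga(school, bridges))
```

This brute-force search compares every asset with every bridge. For large datasets a spatial index would be faster, but the result is the same.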

Solution

To actually transfer the PGA values from the Bridges attribute table (as shown in Figure 15) to any infrastructure, we created a copy of the merged table, removed the fields we didn't need, and exported it as an Excel table to the PGA Values folder in this repository.

image
Figure 16: BridgePGAs Excel table

Then, using the read_excel function from the pandas library, we can read the BridgePGAs Excel table within a Python script that also allows us to perform the shortest distance calculation and transfer the right PGA values.

For an explanation on how the calculations are performed, visit the PreparationForDisasterImpact tool page.

Shapefiles

For this project, the points and shapes seen on the map are from the following sources:

OHELP

As stated in the thought process of estimating PGA values, we used the OHELP server to get access to the bridges in Oregon.

image
Figure 17: Bridges in Seaside, OR from OHELP in ArcGIS Pro

Geofabrik

Geofabrik is a service that provides geodata from around the world, to the world. This geodata comes from OpenStreetMap (OSM), which is a map of the world, similar to Google Maps, but with data added and edited by the general public.

For our project, we downloaded OSM data for the state of Oregon.

image
Figure 18: OSM data in ArcGIS Pro

As shown in Figure 18, OSM provides a large variety of data (the icons and colors were added in ArcGIS Pro), ranging from points and shapes of specific buildings like schools and gas stations to land use shapes like parks and industrial areas.

Geofabrik also provides an extremely useful guide on how its data is structured.

ODOT

The Oregon Department of Transportation (ODOT) provides a variety of maps and GIS products to meet the needs of statewide transportation planning, infrastructure, and engineering.

For our project, we used the or_trans_network_public.gdb file from ODOT's GIS Data Transportation Network section. This file geodatabase provides the roads of Oregon.

image
Figure 19: ODOT roads as lines with arrows

ArcGIS Online

To view all of this data in a map using ArcGIS Online, visit our team's GADEP Oregon Coast Data page.

One thing to note is that the data on our ArcGIS Online page differs from the raw data provided by the sources listed above. For example:

  • Some of the OSM data was cut out due to it being too large,
  • Our version of the OSM data will contain a County column that specifies which Oregon county each asset belongs to,
  • Our ODOT roads will contain extra fields. A spatial join analysis was performed to join OSM's version of Oregon's roads, resulting in our ODOT feature layer having fields like maxspeed (which denotes the max speed for a particular road), and oneway (which denotes whether or not a road is a one way).

Citations

  1. Probability distribution
  2. Peak ground acceleration
  3. Cascadia Lifelines Program (CLiP)