
DSSG-EUROPE/wef_oceans


A Fishing Risk Framework from Satellites and Ocean Data

This proof-of-concept system assesses vessels within a risk framework that weighs multiple factors to estimate the likelihood that a vessel has engaged in illegal, unreported, or unregulated (IUU) fishing. The framework combines automatic identification system (AIS) tracking data with satellite imagery. Its indicators include the likelihood that a vessel has previously fished in a marine protected area (MPA) or exclusive economic zone (EEZ), and the intermittency of the vessel's AIS signal.

Vessel risk indicators may be considered individually, weighted according to the user's interests, or combined into a unified vessel risk score. This information is displayed in a front-end web application, which gives governments, NGOs, retailers, and enforcement agencies the information they need to distinguish responsible, legitimate vessels from vessels engaged in IUU fishing. For example, retailers could use it to check the risk score of the vessels that supply their tuna, and enforcement agencies could use it to choose which areas to patrol.
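As a rough illustration of how weighted indicators could be combined into a single score, here is a minimal Python sketch. It assumes each indicator has already been scaled to the range 0 to 1 and uses a simple weighted mean. The indicator names, the weights, and the function `combined_risk_score` are hypothetical and are not taken from the project code.

```python
def combined_risk_score(indicators, weights=None):
    """Weighted mean of risk indicators, each assumed to be scaled to [0, 1].

    `indicators` maps an indicator name to its value for one vessel.
    `weights` maps indicator names to non-negative weights; indicators
    without a weight are ignored. With no weights, all count equally.
    """
    if weights is None:
        weights = {name: 1.0 for name in indicators}
    total_weight = sum(weights.get(name, 0.0) for name in indicators)
    if total_weight <= 0:
        raise ValueError("at least one indicator needs a positive weight")
    weighted_sum = sum(
        value * weights.get(name, 0.0) for name, value in indicators.items()
    )
    return weighted_sum / total_weight


# Hypothetical indicator values for a single vessel.
vessel = {"fished_in_mpa": 0.8, "fished_in_eez": 0.2, "ais_intermittency": 0.5}

# Equal weighting of all indicators.
equal_score = combined_risk_score(vessel)

# A user who cares most about protected areas can weight that indicator higher.
mpa_focused_score = combined_risk_score(
    vessel,
    {"fished_in_mpa": 3.0, "fished_in_eez": 1.0, "ais_intermittency": 1.0},
)
```

Dividing by the total weight keeps the combined score in the same 0 to 1 range as the inputs, so scores remain comparable across different weightings.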

How to run the pipeline

Running this pipeline requires a PostgreSQL database, Anaconda Python 3.4, and R (3.4.1). Pre-processing, feature generation, and modelling were performed in Python, the risk indicators were created in PostgreSQL, and the web application was built in R Shiny. A separate pipeline that intersects AIS tracking data with satellite imagery can be viewed here. Instructions for running the R Shiny app are here.

Requirements

Before running the pipeline, complete the following setup steps:

  • Create database credential files: auth/db_credentials (see auth/db_credentials_example), /auth/database_alchemy.ini (see auth/database_alchemy.dummy) and /auth/database_psycopg2.ini (see auth/database_psycopg2.dummy)
  • Define environment variables: source environment_variables
  • Create conda environment for pipeline: conda env create -f envs/development.yml

1. Download shape files for distance feature generation

This will download shape files of coastlines and locations of ports, to be used for vessel distance calculations in the preprocessing and feature generation steps.

python src/features/ais_distance_calculations.py

2. Data cleaning

This PostgreSQL script removes rows containing nulls, and rows whose unix timestamps or coordinates fall outside the valid range, first from the positional data and then from the static data.

psql -f ./sql_scripts/ais_data_cleaning.sql
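The kind of row-level check this cleaning step describes can be sketched in Python as follows. This is only an illustration of the idea, not a translation of ais_data_cleaning.sql: the column names (`mmsi`, `timestamp`, `longitude`, `latitude`) and the timestamp bounds are assumptions.

```python
def is_valid_position(row, min_ts=0, max_ts=4102444800):
    """Return True if a positional row has no nulls and plausible values.

    `max_ts` defaults to 2100-01-01 UTC, an arbitrary upper bound chosen
    for this sketch; the real pipeline may use different limits.
    """
    required = ("mmsi", "timestamp", "longitude", "latitude")
    # Reject rows with any missing required field.
    if any(row.get(key) is None for key in required):
        return False
    # Reject timestamps outside the accepted unix-time window.
    if not (min_ts <= row["timestamp"] <= max_ts):
        return False
    # Reject coordinates outside the valid longitude/latitude ranges.
    return -180.0 <= row["longitude"] <= 180.0 and -90.0 <= row["latitude"] <= 90.0


# Hypothetical rows: only the first passes every check.
rows = [
    {"mmsi": 111, "timestamp": 1500000000, "longitude": 12.5, "latitude": 41.9},
    {"mmsi": 111, "timestamp": 1500000060, "longitude": 12.5, "latitude": None},
    {"mmsi": 222, "timestamp": 1500000000, "longitude": 200.0, "latitude": 10.0},
    {"mmsi": 333, "timestamp": -5, "longitude": 0.0, "latitude": 0.0},
]
clean_rows = [row for row in rows if is_valid_position(row)]
```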

3. Create schema for modelling

psql -c 'CREATE SCHEMA IF NOT EXISTS ais_is_fishing_model;'

4. Pre-processing and feature generation

This script removes duplicate data and null values and generates additional features, including distance to shore and distance to port for each vessel at each time point, and whether it is nighttime or daytime.

python src/models/is_fishing/preprocess_data.py
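To give a sense of what a distance-to-port feature involves, the sketch below computes the great-circle (haversine) distance from a vessel position to the nearest of a list of ports. It assumes a spherical Earth with a mean radius of 6371 km. The function names and the port coordinates are hypothetical, and preprocess_data.py may compute distances differently.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def distance_to_nearest_port_km(lat, lon, ports):
    """Smallest haversine distance from (lat, lon) to any (lat, lon) port."""
    return min(haversine_km(lat, lon, p_lat, p_lon) for p_lat, p_lon in ports)


# Hypothetical ports given as (latitude, longitude) pairs.
ports = [(0.0, 1.0), (0.0, 5.0)]
nearest_km = distance_to_nearest_port_km(0.0, 0.0, ports)
```

The same helper could be applied to coastline points to approximate distance to shore, at the cost of checking every point for every position; a spatial index would be the usual way to speed that up.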

5. Train model to predict whether a vessel is fishing at each time point

This uses labelled training data to generate a model to predict whether a vessel is fishing at each time point. A random forest model with 450 trees was used and the model output was saved in the models directory.

python src/models/is_fishing/train_is_fishing.py

6. Predict if vessel is fishing

This code reads from the PostgreSQL database in chunks and predicts, for each vessel at each time point, the probability that it is fishing.

python src/models/is_fishing/predict_is_fishing.py
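The chunked read-and-predict pattern can be sketched as below. To keep the example self-contained it uses an in-memory SQLite database from the standard library as a stand-in for PostgreSQL, and a toy scoring function in place of the trained model. The table name, column names, and `toy_fishing_probability` are all hypothetical.

```python
import sqlite3


def iter_chunks(cursor, chunk_size):
    """Yield lists of at most `chunk_size` rows until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield rows


def toy_fishing_probability(speed_knots):
    """Placeholder scorer, NOT the trained model: slower vessels score higher."""
    return max(0.0, min(1.0, 1.0 - speed_knots / 10.0))


# Stand-in database with ten positional rows for one vessel.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE positions (mmsi INTEGER, ts INTEGER, speed REAL)")
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?)",
    [(111, t, float(t)) for t in range(10)],
)

cursor = conn.execute("SELECT mmsi, ts, speed FROM positions ORDER BY ts")
chunk_sizes = []
predictions = []
for chunk in iter_chunks(cursor, chunk_size=4):
    # Only one chunk is held in memory at a time.
    chunk_sizes.append(len(chunk))
    predictions.extend(
        (mmsi, ts, toy_fishing_probability(speed)) for mmsi, ts, speed in chunk
    )
conn.close()
```

Processing fixed-size chunks keeps memory use bounded regardless of table size, which matters when the positional table is too large to load at once.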

7. Create vessel aggregate features

This creates a count of the number of available rows in both AIS static and positional data for each MMSI.

psql -f ./sql_scripts/create_unique_vessel_register.sql
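The aggregation described here amounts to a per-MMSI row count over two tables. A minimal Python equivalent, assuming each row is a dictionary with an `mmsi` key (a hypothetical layout, not the project's schema), might look like this:

```python
from collections import Counter


def rows_per_mmsi(positional_rows, static_rows):
    """Count available positional and static rows for each MMSI."""
    positional = Counter(row["mmsi"] for row in positional_rows)
    static = Counter(row["mmsi"] for row in static_rows)
    # Include every MMSI seen in either table; a missing count is 0.
    return {
        mmsi: {"positional": positional[mmsi], "static": static[mmsi]}
        for mmsi in sorted(set(positional) | set(static))
    }


# Hypothetical input rows.
positional_rows = [{"mmsi": 111}, {"mmsi": 111}, {"mmsi": 222}]
static_rows = [{"mmsi": 111}, {"mmsi": 333}]
register = rows_per_mmsi(positional_rows, static_rows)
```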

8. Create a score of the number of times a vessel was in marine protected areas over a given time period

First, running bash ./sql_scripts/get_wdpa.sh downloads a shapefile of the World Database on Protected Areas (WDPA), then creates the schema and uploads the data to a PostgreSQL instance. Second, using the uploaded table, the marine_protected_areas_within.sql script creates a per-vessel score that accounts for the vessel's presence in MPAs.

psql -f ./sql_scripts/marine_protected_areas_within.sql
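Conceptually, this score depends on testing whether each vessel position falls inside a protected-area polygon. The sketch below shows one common way to do that, the ray-casting point-in-polygon test, and counts how many positions of a track fall inside. It treats longitude and latitude as planar coordinates and ignores polygon holes and the antimeridian, so it is only an approximation; it is not a description of how marine_protected_areas_within.sql works, and the polygon and track are made up.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside `polygon`.

    `polygon` is a list of (lon, lat) vertices; the last vertex is
    implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def mpa_presence_count(track, polygon):
    """Number of (lon, lat) positions in `track` that fall inside `polygon`."""
    return sum(point_in_polygon(lon, lat, polygon) for lon, lat in track)


# Hypothetical square protected area and a three-point vessel track.
mpa = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
track = [(5.0, 5.0), (15.0, 5.0), (1.0, 9.0)]
count_inside = mpa_presence_count(track, mpa)
```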

9. Generate vessel risk indicators

Based on the existing tables, aggregate vessel MMSI indicators are created in this script:

psql -f ./sql_scripts/component_generator.sql

Collaborations

Data providers

Authors

This project was conducted as part of the Data Science for Social Good (DSSG) Europe 2017 fellowship. Further details of the twelve-week summer fellowship can be found here: https://dssg.uchicago.edu/europe/

Data science fellows: Iván Higuera Mendieta, Shubham Tomar, and William Grimes

Project manager: Paul van der Boor

Technical mentor: Jane Zanzig

Acknowledgments

The authors would like to thank Euro Beinat and Nishan Degnarain for having the vision to pursue a data science project for the detection of illegal, unreported, and unregulated fishing vessels. Our weekly calls with Nishan Degnarain and Steven Adler were also instrumental in guiding this project to success.

We also extend our thanks to the following for their input and helpful discussions: Dan Hammer, Gregory Stone, Kristina Boerder, Kyle Brazil, Nathan Miller, and Paul Woods.

License

This project is licensed under the MIT License - see the LICENSE.md file for details
