Skip to content

tomay/model-manager

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

60 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Model-manager

Purpose

Rails app that uses a rake task to produce MaxEnt models from species occurrence data hosted on the REBIOMA data portal: http://data.rebioma.net

Model specification

One of the key features of the REBIOMA web portal is its support of species distribution models, created from taxonomically valid and quality reviewed occurrence data. Additional details are in this document. This is a working document that shows details of the first implementation of the modeling routines, and includes modeling features that have been implemented in REBIOMA, and features that have not yet been implemented. The latest version has been updated to create models from IPCC 4th assessment GCM data.

To summarize, species distribution models are produced only from valid, quality reviewed occurrence records. We use MaxEnt to produce continuous distribution models from the occurrence records and environmental data for several eras and scenarios: present (1950-2000), and future (2080-2100 under a2a and b2a future climate “story-lines”). In constructing training data, we correct for both sample and era bias. The models are trained, and results are subjected to statistical validation. Only models that pass this validation step are projected to the eras and scenarios listed above.

The main steps in the modeling process are as follows:

  1. Setup. From all occurrences, select species with at least 8 reliably reviewed, non-duplicate records. For each species occurrence, extract and save appropriate climate and forest cover values. Create background sample files for each species scenario consisting of 10,000 points, with the following constraints: Select 10,000 points randomly from within sampled cells (i.e. a “mask” of all sampled cells for all species) to correct for spatial sampling bias. Selecting background data in the same proportion per era as it was collected to correct for temporal sample bias.

  2. Model training. The next step is to run Maxent with replicates. We do not project the models at this stage. Run either 4 or 10 replicates to evaluate model depending on the total number of samples. If total samples are > 8 but < 20, run 4 replicates. If samples are >= 20, run 10 replicates. For every species, we now have two potential models, one including climate and forest cover variables, one with climate only.

  3. Model evaluation. For a given species, if the mean training AUC minus standard error (SE) > 0.5 (for the 4 or 10 models produced by cross validation), then we produce (“project”) the full models using all the samples.

  4. Produce full models. If the species passes validation, then produce the model using 100% of the data (no cross-validation). This step produces the lambdas that we will use in projection.

  5. Full set of models to project. For each model, select the correct “lambda” scenario, and project onto the matching environmental layers present (1950-2000) and future climate (2080-2100) for a2a and b2a scenarios.

This results in three projections (distribution models) for each modellable species:

  • Present climate and forest cover

  • Future under a2a scenario and predicted deforestation

  • Future under b2a scenario and predicted deforestation

Notes on the code here

Unfortunately this is not general purpose code, it is highly customized to our purposes and platform for production in this one particular application. Over time, we will try to make this more organized and generic so that it is more easily applicable elsewhere.

Installation

  1. install javascript runtime (e.g. nodejs):

    sudo apt-get install nodejs
  2. install rvm. See this guide:

    \curl -L https://get.rvm.io | bash -s stable --autolibs=3 --rails
    rvm install ruby-2.0.0-p0 # optional
    rvm install ruby-1.9.3 
    rvm --default use ruby-1.9.3
    gem install rails
  3. If all is well try:

    ruby -v # should see $ ruby 2.0.0p0 (2013-02-24 revision 39474) [i686-linux]
    rails -v # should see $ Rails 3.2.13
    
  4. Install git:

    sudo apt-get install git
  5. Change to your directory of choice and clone project:

    git clone https://github.com/tomay/model-manager.git
  6. Change into the “model-manager” directory, and install missing gems with ‘bundle install’

    cd model-manager
    bundle install
    

Configuration

  1. Edit config/database.yml for local dev and prod databases (local user and pass) and adapters (e.g. mysql or pg)

    cp config/database-template.yml config/database.yml
  2. Get environmental layers, save locally and unzip

    wget https://rebioma.googlecode.com/files/ascii_ipcc4.zip
  3. Get marine mask, save locally and unzip

    wget https://rebioma.googlecode.com/files/marine_mask.asc.zip
  4. Setup ascii links to environmental layers, use this file as a template

    wget https://rebioma.googlecode.com/files/setup_links.xlsx
  5. Download MaxEnt (currently using version 3.3.3e)

  6. Update lib/mm_lib/yml/properties.yml for local filenames and other run parameters

    cp lib/mm_lib/yml/properties-template.yml lib/mm_lib/yml/properties.yml

To run model-manager

From terminal

$ rake model_manager:main

An option over SSH with nohup

$ nohup rake model_manager:main & # avoid HUP signal if disconnected

Another option over SSH with screen

$ screen
$ rake model_manager:main
$ Press Ctrl-A then Ctrl-D # this detaches screen session but leaves rake running
$ screen-r # to resume
$ exit # to exit and end screen session

Utilities

To chown files, delete old output, and “copy” (via mount) output to tomcat, configure options in properties.yml, then run the following from terminal:

$ rake model_manager:copy_output

These commands require sudo and will prompt for su pw. Of course these same tasks can also be run from the command line with no need for ruby.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages