ARTIS HPC

This repository contains the instructions to create the ARTIS High Performance Computer (HPC) on Amazon Web Services (AWS) and run the ARTIS model. There are two scenarios for using this repository:

  1. Setting up a new ARTIS HPC on AWS
    • Follow the instructions in order from top to bottom
  2. Running an Existing ARTIS HPC Setup
    • Skip ahead to the Running Existing ARTIS HPC Setup section if the AWS infrastructure and docker image already exist

All commands will be run in the terminal/command line and are indicated with a "$" before the command or contained within a code block.


Technologies Used

  • Terraform
    • Creates all the AWS infrastructure needed for the ARTIS HPC.
    • Destroys all AWS infrastructure for the ARTIS HPC after the ARTIS model has finished to save on unnecessary costs.
  • Docker
    • Creates a docker image that our HPC jobs will use to run the ARTIS model code.
  • Python
    • Uses the Docker and AWS Python (boto3) clients to:
      • Push all model inputs to AWS S3
      • Build docker image needed to run ARTIS model
      • Push docker image to AWS ECR
      • Submit jobs to AWS Batch
      • Download model outputs from AWS S3 bucket
  • R
    • Main basis of the ARTIS model code
  • AWS (Amazon Web Services)
    • IAM (Identity and Access Management)
      • Manages all AWS users and permissions
    • S3 (Simple Storage Service)
      • Stores all model inputs and outputs outside of the docker container run in the VPC (Virtual Private Cloud)
    • VPC (Virtual Private Cloud)
      • Isolated section of the AWS cloud where all AWS resources are run. It is a virtual network dedicated to your AWS account.
    • ECR (Elastic Container Registry)
      • Stores the docker image artis-image that contains the environment, code, and inputs needed to run the ARTIS model
    • EC2 (Elastic Compute Cloud)
      • Cloud-based service that provides scalable virtual servers (referred to as instances). ARTIS runs on these virtual machines.
    • Batch (Batch Computing Service)
      • Manages and schedules the execution of jobs. It handles the provisioning of the necessary compute resources (like EC2 instances), queues jobs, and dispatches them to the appropriate resources for execution.
    • CloudWatch
      • Monitors and logs all AWS resources. Logs are used to troubleshoot failed jobs.

AWS Batch Based Architecture

Content from AWS Documentation

Basic workflow steps:

  1. User creates a job container (artis-image), uploads the container image to the Amazon Elastic Container Registry, and registers a job definition with AWS Batch.
  2. User submits jobs to a job queue in AWS Batch.
  3. AWS Batch pulls the image from the container registry and processes the jobs in the queue.
  4. Input and output data for each job are stored in an S3 bucket (artis-s3-bucket).
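
For orientation, the upload and job-submission steps above map onto a handful of boto3 calls. The following is a minimal sketch, not the repository's actual scripts; the job queue and job definition names (and the example file) are assumptions, while the bucket name is the default used in this README.

# Minimal sketch of the workflow above using boto3; not the repository's scripts.
import boto3

s3 = boto3.client("s3")
batch = boto3.client("batch")

# Upload one model input to the S3 bucket used by the jobs
s3.upload_file(
    Filename="data_s3_upload/model_inputs/example_input.csv",  # hypothetical file
    Bucket="artis-s3-bucket",
    Key="model_inputs/example_input.csv",
)

# Submit one ARTIS job (e.g. one HS version) to the Batch job queue
response = batch.submit_job(
    jobName="artis-hs96",                  # illustrative name
    jobQueue="artis-job-queue",            # assumed queue name
    jobDefinition="artis-job-definition",  # assumed job definition name
)
print("Submitted job:", response["jobId"])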

Assumptions:

  • An AWS root user has been created (to create an AWS root user, see the AWS documentation)
  • AWS root user has created an admin user group with "AdministratorAccess" permissions
  • AWS root user has created IAM users
  • AWS root user has added IAM users to the admin group
  • AWS IAM users have their AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY

To create an AWS IAM user follow the instructions here: Create AWS IAM user

Note: If you have created ANY AWS resources for ARTIS manually (other than the root and IAM users), please delete these before continuing.

Installations

Homebrew Installation

Note: If you already have Homebrew installed please still confirm by following step 3 below. The command should run without an error message.

  1. Install homebrew, run:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
  2. Close the existing terminal window where the installation command was run and open a new terminal window
  3. Confirm homebrew has been installed
    • Run $brew --version. No error message should appear.

If after homebrew installation you get a message stating brew command not found:

  1. Edit the zsh config file, run $vim ~/.zshrc

  2. Type i to enter insert mode

  3. Copy & paste this line into the file you opened:

export PATH=/opt/homebrew/bin:$PATH

  4. Press Esc to exit insert mode
  5. Type :wq and press Enter to save and quit
  6. Source the new config file, run $source ~/.zshrc

AWS CLI Installation

Following instructions from AWS

Note: If you already have AWS CLI installed please still confirm by following step 3 below. Both instructions should run without an error message.

The following instructions are for MacOS users:

  1. Run $curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
  2. Run $sudo installer -pkg AWSCLIV2.pkg -target /
  3. Confirm AWS CLI has been installed:
    1. Run $which aws
    2. Run $aws --version

Terraform CLI Installation

Note: If you already have homebrew installed please confirm by running $brew --version; no error message should occur.

To install terraform on MacOS we will be using homebrew. If you do not have homebrew installed on your computer, please follow the Homebrew Installation instructions above before continuing.

Based on Terraform CLI installation instructions provided here.

  1. Run $brew tap hashicorp/tap
  2. Run $brew install hashicorp/tap/terraform
  3. Run $brew update
  4. Run $brew upgrade hashicorp/tap/terraform

If this has been unsuccessful, you might need to install the Xcode command line tools:

  1. Run $sudo xcode-select --install

Python Installation

  • install python3 on MacOS: Run $brew install python3
  • check python3 has been installed: Run $python3 --version
  • check pip (the package installer for python) has been installed: Run $pip3 --version (pip3 is included with Homebrew's python3)

Setup Local Python Environment

To prepare for running the ARTIS model on AWS we need to create a virtual environment to run the python scripts in. Note: Please make sure that your terminal is currently in the correct working directory for this project (should end in .../.../artis-hpc)

  1. Run $pwd to confirm you are in the correct working directory
  2. Run $python3 -m venv venv to create a virtual environment
  3. Run $source venv/bin/activate to open virtual environment
  4. Run $pip3 install -r requirements.txt to install all required python modules
  5. Run $pip3 list to check that all python modules have been downloaded. Check that all modules in the requirements.txt file are included.

If an error occurs please follow these instructions:

  1. Upgrade your version of pip, Run $pip install --upgrade pip
  2. Install all required python modules, Run $pip3 install -r requirements.txt
  3. If errors still occur, install each python package in the requirements.txt file individually, Run $pip3 install [PACKAGE NAME], e.g. $pip3 install urllib3.

AWS CLI Setup

  1. Run $export AWS_ACCESS_KEY=[YOUR_AWS_ACCESS_KEY]
    • sets terminal environmental variable. Replace [YOUR_AWS_ACCESS_KEY] with your value
  2. Run $export AWS_SECRET_ACCESS_KEY=[YOUR_AWS_SECRET_ACCESS_KEY]
    • sets terminal environmental variable. Replace [YOUR_AWS_SECRET_ACCESS_KEY] with your value
  3. Run $export AWS_REGION=us-east-1
    • sets terminal environmental variable
  4. Run $aws configure set aws_access_key_id $AWS_ACCESS_KEY
    • writes value to AWS credentials file (~/.aws/credentials)
  5. Run $aws configure set aws_secret_access_key $AWS_SECRET_ACCESS_KEY
    • writes value to AWS credentials file (~/.aws/credentials)
  6. Run $aws configure set region $AWS_REGION
    • writes value to AWS config file (~/.aws/config)

To check set values:

Run $echo $AWS_ACCESS_KEY to display the local environmental variable value set with the export command. Replace the variable name to check other values.

Likewise, run $aws configure get aws_access_key_id to print the corresponding value stored in the AWS credentials file. Replace the variable name to check other values.
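
Optionally, you can also confirm which AWS identity these credentials resolve to from python. A minimal boto3 sketch (not part of the repository's scripts):

# Optional check: confirm which AWS account/identity the configured
# credentials resolve to, using the standard boto3 STS client.
import boto3

sts = boto3.client("sts")
identity = sts.get_caller_identity()
print("Account:", identity["Account"])
print("User ARN:", identity["Arn"])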

Update ARTIS model scripts and model inputs

You will need to transfer ARTIS code and input data (likely from Seafood-Globalization-Lab/artis-model repo) to your local artis-hpc project directory:

  1. Copy 00-aws-hpc-setup.R, 02-artis-pipeline.R, and 03-combine-tables.R scripts to artis-hpc/data_s3_upload/ARTIS_model_code/
  2. Run $export HS_VERSIONS="[HS VERSIONS YOU ARE RUNNING, NO SPACES]" to specify which HS versions to run
    • i.e. $export HS_VERSIONS="02,07,12,17,96" or $export HS_VERSIONS="17"
  3. Run $./create_pipeline_versions.sh to create a new version of 02-artis-pipeline.R and 00-aws-hpc-setup.R in artis-hpc/data_s3_upload/ARTIS_model_code/ for every HS version specified in HS_VERSIONS (see the sketch after this list)
  4. Copy the most up-to-date set of model_inputs to artis-hpc/data_s3_upload/ directory. Retain the folder name model_inputs
  5. Copy the most up-to-date ARTIS R/ package folder to artis-hpc/data_s3_upload/ARTIS_model_code/
  6. Copy the most up-to-date ARTIS R package NAMESPACE file to artis-hpc/data_s3_upload/ARTIS_model_code/
  7. Copy the most up-to-date ARTIS R package DESCRIPTION file to artis-hpc/data_s3_upload/ARTIS_model_code/
  8. Copy the most up-to-date .Renviron file to artis-hpc/data_s3_upload/ARTIS_model_code/ (-AM is this needed?)
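
For orientation, step 3 can be pictured roughly as follows. This is a Python sketch of the behaviour described for create_pipeline_versions.sh (one copy of each template script per HS version); the real script is a shell script, and how it sets the version inside each copy is an assumption.

# Rough Python equivalent of the behaviour described for create_pipeline_versions.sh;
# not the actual script. The in-file version substitution is assumed, not shown.
import os
import shutil

code_dir = "data_s3_upload/ARTIS_model_code"
hs_versions = os.environ.get("HS_VERSIONS", "").split(",")  # e.g. "02,07,12,17,96"

for version in filter(None, hs_versions):
    for template in ("00-aws-hpc-setup", "02-artis-pipeline"):
        src = os.path.join(code_dir, f"{template}.R")
        dst = os.path.join(code_dir, f"{template}_hs{version}.R")
        shutil.copyfile(src, dst)
        # (The real script presumably also sets the HS version inside each copy.)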

If running on a new Apple silicon (arm64) chip:

  1. Copy arm64_venv_requirements.txt file from the root directory to the artis-hpc/docker_image_files_original/
  2. Rename the file artis-hpc/docker_image_files_original/arm64_venv_requirements.txt to artis-hpc/docker_image_files_original/requirements.txt

Setting Up New ARTIS HPC on AWS

The initial_setup.py script automates the setup process for running the ARTIS model on AWS. It begins by configuring the environment based on the specified chip architecture, copying the appropriate Dockerfile to the project root, and embedding AWS credentials. The script then creates the necessary AWS infrastructure using Terraform, uploads model input files to the specified S3 bucket, and builds a Docker image using files from the docker_image_files_original/ directory. This image is uploaded to the AWS ECR repository. Finally, the script submits jobs to AWS Batch for model execution. In case of an error, the script automatically cleans up all created AWS resources.
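
For orientation, the sequence the script automates looks roughly like the sketch below. This is illustrative only, not the actual initial_setup.py: it shows the Terraform and Docker/ECR steps (the S3 upload and Batch submission resemble the boto3 sketch earlier in this README), and any names not taken from this README are assumptions.

# Illustrative outline of the infrastructure and image steps described above;
# not the actual initial_setup.py.
import base64
import subprocess
import boto3
import docker

# 1. Create the AWS infrastructure with Terraform
subprocess.run(["terraform", "init"], check=True)
subprocess.run(["terraform", "apply", "-auto-approve"], check=True)

# 2. Build the docker image (Dockerfile copied to the project root) and push it to ECR
ecr = boto3.client("ecr")
auth = ecr.get_authorization_token()["authorizationData"][0]
username, password = base64.b64decode(auth["authorizationToken"]).decode().split(":", 1)
registry = auth["proxyEndpoint"].removeprefix("https://")

docker_client = docker.from_env()
docker_client.login(username=username, password=password, registry=registry)
docker_client.images.build(path=".", tag=f"{registry}/artis-image:latest")
docker_client.images.push(f"{registry}/artis-image", tag="latest")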

Anytime there are edits or changes to the ARTIS model codebase, there is no need to recreate the docker image; skip to Running an Existing ARTIS HPC Setup.

  1. Open Docker Desktop
  2. Take note of any existing docker images and containers relating to other projects, then:
    • Delete all docker containers relating to ARTIS
    • Delete all docker images relating to ARTIS
  3. Create AWS infrastructure, upload model inputs, and create new ARTIS docker image, Run:
python3 initial_setup.py -chip [YOUR CHIP INFRASTRUCTURE] -aws_access_key [YOUR AWS KEY] -aws_secret_key [YOUR AWS SECRET KEY] -s3 artis-s3-bucket -ecr artis-image

Details:

  • If you are using an Apple Silicon chip (M1, M2, M3, etc.) your chip will be arm64; for Intel chips it will be x86
  • If you have an existing docker image you would like to use include the -di [existing docker image name] flag.
  • Recommendation: the default options will create a docker image called artis-image, so if you want to use the previously created default docker image you would include -di artis-image.
  • Note: The AWS docker image repository and the docker image created with default options both have the name artis-image, however they are two different resources.

Example command:

  • Using credentials stored in local environmental variables set above
  • Using the existing docker image artis-image with the latest tag
python3 initial_setup.py -chip arm64 -aws_access_key $AWS_ACCESS_KEY -aws_secret_key $AWS_SECRET_ACCESS_KEY -s3 artis-s3-bucket -ecr artis-image -di artis-image:latest

Troubleshooting Tip: If terraform states that it created all resources, but you cannot see them when you log into the AWS console to confirm, they have most likely been created under another account. Run $terraform destroy -auto-approve on the command line, then confirm you have followed the AWS CLI setup instructions with the correct set of keys (AWS access key and AWS secret access key).

A successful and complete model run will then proceed to the next step Combine ARTIS model outputs into CSVs

Running Existing ARTIS HPC Setup

Note: These instructions are only applicable if:

  • all AWS infrastructure has been created,
  • the docker image artis-image has been built, and
  • the only changes are to files within model_inputs/* or ARTIS_model_code/*.

  1. Log onto AWS and check the s3 bucket artis-s3-bucket for any contents that need to be saved, then permanently delete all contents of the artis-s3-bucket bucket.
  2. Make sure to put all new scripts or model inputs into the relevant artis-hpc/data_s3_upload/ folders locally.
  3. Run $source venv/bin/activate to open the python environment (make sure the proper requirements were installed here).
  4. Run $python3 s3_upload.py to upload the local contents of data_s3_upload/ to the AWS S3 bucket artis-s3-bucket.
  5. Run $python3 submit_artis_jobs.py to submit batch jobs on AWS (see the sketch below).
    • Loops through the designated HS_VERSIONS and runs the corresponding shell scripts, which source docker_image_artis_pkg_download.R and 02-artis-pipeline_[hs version].R
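
As a rough picture of what step 5 does, the sketch below loops over the HS versions and submits one Batch job per version. It is consistent with the description above but is not the actual submit_artis_jobs.py; the job queue, job definition, and shell script names are assumptions.

# Sketch of the per-HS-version job submission described in step 5; not the
# actual submit_artis_jobs.py. Queue, job definition, and script names are assumptions.
import os
import boto3

batch = boto3.client("batch")
hs_versions = os.environ.get("HS_VERSIONS", "").split(",")

for version in filter(None, hs_versions):
    batch.submit_job(
        jobName=f"artis-hs{version}",
        jobQueue="artis-job-queue",            # assumed queue name
        jobDefinition="artis-job-definition",  # assumed job definition name
        # Each job runs the shell script for its HS version, which sources
        # docker_image_artis_pkg_download.R and 02-artis-pipeline_hs[version].R
        containerOverrides={"command": [f"./run_artis_hs{version}.sh"]},  # hypothetical script name
    )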

A successful and complete model run will then proceed to the next step Combine ARTIS model outputs into CSVs

Combine ARTIS model outputs into CSVs

  • Run $python3 submit_combine_tables_job.py

Download results, Clean up AWS and Docker environments

  1. Run $python3 s3_download.py to download artis-s3-bucket contents to a local artis-hpc/outputs_[RUN YYYY-MM-DD] directory (see the sketch after this list)

  2. Open the Docker Desktop app
    • Delete all containers created
    • Optionally delete the ARTIS images - these can be retained if planning to run the model again

  3. Run $terraform destroy to destroy all AWS resources and dependencies created
    • This is important so you are not charged for maintaining idle resources and storing GBs of data in S3 buckets

  4. Run $deactivate to close the python environment
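
For reference, downloading the bucket contents in step 1 amounts to listing every object in artis-s3-bucket and writing it under a dated local directory. A minimal boto3 sketch (not the actual s3_download.py; the local directory naming is an assumption based on the step above):

# Minimal sketch of downloading the artis-s3-bucket contents into a dated
# local outputs directory; not the actual s3_download.py.
import os
from datetime import date
import boto3

bucket = "artis-s3-bucket"
out_dir = f"outputs_{date.today().isoformat()}"

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket=bucket):
    for obj in page.get("Contents", []):
        local_path = os.path.join(out_dir, obj["Key"])
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        s3.download_file(bucket, obj["Key"], local_path)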

Checks & Troubleshooting

Status of jobs submitted to AWS batch:

  1. Navigate to AWS in your browser and log in to your IAM account.
  2. Use the search bar at the top of the page to search for "batch" and click on the service Batch result.

AWS Search

  1. Under "job queue overview" you will be able to see job statuses and click on number to open details.

AWS Batch > Dashboard

  4. Investigate individual job status and details through the filters (be sure to click "Search").

AWS Batch > Jobs

Troubleshoot failed jobs:

  1. Set "Filter type" to "Status" and "Filter value" to "FAILED" in AWS Batch > Jobs window above. Click "Search" button.
  2. Identify and open relevant failed job by clicking on job name.
  3. Inspect "Details" for Failed job, "Status Reason" is particularly helpful.
  4. Click on "Log stream name" to open CloudWatch logs for the specific job. This displays the code output and error messages. - Note: The "AWS Batch > Jobs > your-job-name" image below shows a common error message ResourceInitializationError: unable to pull secrets or registry auth: [...] when there is an issue initializing the resources required by the AWS Batch job. This is most likely a temporary network issue and can be resolved by re-running the specific job (HS version).

AWS Batch > Jobs > your-job-name

Note: The image above shows a common error message when the model code is unable to find the correct file path.

Check CloudWatch logs for specific job:

  1. Search for "cloudwatch" in search bar and click on the Service CloudWatch
  2. In the left hand nav-bar click on "Logs"" then "Log groups" and next "/aws/batch/job"
  3. Inspect "log streams" for (likely by "last event time") to identify and open correct log.
  4. Inspect messages, output, and errors from running the model code
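
These checks can also be scripted. The sketch below lists FAILED jobs in the Batch queue and prints the most recent CloudWatch log lines for each; the job queue name is an assumption, and this is not part of the repository's scripts.

# Sketch: list FAILED Batch jobs and print the last few CloudWatch log lines
# for each. The job queue name is an assumption; adjust to your setup.
import boto3

batch = boto3.client("batch")
logs = boto3.client("logs")

failed = batch.list_jobs(jobQueue="artis-job-queue", jobStatus="FAILED")
job_ids = [j["jobId"] for j in failed["jobSummaryList"]]

if job_ids:
    for job in batch.describe_jobs(jobs=job_ids)["jobs"]:
        print(job["jobName"], "-", job.get("statusReason", "no status reason"))
        stream = job["container"].get("logStreamName")
        if stream:
            events = logs.get_log_events(
                logGroupName="/aws/batch/job", logStreamName=stream, limit=20
            )
            for event in events["events"]:
                print("   ", event["message"])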

Check for all expected outputs in S3 bucket:

  1. Navigate to the artis-s3-bucket in AWS S3.
  2. Confirm that all expected outputs are present from the submitted ARTIS model jobs.
    • The outputs folder should contain a snet/ subfolder that has each HS version specified in the HS_VERSIONS variable
    • Each HS version folder should contain the applicable years

The expected directory structure is as follows:

aws/artis-s3-bucket/
└── outputs/
    ├── cvxopt_snet/
    │   ├── HS[VERSION]/
    │   │   ├── [YEAR]/
    │   │   │   ├── [RUN YYYY-MM-DD]_all-country-est_[YEAR]_HS[VERSION].RDS
    │   │   │   ├── [RUN YYYY-MM-DD]_all-data-prior-to-solve-country_[YEAR]_HS[VERSION].RData # Might be only file for some year folders depending if all countries solved by quadprog
    │   │   │   ├── [RUN YYYY-MM-DD]_analysis-documentation_countries-with-no-solve-qp-solution_[YEAR]_HS[VERSION].txt
    │   │   │   ├── [RUN YYYY-MM-DD]_country-est_[COUNTRY ISO3C]_[YEAR]_HS[VERSION].RDS
    │   │   │   └── ...  
    │   │   └── [YEAR]/
    │   │       └── ...
    │   └── HS[VERSION]/
    │       └── ...
    ├── quadprog_snet/
    │   ├── HS[VERSION]/
    │   │   ├── [YEAR]/
    │   │   │   ├── [RUN YYYY-MM-DD]_all-country-est_[YEAR]_HS[VERSION].RDS
    │   │   │   ├── [RUN YYYY-MM-DD]_all-data-prior-to-solve-country_[YEAR]_HS[VERSION].RData
    │   │   │   ├── [RUN YYYY-MM-DD]_analysis-documentation_countries-with-no-solve-qp-solution_[YEAR]_HS[VERSION].txt
    │   │   │   ├── [RUN YYYY-MM-DD]_country-est_[COUNTRY ISO3C]_[YEAR]_HS[VERSION].RDS
    │   │   │   └── ...
    │   │   ├── [YEAR]/
    │   │   │   └── ...
    │   │   └── no_solve_countries.csv # Key file to check   
    │   └── HS[VERSION]/
    │       └── ...
    ├── snet/
    │   ├── HS[VERSION]/
    │   │   └── [YEAR]/
    │   │       ├── [RUN YYYY-MM-DD]_S-net_raw_midpoint_[YEAR]_HS[VERSION].csv
    │   │       ├── [RUN YYYY-MM-DD]_all-country-est_[YEAR]_HS[VERSION].RDS
    │   │       ├── [RUN YYYY-MM-DD]_consumption_[YEAR]_HS[VERSION].csv
    │   │       ├── W_long_[YEAR]_HS[VERSION].csv
    │   │       ├── X_long.csv
    │   │       ├── first_dom_exp_midpoint.csv
    │   │       ├── first_error_exp_midpoint.csv
    │   │       ├── first_foreign_exp_midpoint.csv
    │   │       ├── first_unresolved_foreign_exp_midpoint.csv
    │   │       ├── hs_clade_match.csv
    │   │       ├── reweight_W_long_[YEAR]_HS[VERSION].csv
    │   │       ├── reweight_X_long_[YEAR]_HS[VERSION].csv
    │   │       ├── second_dom_exp_midpoint.csv
    │   │       ├── second_error_exp_midpoint.csv
    │   │       ├── second_foreign_exp_midpoint.csv
    │   │       └── second_unresolved_foreign_exp_midpoint.csv
    │   ├── [YEAR]/
    │   │   └── ...
    │   ├── [YEAR]/
    │   │   └── ...
    │   ├── V1_long_HS[VERSION].csv
    │   └── V2_long_HS[VERSION].csv

After submitting submit_combine_tables_job.py to AWS Batch, the artis-s3-bucket should contain the following additional files:

aws/artis-s3-bucket/
└── outputs/
    ├── cvxopt_snet/
    │   └── ...
    ├── quadprog_snet/
    │   └── ...
    ├── snet/
    │   └── ...
    └── artis_outputs/
        ├── consumption_midpoint_all_hs_all_years.csv 
        └── snet_midpoint_all_hs_all_years.csv
    
# "midpoint"" is estimate type and will change if different estimate type is used

Create AWS IAM User

FIXIT: include screenshots for creating an IAM user with the correct admin permissions.

  1. Create an AWS root user
  2. Create an IAM user group
  3. Create an IAM user
  4. Sign in as the IAM user


  5. Get the IAM user access key

Save the access key and secret key in a secure location (i.e. password manager)

Docker container artis-image details

Once the docker image artis-image has been uploaded to AWS ECR, the docker container artis-image will need to import all R scripts and model inputs from the artis-s3-bucket on AWS. Once $python3 submit_artis_jobs.py is run, AWS Batch starts a new job for each HS version specified, each running ARTIS in its own instance of the docker container. Each docker instance imports only the scripts and model inputs for the HS version and years it is running from artis-s3-bucket (this occurs when docker_image_artis_pkg_download.R is sourced in job_shell_scripts/).

Example directory structure within artis-image:

/home/ec2-user/artis/
│
├── clean_fao_prod.csv
├── clean_fao_taxa.csv
├── clean_sau_prod.csv
├── clean_sau_taxa.csv
├── clean_taxa_combined.csv
├── code_max_resolved.csv
├── fao_annual_pop.csv
├── hs-hs-match_HS[VERSION].csv (one file per each HS version)
├── hs-taxa-CF_strict-match_HS[VERSION].csv 
├── hs-taxa-match_HS[VERSION].csv
├── standardized_baci_seafood_hs[VERSION]_y[YEAR]_including_value.csv (one file per HS version/year combination)
├── standardized_baci_seafood_hs[VERSION]_y[YEAR].csv (one file per HS version/year combination)
├── standardized_combined_prod.csv
├── standardized_fao_prod.csv
├── standardized_sau_taxa.csv
│
│(Files pulled from `ARTIS_model_code/` in `artis-s3-bucket`. Folder not retained)
├── 00-aws-hpc-setup_hs[VERSION].R
├── 02-artis-pipeline_hs[VERSION].R
├── 03-combine-tables.R
├── NAMESPACE
├── DESCRIPTION
└── R/
    ├── build_artis_data.R
    ├── calculate_consumption.R
    ├── categorize_hs_to_taxa.R
    ├── classify_prod_dat.R
    ├── clean_fb_slb_synonyms.R
    ├── clean_hs.R
    ├── collect_data.R
    ├── compile_cf.R
    ├── create_export_source_weights.R
    ├── create_reweight_W_long.R
    ├── create_reweight_X_long.R
    ├── create_snet.R
    └── (Add all files)