Introduction

This guide describes how to set up the development environment for the BioData Catalyst (BDC) Data Management Core (DMC) Data Submission Tracker (DST). It covers installing prerequisites, setting up a development environment, building the container, deploying to BioData Catalyst, running tests/linting, and contributing to development.

At this point all installation is managed system-wide. I think this is a poor way to manage a development environment, and I would prefer that we isolate the environment with a tool such as poetry or pyenv. We will try to move towards this once we have an initial working environment.


Contributing to the Project

To contribute to the project, follow the steps outlined in the Setup the Development Environment section to create a local development environment. Once your environment is set up, you can make changes to the codebase and submit pull requests for review. If you encounter any issues during the setup process or while working on the project, please submit an issue and describe the steps you are having trouble with.


Setup the Development Environment

Setup for the development environment requires satisfying some prerequisites and dependencies in order to provide services. First we will need to satisfy the Prerequisites, including Dependencies, Optional Dependencies, and Provision PostgreSQL. Please follow these instructions to properly set up and verify a functioning development environment.

Prerequisites

To build and run the Data Submission Tracker, dependencies need to be installed and the environment needs to be prepared. First we describe the required dependencies. After these are installed, we can set up the appropriate environment to build, run, and test the DST.

Dependencies

The DST dependencies are necessary for building and deployment as well as for development. The primary development environment is currently Ubuntu 22.04, but other environments will be added as requested. Currently, development of the Data Submission Tracker requires the following software tools to be installed on the development system.

  • Docker

See the Install Dependencies section for how to install and set up each of these dependencies.

Optional Dependencies

Because the Data Submission Tracker runs in a Docker container and all the relevant code is run within the container, these additional dependencies are not strictly necessary on the development system. However, for testing and troubleshooting we recommend also installing the following dependencies.

  • Python v3.10.6 or higher
  • Django v4.1.4 - higher version not currently recommended

I recommend using pyenv and venv to manage the python version and virtual environment. For detailed instructions on how I set up my Python development see Python Development Environment.

Provision PostgreSQL

The DST requires an external PostgreSQL database, which the application reaches over the network using the host and port configured in the environment variables. Previously, this project used an Ansible script to set up the project on an existing Google Cloud Platform Compute instance (for the last commit with this provisioning see #last_gcp). The project now uses Docker to set up the PostgreSQL database. This is a more portable solution and allows for easier testing and development.

Repository Setup

With the prerequisites installed, we are now ready to clone and set up the repository for development. Navigate to where you want the Git repository located on your system and clone the repository with the following command.

git clone git@github.com:amc-corey-cox/BDC_Dashboard.git
cd BDC_Dashboard

I recommend setting up pyenv and a virtual environment using venv for development. See Python Development Environment for more information on how I set up my Python development environment. If you are following my recommendations you can set up a local pyenv and virtual environment with the following commands.

pyenv install 3.11.1
pyenv local 3.11.1
python -m venv .venv
source .venv/bin/activate
poetry install

Environment Variables

For development purposes a number of environment variables need to be set. In the api folder create a .env file with the following data.

# Environment variables
# For local development, the api directory should have an .env file with the following:

# Set to True for local dev and False for prod
DEBUG=True

# Set to DEBUG for local dev and INFO for prod
DJANGO_LOG_LEVEL=DEBUG

# The SECRET_KEY generated by Django
SECRET_KEY=

# The Postgres database name
POSTGRES_DB=tickets

# The username of the Postgres User
POSTGRES_USER=bdc_db_user

# A (secure) password for the Postgres User
POSTGRES_PASSWORD=

# The host for Postgres (previously the external IP of the Compute Engine instance; with the Docker setup, the name of the database container)
POSTGRES_HOST=bdc-dashboard-db

# The port for the Postgres Database
POSTGRES_PORT=5432

# The base URL for the Jira API
JIRA_BASE_URL='https://'

# The token for Authorization in the Jira API
JIRA_TOKEN=''

# The ID of the board in Jira where data will be collected
JIRA_BOARD_ID=''

# The project ID for the Jira data
JIRA_PROJECT=''

# The issue type for epic issues in Jira
JIRA_EPIC_ISSUETYPE=10000

You will need to update SECRET_KEY, POSTGRES_PASSWORD, JIRA_BASE_URL, JIRA_TOKEN, JIRA_BOARD_ID, and JIRA_PROJECT with settings appropriate to your configuration. If you don't have a Django SECRET_KEY you can create one with the following command (Django local installation required).

python -c 'from django.core.management.utils import get_random_secret_key; print(get_random_secret_key())'
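If Django is not installed locally, a key can also be generated with the Python standard library alone. The sketch below is an assumption-laden alternative, not the project's method: the function name `make_secret_key` is hypothetical, and the length and character set are chosen to resemble what Django's helper produces.

```python
import secrets

# Assumption: a 50-character key from this character set resembles the output
# of Django's get_random_secret_key(); any long random string also works.
CHARS = "abcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*(-_=+)"


def make_secret_key(length=50):
    """Return a random key built with a cryptographically secure generator."""
    return "".join(secrets.choice(CHARS) for _ in range(length))


if __name__ == "__main__":
    print(make_secret_key())
```

Paste the printed value into SECRET_KEY in the .env file.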

Other settings may need to be adjusted depending on your environment or during deployment.
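Before building the container it can help to confirm that none of the required values were left at their placeholders. The following is a minimal sketch, assuming the .env file uses plain KEY=VALUE lines as shown above. The helper names (`parse_env`, `missing_settings`) are hypothetical and not part of the project.

```python
from pathlib import Path

# Variables the guide says must be updated for your configuration.
REQUIRED = [
    "SECRET_KEY", "POSTGRES_PASSWORD", "JIRA_BASE_URL",
    "JIRA_TOKEN", "JIRA_BOARD_ID", "JIRA_PROJECT",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_settings(text, required=REQUIRED):
    """Return required names that are absent or still hold a placeholder."""
    values = parse_env(text)
    placeholders = {"", "https://"}  # the empty defaults from the template
    return [name for name in required if values.get(name, "") in placeholders]


if __name__ == "__main__":
    env_path = Path("api/.env")  # location described in this section
    if env_path.exists():
        print(missing_settings(env_path.read_text()) or "all required settings are set")
    else:
        print(f"{env_path} not found")
```

Run it from the repository root. An empty result means every required setting has a value, though it does not check that the values are correct.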

Build Tracker Docker Container

With the PostgreSQL database set up and running and the environment variables set, the repository should be ready for development. To test the development environment, we will build the Docker container and access the application.

First, build the Docker container using docker-compose.

docker-compose up --build -d

To access the application, navigate to http://localhost:8000/ in your browser. You should see a login screen for the application with a button for NIH login. Login will not work at this time. To log in to the application, we first need to create a Django superuser. Start by entering the local Docker container shell.

docker exec -it bdc-dashboard-app /bin/bash

Then create a Django superuser on the Docker container.

python manage.py createsuperuser

Create a superuser with your desired credentials, generally your e-mail address and a password.

After creating the superuser you can authenticate on the Django app by navigating to http://localhost:8000/admin. Once authenticated, access the app at http://localhost:8000/ to navigate the full application site. If all of these steps are successful you are now ready to begin development on the DMC Tracker app.
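To confirm from a script that the container is serving requests, a small reachability check can be used. This is a sketch only, assuming the app listens on http://localhost:8000/ as described above. The function name `app_is_up` is hypothetical.

```python
import urllib.error
import urllib.request


def app_is_up(url="http://localhost:8000/", timeout=3.0):
    """Return True if something answers HTTP at the URL, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even though the status was an error code.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    print("app reachable:", app_is_up())
```

A False result right after `docker-compose up` may only mean the container is still starting, so retry after a few seconds before troubleshooting.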


Install Dependencies

Docker

Docker is an application containerization environment that allows software to be built in containers and deployed in different environments. This reduces dependencies and creates a more secure runtime environment by isolating the software from the host architecture. The Docker ecosystem provides both tools to create a container image and an engine to run those images on a target system. For basic build and development you only need the image creation tools. However, for proper testing and to allow access to the software in a development environment, we will install both the image creation tools and the engine.

Uninstall unofficial packages or conflicting dependencies

Some distributions have unofficial Docker packages installed or dependencies that Docker will install separately. We need to uninstall these to prevent conflicts.

for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt remove -y $pkg; done

I had a stub installation to satisfy another package's spurious dependency. Here's how to check if a docker command still exists.

command -v docker

If the command still exists, it is probably a stub. To check, try running it. If there is no output, or a message that it isn't a real docker installation, open the file to inspect it. If it is a stub file, delete it. Replace '~/.local/bin/docker' with the path to your docker stub from the command above.

rm ~/.local/bin/docker

System Requirements

Docker requires a 64-bit kernel (common on modern systems), 4 GB of RAM, and ID mapping in user namespaces to be enabled. Docker Desktop also requires a systemd init system and a desktop environment.

Software Requirements

Docker Desktop requires KVM virtualization support and QEMU version 5.2 or newer. The latest version is recommended.

KVM Support

First check KVM support by loading the module with the following command.

sudo modprobe kvm

Then load the module specific to your system's processor.

sudo modprobe kvm_intel  # Intel processors
sudo modprobe kvm_amd    # AMD processors

If no errors are reported, double-check that the modules are loaded.

lsmod | grep kvm

Output for this command should look similar to that below.

kvm_amd               167936  0
ccp                   126976  1 kvm_amd
kvm                  1089536  1 kvm_amd
irqbypass              16384  1 kvm

Next check ownership of the kvm device.

ls -al /dev/kvm

Add your user to the kvm group in order to access the kvm device.

sudo usermod -aG kvm $USER

You can check to make sure your user was added to the kvm group.

grep kvm /etc/group
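The group check can also be scripted. The sketch below parses text in the /etc/group format (name:password:GID:member,member) and returns the members of one group. The function name `group_members` is hypothetical. Note that /etc/group lists supplementary members only, so a user whose primary group is kvm would not appear.

```python
import os
from pathlib import Path


def group_members(group_file_text, group_name):
    """Return the member list for a group, or None if the group is absent."""
    for line in group_file_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) == 4 and fields[0] == group_name:
            return [user for user in fields[3].split(",") if user]
    return None


if __name__ == "__main__":
    group_file = Path("/etc/group")
    user = os.environ.get("USER", "")
    if group_file.exists():
        members = group_members(group_file.read_text(), "kvm")
        print("kvm members:", members, "| current user listed:", bool(members) and user in members)
```

The same function works for the docker group used later in this guide.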

Update QEMU to latest

Docker recommends updating QEMU to the latest version and requires at least version 5.2. Check your current version of QEMU.

/usr/bin/qemu-system-x86_64 --version

If this gives an error (mine did), you need to install QEMU.

sudo apt install -y qemu-system-x86

Unless you experience problems it is probably best to use the version of QEMU that is installed by your distribution. You can check the version.

kvm --version

For Ubuntu 22.04, the packaged version is 6.2. I'm currently using this for development and will update this file if I have any problems or decide to upgrade. The latest version as of this writing is 8.0.2.
I have also installed some other recommended virtualization packages that may be useful or necessary for running and testing VMs locally.

sudo apt install -y qemu-kvm libvirt-clients libvirt-daemon-system bridge-utils virtinst libvirt-daemon

These may not be required and could even create conflicts but all the information I found on installing QEMU suggested installing these as well. These sources also recommend enabling libvirtd.

sudo systemctl enable --now libvirtd

Sources also recommended installing virt-manager but I'll be using Docker Desktop to manage VMs so I'm skipping this for now.
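The QEMU version requirement from this section (5.2 or newer) can be checked in a script. This is a sketch under assumptions: it expects the first line of the `--version` output to contain the word "version" followed by a dotted number, and the helper names are hypothetical.

```python
import re
import subprocess

MINIMUM = (5, 2)  # minimum QEMU version stated in this section


def parse_qemu_version(output):
    """Extract (major, minor) from --version output, or None if not found."""
    match = re.search(r"version (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def meets_minimum(version, minimum=MINIMUM):
    """True when a parsed version is at least the minimum."""
    return version is not None and version >= minimum


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["qemu-system-x86_64", "--version"],
            capture_output=True, text=True, check=False,
        )
        version = parse_qemu_version(result.stdout)
    except FileNotFoundError:
        version = None  # QEMU is not installed or not on PATH
    print("QEMU version:", version, "| meets minimum:", meets_minimum(version))
```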

Install Docker

You can install Docker by downloading the package from the Docker Linux Install page. I prefer to manage my installation with apt.

Prepare for installation

Make sure everything is up to date and install the packages that allow apt to use a repository over HTTPS.

sudo apt update
sudo apt install -y ca-certificates curl gnupg

Add Docker GPG Key

We need the Docker official GPG public key to use their apt repository.

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

Set up the Docker Apt Repository

This will set up the Docker Apt Repository allowing ongoing updates of Docker using the system software updater.

echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

Update Apt to fetch Docker Repository

We need to update the Apt cache in order to install from the Docker repository.

sudo apt update

This should show a line accessing download.docker.com for the system's installed release.

Install Docker and tools

Now we can install Docker Engine, containerd, and Docker Compose. This will install the latest version, which is currently version 24.0.2.

sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose docker-compose-plugin

If you need to install a different version, see the Docker Engine installation documentation.

Now let's test the docker installation.

sudo docker run hello-world

Fix Docker permissions

By default, Docker commands must be run as root (with sudo). To run docker commands as your own user, add your user to the docker group. Be aware that membership in the docker group effectively grants root-level privileges on the host, so only add trusted users.

sudo usermod -aG docker ${USER}

You will then need to log in again or run the command below to gain the group permissions.

newgrp docker

Install Docker Desktop

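Docker Desktop is a GUI for managing Docker containers and VMs. Before installing it, install gnome-terminal (needed on non-GNOME desktop environments) and remove any previous Docker Desktop installation.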

sudo apt install -y gnome-terminal
sudo apt remove docker-desktop
rm -r $HOME/.docker/desktop
sudo rm /usr/local/bin/com.docker.cli
sudo apt purge docker-desktop

Download the latest version of Docker Desktop from the Docker Desktop page. The latest version as of this writing is 4.1.1.

sudo apt update
sudo apt install -y ./docker-desktop-<version>-<arch>.deb

Python Development Environment

The DST is written in Python and uses the Django framework. This section will cover setting up a Python development environment for the DST. Because of the complexity of this project, I recommend setting up pyenv, venv, and poetry to manage the Python environment. This will also allow you to install the exact versions of Python and Python packages required for the DST.

pyenv

pyenv is a Python version manager. It allows you to install and manage multiple versions of Python on the same system, including the exact version required for a project. This lets you develop with the same version of Python that the DST will use in production.

First, we need to install the Python build dependencies.

sudo apt update; sudo apt install -y build-essential libssl-dev zlib1g-dev \
libbz2-dev libreadline-dev libsqlite3-dev curl \
libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev

Now we can install pyenv. This will install pyenv to the ~/.pyenv directory.

curl https://pyenv.run | bash

Now we need to add pyenv to the PATH. Add the following to the end of your ~/.bashrc file.

export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init --path)"

Now we can install the version of Python required for the DST.

pyenv install 3.11.1

To pin this Python version for your repository, change to the repository root (cd /path/to/repository) and run the following command.

pyenv local 3.11.1

venv

venv is a Python module that allows you to create virtual environments for Python. This allows you to install Python packages for a specific project without affecting the system Python installation. This is useful for isolating the Python environment for the DST.

To create a virtual environment for the DST run the following command in the root of your repository.

python -m venv .venv

I create a file named venv_name.txt in the root of the venv directory. This file contains the name of the virtual environment. I use this in my .bashrc to read the name and track which virtual environment I have active in $PS1. This is optional but I find it useful. If you would like to know how to do this please reach out and I'll share my .bashrc.

echo "dst" > .venv/venv_name.txt

To activate the virtual environment run the following command in the root of your repository.

source .venv/bin/activate

To deactivate the virtual environment, run the following command from any directory.

deactivate
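To confirm that the virtual environment is active and that the interpreter is recent enough, a short check can be run with `python`. This is a sketch. The function names are hypothetical, and the 3.10.6 minimum is taken from the Optional Dependencies list above.

```python
import sys


def in_virtualenv():
    """True when running inside a venv: the prefixes differ in that case."""
    return sys.prefix != sys.base_prefix


def version_at_least(required, actual=None):
    """Compare version tuples; defaults to the running interpreter."""
    actual = tuple(sys.version_info[:3]) if actual is None else tuple(actual)
    return actual >= tuple(required)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("Python >= 3.10.6:", version_at_least((3, 10, 6)))
```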

Poetry

Poetry is a Python dependency manager. It allows you to manage Python dependencies for a project. Poetry will also create a virtual environment for the project and install the dependencies in that environment. This allows you to install the exact versions of dependencies required for the DST.

Install poetry with pip in the virtual environment then initialize with the following commands.

pip install poetry
poetry init

Set up the

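Strip the environment markers and extras from api/requirements.txt and add the packages to the Poetry virtual environment with the following commands. The two packages that need extras are then added back explicitly.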

poetry add $(sed -E 's/;.*$//; s/\[.*\]//g' api/requirements.txt)
poetry add --extras "grpc" google-api-core
poetry add --extras "crypto" pyjwt
poetry install
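To see what the sed expression does to each requirement line, here is a Python equivalent of its two substitutions. It is a sketch for illustration only. The function name and the sample requirement lines (including version numbers) are hypothetical and not taken from api/requirements.txt.

```python
import re


def strip_requirement(line):
    """Mirror the sed command: drop markers after ';' and bracketed extras."""
    line = re.sub(r";.*$", "", line)    # s/;.*$//   removes environment markers
    line = re.sub(r"\[.*\]", "", line)  # s/\[.*\]//g removes extras like [grpc]
    return line.strip()


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    samples = [
        "google-api-core[grpc]==2.11.0 ; python_version >= '3.7'",
        "pyjwt[crypto]==2.6.0",
        "django==4.1.4",
    ]
    for sample in samples:
        print(f"{sample!r} -> {strip_requirement(sample)!r}")
```

Because the extras are removed, packages such as google-api-core and pyjwt need the separate `poetry add --extras` commands shown above.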