Setting up JAMO
The jamo directory contains all the necessary files for setting up JAMO at NERSC. Running JAMO at NERSC consists of three main steps:
- Building the JAMO application as a Docker image. The relevant files for this can be found in jamo/docker.
- Instantiating JAMO on NERSC's Spin infrastructure. The necessary files for this can be found in jamo/k8s and jamo/config.
- Running the data transfer service on Perlmutter and the data transfer nodes. The necessary files for this can be found in jamo/dt_service.
Before charging ahead, make sure you have access to Spin, NERSC's container-based platform for service deployment. To get access, you will need to attend a SpinUp Workshop. It will also be useful to get access to NERSC's private container registry.
Before getting started, you will need to set up an account on JGI's Gitlab and set up a personal access token (PAT). You can create an account on JGI's Gitlab by going here and signing in with your LBL LDAP. Once you are signed in, click on your avatar and go to Preferences->Access Tokens->Add Token. Give the token a name and set an expiration date. Under Select scopes, select the box for read_repository.
After checking out this repository, change into jamo/docker. From here, set the USER and PAT environment variables to your JGI Gitlab username and PAT, respectively, and run get_code.sh to retrieve all the necessary code for running JAMO.
USER=<JGI Gitlab Username> PAT=<Gitlab personal access token> bash get_code.sh
Now that you have all the JAMO code, you can build a Docker image for running JAMO at NERSC. You will need to make your image available from Spin at some Docker registry. This tutorial uses NERSC's private registry. See the NERSC documentation for getting access to this registry.
Once you have access, sign in to the registry:
docker login registry.nersc.gov
Now build and push your image to the registry:
docker build -t registry.nersc.gov/<PID>/jamo-service:<TAG> --push .
You will need to set a tag (i.e. <TAG>). Please see the registry for the current set of tags and choose a tag that does not already exist. You will also need to fill in the NERSC project ID (i.e. <PID>).
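As a small guard before building, the image reference can be composed and checked in one place. The following is a sketch, not part of the repository: image_ref is a hypothetical function name, and the check follows Docker's tag grammar (at most 128 characters from letters, digits, underscores, periods, and hyphens, not starting with a period or hyphen).

```shell
# Hypothetical helper (not part of the JAMO repository): compose the image
# reference used by the build command and reject tags Docker would refuse.
image_ref() {
    pid=$1
    tag=$2
    # Docker tags: the first character is a letter, digit, or underscore;
    # the rest may also include '.' and '-'; at most 128 characters in total.
    if ! printf '%s' "$tag" | grep -Eq '^[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}$'; then
        echo "invalid tag: $tag" >&2
        return 1
    fi
    printf 'registry.nersc.gov/%s/jamo-service:%s\n' "$pid" "$tag"
}

# Example with placeholder values; substitute your own project ID and tag.
# docker build -t "$(image_ref <PID> <TAG>)" --push .
```

This only checks the tag's syntax. Whether the tag already exists still has to be checked against the registry itself.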
If you are building your image locally, you will probably need to use docker buildx to build your image for multiple platforms or, at the very least, build for the platform running NERSC Spin (i.e. linux/amd64). Below is an example of how you would build an image for running on Apple silicon and a Linux machine.
docker buildx build --platform linux/amd64,linux/arm64 -t registry.nersc.gov/<PID>/jamo-service:<TAG> --push .
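To decide whether an explicit platform is needed at all, the build host's architecture can be compared with the one Spin runs. This is a minimal sketch: docker_platform is a hypothetical helper, and it assumes the architecture names printed by `uname -m` on common Linux and macOS hosts.

```shell
# Hypothetical helper: map a machine architecture (as printed by `uname -m`)
# to the corresponding Docker platform string.
docker_platform() {
    arch=${1:-$(uname -m)}
    case "$arch" in
        x86_64|amd64)  echo "linux/amd64" ;;
        arm64|aarch64) echo "linux/arm64" ;;
        *) echo "unrecognized architecture: $arch" >&2; return 1 ;;
    esac
}

# Spin runs linux/amd64, so any other build host needs an explicit --platform.
if [ "$(docker_platform)" != "linux/amd64" ]; then
    echo "Build host is not linux/amd64; pass --platform linux/amd64 to docker buildx build."
fi
```

If the image only ever runs on Spin, passing only linux/amd64 to --platform should be sufficient. The multi-platform form above is useful when you also want to run the image on your own machine.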
NERSC's Spin infrastructure uses Rancher, a management and orchestration framework for Kubernetes clusters. The jamo/k8s directory contains the Kubernetes configuration files for setting up all the components necessary for running JAMO. The Rancher Objects list below indicates the YAML file, the type of Kubernetes object to create in Rancher, and a brief description of the object's purpose and/or how to modify the config file for your instance. Each type of object can be found in the Rancher UI at the following location:
- Secret: Storage->Secrets
- ConfigMap: Storage->ConfigMaps
- PersistentVolumeClaim: Storage->PersistentVolumeClaims
- Deployment: Workloads->Deployments
- Ingress: Service Discovery->Ingresses
You will need to update the namespace for all objects. Unless otherwise noted (e.g. Deployments), you will need to modify the YAML path metadata.namespace to reflect the name of your namespace.
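Since the namespace appears in every file, it is easy to miss one. The sketch below lists any namespace line that still holds a different value. stale_namespaces is a hypothetical helper, and it assumes each namespace is written on its own line as `namespace: <value>`.

```shell
# Hypothetical helper: list "namespace:" lines in YAML files under a directory
# whose value differs from the expected namespace.
stale_namespaces() {
    dir=$1
    ns=$2
    grep -rnE --include='*.yaml' '^[[:space:]]*namespace:' "$dir" \
        | grep -vE "namespace:[[:space:]]*[\"']?${ns}[\"']?[[:space:]]*\$" \
        || true
}

# Example (placeholder namespace): prints nothing once every file is updated.
# stale_namespaces jamo/k8s my-namespace
```

Because it matches any line containing a namespace key, this also catches the second namespace entry that the Deployment files carry in their pod template.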
- secrets/jamo-dev-cert.yaml - Secret containing the JAMO service SSL certificate. To create this Secret, you will need to request a certificate for your service. Before you do that, you will need to pick a Fully Qualified Domain Name (FQDN). This is the DNS name you will submit to LBLnet (step 12, ingresses/jamo-ingress.yaml). Once you know what you want your JAMO host (a.k.a. FQDN, DNS name, URL) to be, you can request a certificate through the lab here. To request a certificate, you will need to generate a certificate signing request (CSR). You can ask your favorite AI chatbot how to do this or see instructions here. There are more details on requesting certificates available at the Berkeley Lab Commons. After submitting your request, you should receive an email from Sectigo Certificate Manager with links for downloading your certificate. You will need to download the Certificate (w/ issuer after), PEM encoded option. Create this secret through the Rancher UI by creating a TLS Certificate (i.e. Storage->Secrets->Create->TLS Certificate), and copying the PEM encoded certificate into the Certificate form.
- secrets/google-oauth-secrets.yaml - Secret containing Google OAuth secrets. You can set this up here. JAMO needs this for setting up accounts with Google credentials. You will need to update the following in the YAML file:
  - The OAuth Secrets. This is the base64 encoding of the JSON string of the OAuth secrets you downloaded from Google. You can also create a new secret through the Rancher UI with the name google-oauth-secrets and copy your JSON string in there.
- secrets/sf-api-key.yaml - Secret for connecting to the Superfacility API. You can create one of these here. JAMO needs this for connecting Google credentials to NERSC accounts. You will need to update the following in the YAML file:
  - The Superfacility API key. This is the base64 encoding of the PEM string of the API key. You can also create a new secret through the Rancher UI with the name sf-api-key and copy your PEM string in there.
- secrets/jamo-mongo-pass.yaml - Secret containing a password for MongoDB. This can be anything; you will not need to remember it or write it down anywhere else. You will need to update the following in the YAML file:
  - The MongoDB password (YAML path data.password). This is the base64 encoding of the password. You can also create a new secret through the Rancher UI with the name jamo-mongo-pass.
- secrets/jamo-mysql-pass.yaml - Secret containing a password for MySQL. This can be anything; you will not need to remember it or write it down anywhere else. You will need to update the following in the YAML file:
  - The MySQL password (YAML path data.password). This is the base64 encoding of the password. You can also create a new secret through the Rancher UI with the name jamo-mysql-pass.
- cm/sql-config.yaml - ConfigMap for storing MySQL configurations. The following components in this YAML file will need to be updated:
  - The MySQL user and the LapinPy Core and Tape database names in the initialization SQL statements (YAML path data.init.sql).
- pvc/jamo-mongo.yaml - PersistentVolumeClaim (i.e. storage) for MongoDB to write to. You can also create a new PVC through the Rancher UI with the name jamo-mongo.
- pvc/jamo-mysql.yaml - PersistentVolumeClaim (i.e. storage) for MySQL to write to. You can also create a new PVC through the Rancher UI with the name jamo-mysql.
- deployments/jamo-mongo.yaml - Deployment for MongoDB. The following components in this YAML file will need to be updated:
  - The deployment namespace (YAML paths metadata.namespace and spec.template.metadata.namespace)
  - The MongoDB root username (YAML path spec.template.spec.containers.env.value where spec.template.spec.containers.env.name == MONGO_INITDB_ROOT_USERNAME)
- deployments/jamo-mysql.yaml - Deployment for MySQL. The following components in this YAML file will need to be updated:
  - The deployment namespace (YAML paths metadata.namespace and spec.template.metadata.namespace)
  - The MySQL username (YAML path spec.template.spec.containers.env.value where spec.template.spec.containers.env.name == MYSQL_USER)
  - The MySQL database name (YAML path spec.template.spec.containers.env.value where spec.template.spec.containers.env.name == MYSQL_DATABASE)
- deployments/jamo-app.yaml - Deployment for running the JAMO service. The following components in this YAML file will need to be updated:
  - The deployment namespace (YAML paths metadata.namespace and spec.template.metadata.namespace)
  - The URL of the JAMO image to use (YAML path spec.template.spec.containers.image)
  - The user to run the containers as (YAML path spec.template.spec.containers.securityContext.runAsUser). For Taskforce5's JAMO, we use the t5user collaboration account from the m4521 project.
  - The pod storage (YAML path spec.template.spec.volumes). Update vol-jamo-config and vol-jamo-wd to use paths on the community file system that you have access to.
- ingresses/jamo-ingress.yaml - Ingress for the JAMO service to accept connections. You can create this through the Rancher UI. You will need to create the standard ingress host first. Once you have your ingress, you can request a DNS name from LBLnet, at which point you can update your ingress. Update the following in the YAML file:
  - The ingress name (YAML path metadata.name) and the Spin URL (YAML path spec.rules.host where spec.rules.host == .*svc.spin.nersc.org). The host URL needs to conform to the following pattern: <NAME>.<NAMESPACE>.<INSTANCE>.svc.spin.nersc.org, where NAME is the service name, NAMESPACE is your namespace, and INSTANCE is the Rancher instance (i.e. development or production).
  - The DNS name for your JAMO service (YAML path spec.rules.host). You will need to update this after you get a DNS name from LBLnet. LBLnet will not add a DNS record for a CNAME that does not exist, so you must create the ingress first to get a CNAME (i.e. <NAME>.<NAMESPACE>.<INSTANCE>.svc.spin.nersc.org) for them to point your DNS name at.
  - The TLS certificate (YAML path spec.tls). This must be the TLS Certificate Secret you created in step 1 (secrets/jamo-dev-cert.yaml).
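Two of the values the list above asks for can be produced from the command line: the base64 encoding used in the Secret files, and the Spin host name used in the ingress. The helpers below are a sketch. b64_secret and spin_host are hypothetical names, and the example arguments are placeholders.

```shell
# Hypothetical helper: base64-encode a secret value for a Secret's data field.
# printf avoids the trailing newline that echo would add to the encoded value,
# and tr removes the line wrapping some base64 implementations apply.
b64_secret() {
    printf '%s' "$1" | base64 | tr -d '\n'
    echo
}

# Hypothetical helper: compose the Spin ingress host from its three parts.
spin_host() {
    name=$1
    namespace=$2
    instance=$3
    case "$instance" in
        development|production) ;;
        *) echo "instance must be development or production" >&2; return 1 ;;
    esac
    printf '%s.%s.%s.svc.spin.nersc.org\n' "$name" "$namespace" "$instance"
}

b64_secret 'example-password'              # value for data.password
spin_host jamo my-namespace development    # value for spec.rules.host
```

For the JSON and PEM secrets, the same pipeline can read from a file instead, for example `base64 < file | tr -d '\n'`. If you create the secret through the Rancher UI, you paste the raw string as described above and no manual encoding should be needed.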
Once you have the JAMO service up and running, you need to start the data transfer service (DTS). The DTS does the work of ingesting, backing up, and restoring files.
Scripts for running the DTS are located in jamo/dt_service. Since the majority of the work being done by the DTS is transferring data, it is best practice to run this on the data transfer nodes (DTNs).
The DTNs do not mount Perlmutter Scratch, so we also need to run a service on Perlmutter to ingest data submitted to JAMO from there.
To keep these services alive, we run cron jobs. Perlmutter does not allow direct running of cron jobs. Instead, you need to use scrontab.
- dt_service/run_dt_service_nersc_prod.sh - script for running on DTNs dtn03 and dtn04. This script will check for a running DTS process and start one if no such process exists.
- dt_service/crontab.dtn.sh - crontab script for running run_dt_service_nersc_prod.sh every 2 minutes. This file will need to be modified to point to run_dt_service_nersc_prod.sh and the directory you want to save logs to.
- dt_service/run_dt_service_perlmutter_prod.sh - script for running on Perlmutter. This script will run a DTS process; it does not check for an existing process.
- dt_service/scrontab.perlmutter.sh - scrontab script for running run_dt_service_perlmutter_prod.sh. Submit this to SLURM using scrontab. This file will need to be modified to point to run_dt_service_perlmutter_prod.sh and the directory you want to save logs to. You will need access to the workflow queue to submit this script as is. If you do not have access to the workflow queue, you can submit to the cron queue, but you will have to alter the time request.
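The check-then-start behaviour described for the DTN script can be sketched as follows. This is an illustration under assumptions, not the repository's implementation: ensure_running is a hypothetical helper, and the PID file is one possible way to detect a running process.

```shell
# Hypothetical keep-alive helper: start a command only if the process recorded
# in a PID file is no longer running. The PID file is an assumption of this
# sketch; the repository's script may detect a running DTS differently.
ensure_running() {
    pidfile=$1
    shift
    # kill -0 sends no signal; it only tests whether the process exists.
    if [ -s "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        echo "already running"
        return 0
    fi
    "$@" >/dev/null 2>&1 &
    echo "$!" > "$pidfile"
    echo "started"
}

# A crontab entry that invokes such a check every 2 minutes would look like
# this (both paths are placeholders):
# */2 * * * * /path/to/run_dt_service_nersc_prod.sh >> /path/to/logs/dts.log 2>&1
```

A PID file is only meaningful on the host that wrote it. On a shared file system, the file name would need to include the hostname so that dtn03 and dtn04 do not read each other's state.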