Deployment how tos
(Note: This example is for KG2.10.0, but the steps should be analogous for future KG2 versions.)
Start the `kg2cplover.rtx.ai` EC2 instance and run the following:

```
ssh [email protected]
cd PloverDB/
sudo docker start plovercontainer2.10.0
```
If the above gives an error, instead try this:

```
sudo docker stop plovercontainer2.10.0
sudo docker rm plovercontainer2.10.0
sudo docker run -d --name plovercontainer2.10.0 -p 9990:80 ploverimage2.10.0
```
Wait about 5 minutes for the indexes to finish loading. You can check the logs with:

```
sudo docker logs plovercontainer2.10.0
```
When it's ready, the last few lines of the log should look something like this:

```
2022-03-02 00:25:58,807 INFO: Indexes are fully loaded! Took 5.27 minutes.
WSGI app 0 (mountpoint='') ready in 317 seconds on interpreter 0x pid: 11 (default app)
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 11)
spawned uWSGI worker 1 (pid: 14, cores: 1)
spawned uWSGI worker 2 (pid: 15, cores: 1)
running "unix_signal:15 gracefully_kill_them_all" (master-start)...
```
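If you'd rather not watch the logs by hand, the wait-and-check step can be scripted as a small polling loop. This is only an illustrative sketch, not part of the official deployment scripts; the container name and log phrase come from the example above, and the retry count and sleep interval are arbitrary choices:

```shell
# Return success if the given log text reports that indexes finished loading.
# The phrase matches the "Indexes are fully loaded!" line shown above.
indexes_ready() {
    printf '%s\n' "$1" | grep -q "Indexes are fully loaded!"
}

# Poll the container's logs until the readiness line appears, or give up.
wait_for_plover() {
    container="$1"
    tries="${2:-60}"   # default: 60 tries x 10 s = 10 minutes
    i=0
    while [ "$i" -lt "$tries" ]; do
        if indexes_ready "$(sudo docker logs "$container" 2>&1)"; then
            echo "Plover is ready."
            return 0
        fi
        sleep 10
        i=$((i + 1))
    done
    echo "Timed out waiting for Plover indexes." >&2
    return 1
}

# Usage: wait_for_plover plovercontainer2.10.0
```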
While you're waiting for the above command to finish, you can point the ARAX code to this Plover instead of the ITRB Plover:

- Change all three Plover URLs in the "plover" slot of `RTX/code/config_kg2c.json` to this Plover endpoint (`https://kg2cplover.rtx.ai:9990`)
- Push that change to `master`
- Roll `master` out to the `/kg2` and `/kg2beta` endpoints on arax.ncats.io

At this point, once Plover has finished loading indexes, `/kg2` and `/kg2beta` should be running normally again.
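The config-file step above amounts to a URL substitution, which can be sketched with `sed`. The JSON below is a made-up stand-in for the real `config_kg2c.json` (the actual keys and ITRB URLs may differ), so treat this purely as illustration:

```shell
# Sketch: point every Plover URL in a config file at the self-hosted endpoint.
# The file contents here are hypothetical placeholders, NOT the real config.
config=$(mktemp)
cat > "$config" <<'EOF'
{
  "plover": {
    "prod": "https://kg2cploverdb.example.transltr.io",
    "test": "https://kg2cploverdb-test.example.transltr.io",
    "dev": "https://kg2cploverdb-dev.example.transltr.io"
  }
}
EOF

# Replace any quoted URL containing "plover" with the self-hosted endpoint:
sed -i.bak -E 's|"https://[^"]*plover[^"]*"|"https://kg2cplover.rtx.ai:9990"|g' "$config"
cat "$config"
```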
(Note: This example is for KG2.10.0, but the steps should be analogous for future KG2 versions.)
- Create a new branch in the PloverDB repo for this KG2 version; we'll name ours `kg2.10.0c` for this example:

  ```
  git checkout -b kg2.10.0c
  ```
- Log into `kg2webhost.rtx.ai` using ssh (if you have not done this before, someone with ssh access to that instance will need to add your ssh public key to the `authorized_keys` file for the user `ubuntu` on that system before you can ssh in).
- Copy the KG2c file into the webroot directory, renaming it with the new version number in the process:

  ```
  aws s3 cp s3://rtx-kg2/kg2c_lite.json.gz ./nginx-document-root/kg2c_lite_2.10.0.json.gz
  ```

  where "2.10.0" represents the new KG2c version number that you are aiming to deploy in PloverDB.
- From your laptop, do a test download of KG2c from `kg2webhost.rtx.ai`:

  ```
  curl https://kg2webhost.rtx.ai/kg2c_lite_2.10.0.json.gz -o kg2c_lite_2.10.0.json.gz
  gunzip --list kg2c_lite_2.10.0.json.gz
  ```

  If you get an error, check whether your gzipped file actually contains HTML from a 404 error.
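One way to catch the HTML-instead-of-gzip failure mode mentioned above is to check the file's magic bytes before trusting it. A minimal sketch (the filenames here are throwaway examples, not the real downloads):

```shell
# A gzip file starts with the two magic bytes 0x1f 0x8b; an HTML error page
# from a 404 does not, so this check distinguishes the two cases.
looks_like_gzip() {
    [ "$(head -c 2 "$1" | od -An -tx1 | tr -d ' ')" = "1f8b" ]
}

# Demo with a real gzip file and a fake HTML "download":
echo "test data" | gzip > real.json.gz
echo "<html><body>404 Not Found</body></html>" > fake.json.gz

looks_like_gzip real.json.gz && echo "real.json.gz: valid gzip"
looks_like_gzip fake.json.gz || echo "fake.json.gz: not gzip (maybe an HTML error page)"
```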
The `kg2webhost.rtx.ai` system is a `t2.micro` instance (Ubuntu 20.04 AMI) running in the `us-east-1` AWS region (Virginia), with 200 GiB of EBS storage. Currently, Amy, Sundar, and Steve have RSA public keys installed to be able to ssh in. It has `nginx` installed, and the DocumentRoot directory is `/var/www/kg2webhost` (owned by user `ubuntu`). AWS CLI is installed and configured for user `ubuntu` at `/home/ubuntu/venv/bin/aws` (note: that AWS CLI installation is configured with a default region of `us-west-2`, since that is where our main KG2 S3 bucket is located). Nginx is configured with HTTPS on this instance, with the SSL certificate being managed by certbot. The crontab for renewing the cert is located in `/etc/cron.d/certbot`. Currently, this nginx webserver is only used for hosting `kg2c_lite_2.X.X.json.gz` so that the PloverDB `plover.py` module can `curl` in the file at app start-up.
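For orientation, the setup described above corresponds roughly to an nginx server block like the following. This is a sketch assuming certbot's standard certificate layout, not a copy of the actual config on kg2webhost.rtx.ai:

```nginx
server {
    listen 443 ssl;
    server_name kg2webhost.rtx.ai;

    # DocumentRoot described above, owned by user ubuntu
    root /var/www/kg2webhost;

    # Certificate paths assume certbot's default layout (an assumption)
    ssl_certificate     /etc/letsencrypt/live/kg2webhost.rtx.ai/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/kg2webhost.rtx.ai/privkey.pem;
}
```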
- In `app/config_kg2c.json`:
  - Change `"nodes_file": "https://kg2webhost.rtx.ai/kg2c-2.10.0-v1.0-nodes.jsonl.gz"` to `"https://kg2webhost.rtx.ai/kg2c-2.10.1-v1.0-nodes.jsonl.gz"`, or whatever exactly the new KG2c nodes file that you are hosting on `kg2webhost.rtx.ai` is called.
  - Change `"edges_file": "https://kg2webhost.rtx.ai/kg2c-2.10.0-v1.0-edges.jsonl.gz"` to `"https://kg2webhost.rtx.ai/kg2c-2.10.1-v1.0-edges.jsonl.gz"`, or whatever exactly the new KG2c edges file that you are hosting on `kg2webhost.rtx.ai` is called.
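Since the two edits above only bump a version string, they can be sketched as a `sed` substitution. The sample JSON is reconstructed from the snippet above; double-check the real `app/config_kg2c.json` and the exact filenames you uploaded before relying on anything like this:

```shell
# Sketch: bump the KG2c version in the nodes_file/edges_file URLs.
# The sample file below mimics (but is not) the real app/config_kg2c.json.
old_version="2.10.0"
new_version="2.10.1"

config=$(mktemp)
cat > "$config" <<'EOF'
{
  "nodes_file": "https://kg2webhost.rtx.ai/kg2c-2.10.0-v1.0-nodes.jsonl.gz",
  "edges_file": "https://kg2webhost.rtx.ai/kg2c-2.10.0-v1.0-edges.jsonl.gz"
}
EOF

# Swap the version segment of the filenames in place:
sed -i.bak "s|kg2c-${old_version}-|kg2c-${new_version}-|g" "$config"
cat "$config"
```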
- Commit and push this change to your branch.
- Make any other Plover code changes that this new KG2 version necessitates in your branch (usually only needed if KG2's core schema changed).
Then pick an EC2 instance to serve this new Plover from. Generally we use `kg2cplover.rtx.ai`, but if that instance is already serving a different version of Plover that needs to remain live (i.e., it is being called by one of the RTX-KG2 instances), then you can use `kg2cplover2.rtx.ai`. Note that you can tell which Plover a given KG2 instance is using by running a query in that KG2 UI and looking at the DEBUG log messages.

We usually deploy an updated PloverDB into one of our team's self-hosted PloverDB instances in EC2 first; then, once we have a completely working RTX-KG2 KP and ARAX based on the updated (self-hosted) PloverDB, we will eventually merge the updated PloverDB code into `master`, which will trigger deployment into ITRB CI.
Start the Plover EC2 instance (this example uses `kg2cplover.rtx.ai`) and run the following (with your branch name/version number subbed in):

```
ssh [email protected]
cd PloverDB/
git fetch
git checkout kg2.10.0c
screen
bash -x run.sh ploverimage2.10.0 plovercontainer2.10.0 "sudo docker"
```
The build should take around 50 minutes to finish.
After it's done, verify the new Plover service is working by running the test suite against it. From your own machine (assuming you have cloned the `PloverDB` repo and done `pip install -r requirements.txt`):

```
cd PloverDB/
pytest -v test/test_kg2c.py --endpoint https://kg2cplover.rtx.ai:9990
```
(NOTE: If you loaded Plover onto the `kg2cplover2.rtx.ai` instance, use this endpoint URL instead: `http://kg2cplover2.rtx.ai:9990`)
Note that sometimes tests need to be updated due to changes in the new KG2 version, though the majority of tests should pass. For any failing tests, ensure they're failing due to expected topological changes in the new KG2 version; if so, tweak them to get them passing again (via adjusting pinned curies, predicates, or whatever makes sense).
When we're ready for the ITRB CI Plover instance to be running this new KG2 version, merge your branch into `main`. This should automatically deploy to the ITRB CI Plover (allow about an hour for it to rebuild). Ping Kanna and/or Pouyan in Slack to update the ITRB Test and Prod Plovers.