Install, build, and run a KaBOB instance via Docker
The KaBOB Knowledge Base of Biology is a formal integration of biological knowledge using Semantic Web standards. Its knowledge is grounded in the community-curated Open Biomedical Ontologies, and it uses this ontological foundation to integrate information mined from a collection of biomedical databases with a concerted effort to model biology separate from database content. More information about KaBOB is available at the KaBOB GitHub page. The original publication describing KaBOB in detail is:
KaBOB: Ontology-Based Semantic Integration of Biomedical Databases
Kevin M Livingston, Michael Bada, William A Baumgartner Jr., and Lawrence E Hunter
BMC Bioinformatics. 2015 Apr 23;16:126. doi: 10.1186/s12859-015-0559-3. PubMed ID: 25903923
This project facilitates the installation and construction of a KaBOB instance via Docker.
- The current build procedure requires the use of the AllegroGraph graph database, and thus requires a license for AllegroGraph. Without a license, the default triple limit of AllegroGraph will cause the build to terminate prematurely.
- This project is set up to build an instance of KaBOB based on human data. Future extensions of this project will parameterize the species on which KaBOB instances can be based.
- The scripts in this project assume that the host machine is Unix-based.
- Install Docker on the machine that will host KaBOB.
- Download this repository:
git clone --branch v0.2 https://github.com/bill-baumgartner/kabob.app ./kabob.app.git
- Follow the instructions in kabob.app.git/allegrograph/build/config/user-env.sh.example to create a user-env.sh file with your AllegroGraph license. Place the newly created user-env.sh file in the same directory as the user-env.sh.example file.
At this point, the KaBOB build is ready to proceed via a succession of scripts that call Docker commands. All scripts should be run from the base directory of the project:
cd kabob.app.git
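Before starting the build, it may help to confirm that the setup steps above are complete. The following is a minimal sketch, not part of the project's scripts: the function names `have_command`, `has_user_env`, and `preflight` are hypothetical, and the only paths assumed are the ones named in the setup instructions.

```shell
# Preflight sketch (hypothetical helpers, not shipped with kabob.app).
# Intended to be run from the project base directory (kabob.app.git).

# Returns 0 if the named command is available on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Returns 0 if user-env.sh exists in the config directory that also
# holds user-env.sh.example, relative to the given base directory.
has_user_env() {
    [ -f "$1/allegrograph/build/config/user-env.sh" ]
}

# Reports each missing prerequisite on stderr.
# Returns non-zero if any check fails.
preflight() {
    base_dir="${1:-.}"
    status=0
    if ! have_command docker; then
        echo "docker not found on PATH" >&2
        status=1
    fi
    if ! has_user_env "$base_dir"; then
        echo "missing $base_dir/allegrograph/build/config/user-env.sh" >&2
        status=1
    fi
    return "$status"
}
```

Calling `preflight .` from the base directory prints nothing and returns 0 when both prerequisites are in place. It does not check that the license inside user-env.sh is valid.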
Run: scripts/step1_rdf-gen.sh -k KEY -c n -d DRUGBANK_XML_FILE -p PHARMGKB_RELATIONSHIPS_FILE
where:
- KEY is a user-defined key to uniquely identify the KaBOB build. This key enables multiple KaBOB instances to be run in the same Docker environment. Example keys may be "development" or "production". Keys must not contain whitespace.
- n is the number of Docker containers (1-5) that will be used to generate RDF. n should not exceed the number of cores available on your machine.
- DRUGBANK_XML_FILE is the path to the DrugBank 'full database.xml' file on the local file system. The file can be downloaded from the DrugBank website after creating an account and agreeing to the DrugBank license. This argument is optional; omit it to exclude DrugBank from the KaBOB build.
- PHARMGKB_RELATIONSHIPS_FILE is the path to the PharmGKB relationships file (relationships.tsv) on the local file system. Use of this file requires a PharmGKB license, which can be obtained from PharmGKB. This argument is optional; omit it to exclude the PharmGKB relationships from the KaBOB build.
This step may take >90 min.
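The constraints on KEY and n described above can be checked before launching step 1. The sketch below is an illustration only: `valid_key` and `container_count` are hypothetical helper names, and treating an empty key as invalid is an assumption rather than something the project documents.

```shell
# Hypothetical helpers for preparing step 1 arguments.

# Returns 0 if the key is non-empty and contains no whitespace.
# (Rejecting an empty key is an assumption.)
valid_key() {
    case "$1" in
        ""|*[[:space:]]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Clamps a core count to the 1-5 container range accepted by step 1.
container_count() {
    if [ "$1" -lt 1 ]; then
        echo 1
    elif [ "$1" -gt 5 ]; then
        echo 5
    else
        echo "$1"
    fi
}

# Example use, with the optional DrugBank and PharmGKB arguments omitted.
# nproc is a GNU coreutils command; other systems may need a different
# way to count cores.
#   KEY=development
#   valid_key "$KEY" && \
#       scripts/step1_rdf-gen.sh -k "$KEY" -c "$(container_count "$(nproc)")"
```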
Run: scripts/step2_ag-setup.sh KEY
where:
- KEY is the same user-defined key specified in Build Step 1 above that uniquely identifies the KaBOB build.
At this point, AllegroGraph should be running and its WebView UI should be visible at http://[HOST_URL]:10035, where [HOST_URL] is the URL for the machine hosting KaBOB. Access credentials for logging into AllegroGraph can be found in the user-env.sh file created earlier.
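One way to confirm from the command line that WebView is reachable is a quick HTTP request against the port mentioned above. This is a rough sketch: `webview_url` and `webview_up` are hypothetical names, curl is assumed to be installed on the machine running the check, and the assumption that the root URL answers without an HTTP error status has not been verified against AllegroGraph.

```shell
# Hypothetical helpers for checking that AllegroGraph WebView responds.

# Builds the WebView URL for a host, using port 10035 as stated above.
webview_url() {
    printf 'http://%s:10035\n' "$1"
}

# Returns 0 if the URL answers within 10 seconds without an HTTP error
# status (curl --fail treats responses of 400 and above as failures).
webview_up() {
    curl --silent --fail --max-time 10 --output /dev/null "$(webview_url "$1")"
}

# Example use on the host machine itself:
#   webview_up localhost && echo "WebView is reachable"
```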
Run: scripts/step3_build-kabob.sh KEY
where:
- KEY is the same user-defined key specified in Build Step 1 above that uniquely identifies the KaBOB build.
Building the human KaBOB instance should take ~100 minutes. To follow along via the agraph logs, log in to the agraph container using
docker exec -ti agraph bash
and then view the agraph log output using
tail -f /tmp/agraph_load_check---supervisor-MKGnli.log
(note that the name of the log file may be slightly different).
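Because the log file name can vary, a small helper can locate it instead of typing the name by hand. The sketch below assumes that the file always starts with `agraph_load_check` and ends in `.log`, which is inferred from the single example name above and may not hold for every build. `latest_load_log` is a hypothetical name, and the function is meant to be pasted into the shell inside the agraph container.

```shell
# Hypothetical helper: prints the most recently modified file matching
# agraph_load_check*.log in the given directory (default: /tmp).
# Returns non-zero if no such file exists.
latest_load_log() {
    dir="${1:-/tmp}"
    newest="$(ls -t "$dir"/agraph_load_check*.log 2>/dev/null | head -n 1)"
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Example use inside the container:
#   tail -f "$(latest_load_log)"
```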