Hetnets are networks with multiple types of nodes and edges. This repository creates hetionet v1.0, a hetnet encoding biology, disease, and pharmacology. We created hetionet v1.0 for Rephetio, our project to systematically evaluate why drugs work and to predict new therapeutic uses for existing drugs.
Note: this repository is for building hetionet v1.0. Users interested in downloading and using the completed hetnet should do so from the dhimmel/hetionet repository.
The dependencies are listed in environment.yml. All are available through Anaconda or PyPI, except for hetio.
- precompile.sh executes notebooks that combine multiple resources into a single type of edge. See the contents of compile for more information.
- build.sh builds the hetnet, creates permuted derivatives, and exports the hetnet to Neo4j.
- integrate.ipynb creates the hetnet by integrating data that is stored either in compile or elsewhere on GitHub. All GitHub links use commit hashes to be version specific. The JSON-formatted hetnet is exported to data/hetnet.json.bz2.
- permute.ipynb loads the created hetnet and creates permuted derivatives that preserve node degree but destroy edge specificity. The permuted hetnets are written to data/permuted, but are not uploaded due to file size.
- neo4j-import.ipynb imports the hetnet and its permutations into separate Neo4j instances. These Neo4j instances are not uploaded due to file size and licensing issues. Currently, neo4j-community-2.3.3 is used.
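To illustrate what "preserve node degree but destroy edge specificity" means, here is a minimal sketch of a degree-preserving edge swap on a single edge type. It is not the code in permute.ipynb or in hetio; the function name `permute_edges`, its parameters, and the toy node identifiers are hypothetical and exist only for this example.

```python
import random


def permute_edges(edges, n_swaps, seed=0):
    """Shuffle (source, target) pairs while keeping every node's degree fixed.

    Hypothetical helper for illustration only. Each attempt picks two edges
    (a, b) and (c, d) and rewires them to (a, d) and (c, b). Attempts that
    would create a duplicate edge or a self-loop are skipped, so fewer than
    n_swaps swaps may actually happen.
    """
    rng = random.Random(seed)
    edges = list(edges)
    edge_set = set(edges)
    if len(edge_set) != len(edges):
        raise ValueError("edges must not contain duplicates")
    if len(edges) < 2:
        return edges
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new_1, new_2 = (a, d), (c, b)
        # Skip swaps that would duplicate an existing edge.
        if new_1 in edge_set or new_2 in edge_set:
            continue
        # Skip swaps that would connect a node to itself.
        if a == d or c == b:
            continue
        edge_set.difference_update((edges[i], edges[j]))
        edge_set.update((new_1, new_2))
        edges[i], edges[j] = new_1, new_2
    return edges


# Toy edges between made-up compound and gene identifiers.
toy_edges = [("c1", "g1"), ("c1", "g2"), ("c2", "g2"), ("c3", "g3")]
permuted = permute_edges(toy_edges, n_swaps=50, seed=42)
```

Every source keeps its number of outgoing edges and every target keeps its number of incoming edges, but which specific pairs are connected is randomized.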
- data: the directory containing saved versions of the network.
- data/summary: the directory with tables of network statistics. See the summaries of metanodes and metaedges.
- viz: the directory containing network visualizations. Includes a holistic network view as well as node degree distributions.
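The saved network at data/hetnet.json.bz2 is bzip2-compressed JSON, so it can be inspected with the Python standard library alone. The sketch below assumes only that the file decompresses to valid UTF-8 JSON; the helper name `read_bz2_json` is hypothetical, and nothing is assumed about the keys inside the file.

```python
import bz2
import json


def read_bz2_json(path):
    """Decompress a .json.bz2 file and parse it (hypothetical helper)."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Assumes the script runs from the repository root after the build.
    hetnet = read_bz2_json("data/hetnet.json.bz2")
    # Report the parsed top-level type rather than assuming a layout.
    print(type(hetnet).__name__)
```

For working with the network as a graph rather than as raw JSON, the hetio package listed in the dependencies is the more likely route.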
All original content in this repository is released as CC0. However, the hetnet integrates data from many resources, and users should consider the licensing of each source. We apply a license attribute on a per-node and per-edge basis for sources with defined licenses. Some resources don't provide any license, so for those we've requested permission. More information is available on Thinklab. See licenses/README.md for a table of all resources and their licensing.
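Because the license attribute is attached to individual nodes and edges, a downstream user could keep only the records whose license they are able to comply with. The sketch below assumes each record is a dictionary with the license stored under `record["data"]["license"]`; that layout, the function name `filter_by_license`, and the example license strings are assumptions for illustration and should be checked against the actual export.

```python
def filter_by_license(records, allowed):
    """Return the records whose license is in the allowed set.

    Assumes (unverified) that each record looks like
    {"data": {"license": "<name>"}, ...}. Records without a license
    attribute are dropped rather than treated as permitted.
    """
    kept = []
    for record in records:
        license_name = (record.get("data") or {}).get("license")
        if license_name in allowed:
            kept.append(record)
    return kept


# Made-up records; the license strings are placeholders, not real values.
toy_records = [
    {"kind": "example", "data": {"license": "CC0 1.0"}},
    {"kind": "example", "data": {"license": "CC BY-NC 4.0"}},
    {"kind": "example", "data": {}},
]
open_records = filter_by_license(toy_records, allowed={"CC0 1.0"})
```

Dropping unlicensed records by default is the conservative choice here, since the absence of a license attribute does not imply permission.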