AnacletoLAB/RNA-KG

RNA-KG: An ontology-based KG for representing interactions involving RNA molecules

RNA-KG is a knowledge graph encompassing biological knowledge about RNAs gathered from more than 60 public databases. It integrates functional relationships with genes, proteins, and chemicals, as well as ontologically grounded biomedical concepts. RNA-KG can be used both by directly exploring and visualizing the KG and by applying computational methods to analyze and infer biomedical knowledge. RNA-KG is constantly maintained and updated with new experimental data.

Metagraph

What Does This Repository Provide?

Notebooks and pointers to (processed) data and ontologies to build the current release of RNA-KG.

Releases



Generate RNA-KG current release

Download Data

RNA-KG is built and maintained using PheKnowLator. PheKnowLator requires three documents within the resources directory to run successfully. Please make sure the documents listed below are present in the specified location prior to constructing RNA-KG. They can all be accessed at the following link: https://doi.org/10.5281/zenodo.10078877.

To generate these data yourself, please see the RNA-KG_Preparation.ipynb and inteRNA-KG_Preparation.ipynb Jupyter Notebooks.
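
Before launching the build, it can help to confirm that the required documents are actually in place. The sketch below is only an illustration: the three file names are an assumption taken from PheKnowLator's own documentation (this README does not name them), and `missing_documents` is a hypothetical helper, not part of PheKnowLator or RNA-KG.

```python
from pathlib import Path

# ASSUMPTION: these are the three input documents described in PheKnowLator's
# documentation; this README does not list them, so adjust the names as needed.
REQUIRED_DOCUMENTS = (
    "resource_info.txt",
    "ontology_source_list.txt",
    "edge_source_list.txt",
)


def missing_documents(resources_dir, required=REQUIRED_DOCUMENTS):
    """Return the required file names that are absent from resources_dir."""
    root = Path(resources_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_documents("./resources")
    if missing:
        print("Missing from ./resources:", ", ".join(missing))
    else:
        print("All required documents found.")
```

An empty list means every expected file was found. Any name that is returned should be downloaded from the Zenodo record above or regenerated with the preparation notebooks.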

Construct RNA-KG

The current release of RNA-KG can be generated with the provided main.ipynb Jupyter Notebook. The PheKnowLator KG build configuration we adopted is shown below.

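# Imports needed by the build call below (import path as used in PheKnowLator's
# pkt_kg package; adjust it if your installed version differs)
import psutil
from pkt_kg.knowledge_graphs import FullBuild

# PheKnowLator full build: instance-based construction, with inverse relations,
# no node metadata, and OWL decoding (OWL-NETS)
kg = FullBuild(construction='instance',
               node_data='no',
               inverse_relations='yes',
               decode_owl='yes',
               cpus=psutil.cpu_count(logical=True),  # use all logical cores
               write_location='./resources/knowledge_graphs')

# Run the build; outputs are written under write_location
kg.construct_knowledge_graph()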

Get In Touch or Get Involved

Contact Us

Don't hesitate to contact us, especially if you believe a new data source should be integrated into RNA-KG. To get in touch with us, please create an issue or send us an email 📩.

Future work

We are currently working on extending RNA-KG in several directions.

  • Application of Graph Representation Learning methods to analyze RNA-KG.
  • Identification of key node and edge properties associated with RNA molecules and their interactors ➞ ⚠️experimental⚠️ Neo4j endpoint available at http://fievel.anacleto.di.unimi.it:7474 (user: anacleto; password: anacleto). The RNA-KG nodes with their properties are listed in https://RNA-KG.anacleto.di.unimi.it/nodes_with_properties.csv, and the RNA-KG edges with their properties are listed in https://RNA-KG.anacleto.di.unimi.it/edges_with_properties.csv.
  • Development of an RNA Ontology with a particular emphasis on non-coding RNA molecules.
  • Specification of our meta-graph in terms of LinkML ➞ SPIRES engine (OntoGPT).
  • Development of graphical facilities for supporting the user in the data acquisition process and thus reducing the manual effort required for mapping the data available in the different data sources into RNA-KG.
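
For a quick look at the node and edge property files mentioned in the list above, a small preview can be enough before loading anything into a graph database. The following is a minimal sketch using only the Python standard library. It assumes the files are comma-separated, UTF-8 encoded, and start with a header row, none of which is stated in this README. `preview_csv` and `preview_remote_csv` are hypothetical helper names.

```python
import csv
import io
import urllib.request
from itertools import islice

# URL copied from the list above; its availability is not guaranteed.
NODES_URL = "https://RNA-KG.anacleto.di.unimi.it/nodes_with_properties.csv"


def preview_csv(lines, n_rows=5):
    """Return (header, first n_rows data rows) from an iterable of CSV text lines.

    ASSUMPTION: comma-separated values with a header row.
    """
    reader = csv.reader(lines)
    header = next(reader, [])
    return header, list(islice(reader, n_rows))


def preview_remote_csv(url, n_rows=5):
    """Stream a remote CSV and preview it without reading the whole file.

    ASSUMPTION: the file is UTF-8 encoded.
    """
    with urllib.request.urlopen(url) as response:
        text = io.TextIOWrapper(response, encoding="utf-8", newline="")
        return preview_csv(text, n_rows)
```

Calling `preview_remote_csv(NODES_URL)` would return the column names and the first few rows, which is usually enough to decide which properties are worth importing. The same function can be pointed at the edges file.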

Attribution

Licensing

This project is licensed under the Apache License 2.0; see the LICENSE.md file for details.

Citing RNA-KG

If RNA-KG was useful for your research, please cite the following paper:

@article{Cavalleri2024rnakg,
    title="An ontology-based knowledge graph for representing interactions involving RNA molecules",
    author="Emanuele Cavalleri and Alberto Cabri and Mauricio Soto-Gomez and Sara Bonfitto and Paolo Perlasca and Jessica Gliozzo and Tiffany J. Callahan and Justin Reese and Peter N Robinson and Elena Casiraghi and Giorgio Valentini and Marco Mesiti",
    year="2024",
    journal="Sci. Data",
    publisher="Springer Science and Business Media LLC",
    volume=11,
    number=1,
    pages="906",
    month=aug,
    copyright="https://creativecommons.org/licenses/by-nc-nd/4.0",
    language="en"
}