EBI-Metagenomics/mett-dataportal

METT Data Portal

The transversal theme aims to mechanistically understand the complex role that human-associated microbiomes play in human health and disease. Our current knowledge of bacterial gene functions comes primarily from a very small number of model bacteria, which fails to capture the genetic diversity within the gut microbiome. One of the goals of METT is to systematically tackle the vast genetic matter in the gut microbiome and to establish new model microbes. The Flagship Project of METT has focused its efforts on annotating the genomes of Phocaeicola vulgatus and Bacteroides uniformis, two of the most prevalent and abundant bacterial species of the human microbiome.

The current version is a web-based genomic annotation editing platform designed to browse the genomes of the type strains B. uniformis (ATCC8492) and P. vulgatus (ATCC8482). The annotation data generated by METT has been organised in an FTP directory hosted at EBI and contains structural annotations (such as Prokka and mobilome predictions) as well as functional annotations (including biosynthetic gene clusters and carbohydrate-active enzymes).

Data Portal API

Requirements

  • Python Version: This project requires Python 3.12. Please ensure that you have this version installed to avoid compatibility issues.
  • You can download the latest version here.

In Development - Intermediate Stage

Development Environment

Dependencies installation -

pip install -r requirements-dev.txt
pre-commit install

Steps to bring up the local environment

  • Migration files are in the repo. Use python manage.py migrate to set up the tables.
  • Use the import scripts to import the data from the FTP server. Ref: How to import
  • Create indexes for the FASTA and GFF files. Ref: How to generate indexes
  • Run the Django server: python manage.py runserver
  • Run the React app in ./dataportal-app using npm start

Configuration

We use Pydantic to formalise config files.

  • config/local.env is provided as a convenience for setting environment variables.
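To illustrate what an env file such as config/local.env provides, here is a minimal standard-library sketch that reads KEY=VALUE lines and copies them into the process environment. It is not the project's Pydantic-based settings code, which is not shown in this README. The function names and the variable names used in the usage note are hypothetical.

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines, comments and malformed lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path="config/local.env", environ=os.environ):
    """Copy values from the env file into environ, keeping variables that are already set."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values = parse_env_text(env_path.read_text())
    for key, value in values.items():
        environ.setdefault(key, value)
    return values
```

For example, parse_env_text("DB_NAME=mett") returns a dictionary with the single key DB_NAME. Using setdefault means variables exported in the shell take precedence over the file, which is one common convention and an assumption here.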

Import Species, Strains and Annotations

Scripts -

$ python manage.py import_species
$ python manage.py import_strains_contigs --ftp-server "ftp.ebi.ac.uk" --ftp-directory "/pub/databases/mett/all_hd_isolates/deduplicated_assemblies/" --set-type-strains BU_ATCC8492 PV_ATCC8482
$ python manage.py import_annotations --ftp-server ftp.ebi.ac.uk --ftp-directory /pub/databases/mett/annotations/v1_2024-04-15/ 
$ python manage.py import_annotations --ftp-server ftp.ebi.ac.uk --ftp-directory /pub/databases/mett/annotations/v1_2024-04-15/ --isolate BU_CCUG35501
$ python manage.py import_annotations --ftp-server ftp.ebi.ac.uk --ftp-directory /pub/databases/mett/annotations/v1_2024-04-15/ --assembly BU_ATCC8492
$ python manage.py import_essentiality
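The import commands above can also be chained from a small Python wrapper. The sketch below is an assumption-laden convenience, not part of the repository: it assumes the commands should run in the order listed, that manage.py is in the current working directory, and it only runs the full import_annotations variant (the --isolate and --assembly variants are left out). The function names build_import_commands and run_imports are hypothetical.

```python
import subprocess
import sys

# Values copied from the commands listed above.
FTP_SERVER = "ftp.ebi.ac.uk"
ASSEMBLIES_DIR = "/pub/databases/mett/all_hd_isolates/deduplicated_assemblies/"
ANNOTATIONS_DIR = "/pub/databases/mett/annotations/v1_2024-04-15/"


def build_import_commands(python=sys.executable):
    """Return the manage.py invocations as argument lists, in the listed order."""
    manage = [python, "manage.py"]
    return [
        manage + ["import_species"],
        manage + [
            "import_strains_contigs",
            "--ftp-server", FTP_SERVER,
            "--ftp-directory", ASSEMBLIES_DIR,
            "--set-type-strains", "BU_ATCC8492", "PV_ATCC8482",
        ],
        manage + [
            "import_annotations",
            "--ftp-server", FTP_SERVER,
            "--ftp-directory", ANNOTATIONS_DIR,
        ],
        manage + ["import_essentiality"],
    ]


def run_imports(runner=subprocess.run):
    """Run each import step in turn; check=True stops at the first failing step."""
    for command in build_import_commands():
        runner(command, check=True)
```

Calling run_imports() from the directory containing manage.py executes the four steps in sequence. Passing argument lists rather than a shell string avoids quoting issues with the FTP paths.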

Code style

Use Black. Use Ruff. These are both configured if you install the pre-commit tools as above.

To manually run them: black . and ruff check --fix.

Testing

pip install -r requirements-dev.txt
pytest

Initial Database table setup

python manage.py makemigrations
python manage.py makemigrations dataportal --empty
python manage.py migrate

Data Portal App (React-based application)

Requirements