
Review_paper

Kai Blumberg edited this page Jul 29, 2020 · 6 revisions

Ontologies in Support of Metagenomic Annotation and Analysis: a Review

Potential new citations:

Probably YES

COVID-19 pandemic reveals the peril of ignoring metadata standards, should mention this in the intro.

The Global Genome Biodiversity Network (GGBN) Data Standard specification. Ramona mentioned this in a comment and I should probably include GGBN.

The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification. Definitely should cite this; the poster claiming GOLD v.5 to be MIxS compliant cites it. GOLD is fully compliant with the GSC's MIxS standards in capturing metadata and provides a platform to query projects based on various metadata features.

Sequencing data discovery with MetaSeek Should probably add this. Sequencing data discovery tool leveraging MIxS packages to filter by metadata, and search for next-generation (genomic) sequencing datasets.

Toward richer metadata for microbial sequences: replacing strain-level NCBI taxonomy taxids with BioProject, BioSample and Assembly records. In January 2014, NCBI will cease assigning strain-level taxids. Instead, submitters are encouraged to provide strain information and rich metadata with their submission to the sequence database, BioProject and BioSample. Might help to answer the question of how NCBITaxon is used now, in response to one of Chris' comments.

Standardized Metadata for Human Pathogen/Vector Genomic Sequences. Talks about the GSCID/BRC Project and Sample Application Standard ... It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium’s minimal information (MIxS) and NCBI’s BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). Also says: By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards.

Microbiome Database (MDB). Chinese equivalent to EBI MGnify; they are using the GOLD vocab to organize their terms, and NCBITaxon for phylogeny. I think I should probably mention this in a sentence after EBI MGnify. Here is the link to the preprint for the CNGB Sequence Archive (CNSA): CNSA: a data repository for archiving omics data. Described as being for archiving omics data, including raw sequencing data and its analytical data and related metadata ... Complying with the data standards commonly used in the life sciences, CNSA is committed to building a comprehensive and curated data repository for the storage, management and sharing of omics data, improving the data standards, and providing free access to open data resources for worldwide scientific communities to support academic research and the bio-industry. Looks like China National GeneBank (CNGB) is their equivalent to NCBI/ENA and MDB to MGnify.

MAYBE

The Microbial Antarctic Resource System: Integrating discoverability and preservation of environmentally-annotated microbial 'omics data, mARS site. Follows the MIMARKS template; maybe worth mentioning?

Using MIxS: An Implementation Report from Two Metagenomic Information Systems. Talks about EMPO/ENVO: EMP ontology (EMPO), that captures the primary axes along which microbial communities tend to be structured (host-associated or not, saline or not). EMPO is an application ontology, with a formally defined W3C Web Ontology Language (OWL) document mapping to existing ontologies, enabling reuse by the microbial ecology community. Could be some fuel to talk about their issues. Also talks about Ecobiomics; should maybe include this?

SeqDB: Biological Collection Management with Integrated DNA Sequence Tracking. SeqDB tracks sampling metadata, DNA extractions, and library preparation workflow ... SeqDB implements ... the Genome Standards Consortium (GSC) Minimum Information about any (X) Sequences (MIxS) specification

Ecobiomics: Environmental metagenomic biomonitoring Maybe try to follow up on this for the review paper??

Qiita: rapid, web-enabled microbiome meta-analysis. [Qiita] ... requires that new studies include a description of the work; relevant publications; collection and processing parameters for each sample; and relevant covariates, based on the MIxS standards.

MIxS-Cryo: Defining a minimum information standard for sequence data from the cryosphere. Currently published only as a chapter from Jose's thesis. Should we mention it and cite the thesis?

Things I'm unsure if we would want to mention

Training Biomedical Researchers in Metadata with a MIBBI-Based Ontology. Generic lightweight ontology based on the Minimum Information for Biological and Biomedical Investigations (MIBBI) standard. The MIBBI standard is conceptually similar to MIxS but for Biological and Biomedical Investigations. It uses OBI, but I'm unsure if we want to extend the scope here and to MIAPPE.

Measures for interoperability of phenotypic data: minimum information requirements and formatting. Minimum Information About a Plant Phenotyping Experiment (MIAPPE) is like MIxS; maybe mention? It does reference the use of a bunch of OBO ontologies, but for plant phenotyping, not genomics research...

Probably not needed here

Genomic Standards Consortium Projects Describes how to add new GSC projects (I don't think this is really relevant).

Meta-omics data and collection objects (MOD-CO): a conceptual schema and data model for processing sample data in meta-omics research FOG and Pelin on this, I think this might be more relevant for PM paper2.

Minimum Information about ... papers:

I'm guessing I don't need to mention these, unless we think we need an overview paragraph about them.

Minimum Information about an Uncultivated Virus Genome (MIUViG)

Minimum Information about a Biosynthetic Gene cluster

Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

TDWG/biodiversity or Similar abstracts relevant to BCO?

Clarifying Concepts and Terms in Biodiversity Informatics 2013 Ramona and John

Linking Biodiversity Data with the Biological Collections Ontology Ramona

Genomic Biodiversity Working Interest Group (GBWG) John Deck

Genomic Biodiversity Working Interest Group (GBWG) Meeting John Deck

PlutoF: Biodiversity data management platform for the complete data lifecycle

The Earth Microbiome Project: Planetary-scale systems ecology.

Standards-based management of specimen-derived DNA sequences SeqDB

Using the Biological Collections Ontology to Advance Biodiversity Science. Ramona. Maybe relevant here if we want to add more about the use of BCO, but I don't want to push it too hard here.

Metadata Standards for Genomic Sequence Data: Past and Future of MIxS Standards Family Pelin 2017 probably not super necessary to cite or is it? While broadly accepted and used by the microbial (ecology) research community, MIxS has several shortcomings, as well as areas that require further development. The GSC is committed to engaging domain experts, in order to: (i) expand coverage and breadth to accommodate new data types and emerging technologies, (ii) maximize usability, (iii) expedite further evolution according to community needs, and (iv) automatize update of MIxS. So maybe some material here?

Genomes OnLine Database (GOLD) v.5: an improved metadata management system based on a four level (meta)genome project classification. Poster, so might not be able to cite, but it claims GOLD is fully compliant with the Genomic Standards Consortium (GSC) Minimum Information about any (x) Sequence (MIxS) standards (see Reddy, T. B. K. et al. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification. Nucleic Acids Res. 1–6 (2014))
