
Review_paper

Kai Blumberg edited this page Jul 28, 2020 · 6 revisions

Ontologies in Support of Metagenomic Annotation and Analysis: a Review

Potential new citations:

Probably YES

COVID-19 pandemic reveals the peril of ignoring metadata standards. Should mention this in the intro.

The Global Genome Biodiversity Network (GGBN) Data Standard specification. Ramona mentioned this in a comment, and I should probably include GGBN.

The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification. Definitely should cite this; the poster claiming GOLD v.5 to be MIxS compliant cites it. GOLD is fully compliant with the GSC's MIxS standards in capturing metadata and provides a platform to query projects based on various metadata features.

Sequencing data discovery with MetaSeek. Should probably add this.

Toward richer metadata for microbial sequences: replacing strain-level NCBI taxonomy taxids with BioProject, BioSample and Assembly records. In January 2014, NCBI will cease assigning strain-level taxids. Instead, submitters are encouraged to provide strain information and rich metadata with their submission to the sequence database, BioProject and BioSample. Might help to answer the question of how NCBITaxon is used now, in response to one of Chris' comments.

Microbiome Database (MDB). Chinese equivalent to EBI MGnify; they are using the GOLD vocabulary to organize their terms and NCBITaxon for phylogeny. I think I should probably mention this in a sentence after EBI MGnify. Here is the link to the preprint for the CNGB Sequence Archive (CNSA): CNSA: a data repository for archiving omics data. ... for archiving omics data, including raw sequencing data and its analytical data and related metadata ... Complying with the data standards commonly used in the life sciences, CNSA is committed to building a comprehensive and curated data repository for the storage, management and sharing of omics data, improving the data standards, and providing free access to open data resources for worldwide scientific communities to support academic research and the bio-industry. Looks like China National GeneBank (CNGB) is their equivalent to NCBI/ENA, and MDB their equivalent to MGnify.

MAYBE

The Microbial Antarctic Resource System: Integrating discoverability and preservation of environmentally-annotated microbial 'omics data (mARS site). Follows the MIMARKS template; maybe worth mentioning?

Using MIxS: An Implementation Report from Two Metagenomic Information Systems. Talks about EMPO/ENVO: the EMP ontology (EMPO) captures the primary axes along which microbial communities tend to be structured (host-associated or not, saline or not). EMPO is an application ontology, with a formally defined W3C Web Ontology Language (OWL) document mapping to existing ontologies, enabling reuse by the microbial ecology community. Could be some fuel to talk about their issues. Also talks about Ecobiomics; should maybe include this?

SeqDB: Biological Collection Management with Integrated DNA Sequence Tracking. SeqDB tracks sampling metadata, DNA extractions, and library preparation workflow ... SeqDB implements ... the Genome Standards Consortium (GSC) Minimum Information about any (X) Sequences (MIxS) specification

Ecobiomics: Environmental metagenomic biomonitoring. Maybe try to follow up on this for the review paper?

Qiita: rapid, web-enabled microbiome meta-analysis. [Qiita] ... requires that new studies include a description of the work; relevant publications; collection and processing parameters for each sample; and relevant covariates, based on the MIxS standards.

Meta-omics data and collection objects (MOD-CO): a conceptual schema and data model for processing sample data in meta-omics research. FOG and Pelin are on this; seems relevant (maybe more for paper 2, or maybe for this one).

Standardized Metadata for Human Pathogen/Vector Genomic Sequences

Training Biomedical Researchers in Metadata with a MIBBI-Based Ontology. Generic lightweight ontology based on the Minimum Information for Biological and Biomedical Investigations (MIBBI) standard.

Clarifying Concepts and Terms in Biodiversity Informatics. 2013, Ramona and John.

Things I'm unsure if we would want to mention

Measures for interoperability of phenotypic data: minimum information requirements and formatting. Minimum Information About a Plant Phenotyping Experiment (MIAPPE) is like MIxS; maybe mention?

MIxS-Cryo: Defining a minimum information standard for sequence data from the cryosphere. From Jose's thesis. Not published, but should we mention it and cite the thesis?

Minimum Information about ... papers:

Minimum Information about an Uncultivated Virus Genome (MIUViG)

Minimum Information about a Biosynthetic Gene cluster

Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

Maybe more about biodiversity/BCO:

PlutoF: Biodiversity data management platform for the complete data lifecycle

TDWG/Similar abstracts:

The Earth Microbiome Project: Planetary-scale systems ecology.

Linking Biodiversity Data with the Biological Collections Ontology (Ramona)

Standards-based management of specimen-derived DNA sequences (SeqDB)

Genomic Biodiversity Working Interest Group (GBWG) (John Deck)

Genomic Biodiversity Working Interest Group (GBWG) Meeting (John Deck)

Using the Biological Collections Ontology to Advance Biodiversity Science (Ramona)

Metadata Standards for Genomic Sequence Data: Past and Future of MIxS Standards Family. Pelin, 2017. Probably not super necessary to cite, or is it? While broadly accepted and used by the microbial (ecology) research community, MIxS has several shortcomings, as well as areas that require further development. The GSC is committed to engaging domain experts, in order to: (i) expand coverage and breadth to accommodate new data types and emerging technologies, (ii) maximize usability, (iii) expedite further evolution according to community needs, and (iv) automatize update of MIxS. So maybe some material here?

Genomes OnLine Database (GOLD) v.5: an improved metadata management system based on a four level (meta)genome project classification. This is a poster, so we might not be able to cite it, but it claims GOLD is fully compliant with the Genomic Standards Consortium (GSC) Minimum Information about any (x) Sequence (MIxS) standards (see Reddy, T. B. K. et al. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification. Nucleic Acids Res. 1–6 (2014)).
