Adapt / Extend ISA Sample Schema #93
Closed
CS76
started this conversation in
Data Schemas
Replies: 2 comments
-
Considering ISA application in MetaboLights, and our attempt to apply it in Chemotion and NMRShiftDB datasets, we would like to suggest the following:
-
Currently we are focusing more on schema.org.
-
ISA is a metadata framework to manage a diverse set of life science, environmental and biomedical experiments that employ one or a combination of technologies.
The ISA specification defines an abstract model of the metadata framework. Journals such as Nature Scientific Data (https://www.nature.com/articles/sdata2018179#Sec27) and Oxford GigaScience (https://academic.oup.com/gigascience, where GigaDB stores the data and metadata described in an article following the ISA model) accept datasets described with ISA.
ISA is organised around three concepts, 'Investigation' (the project context), 'Study' (a unit of research), and 'Assay' (analytical measurement), and is available as a general-purpose JSON format. We (nmrXiv) would like to comply with the ISA schema specifications to capture NMR metadata (assays, samples and ontologies). This helps us provide a detailed description of the experimental metadata for both synthetic and biological samples (i.e. sample characteristics, technology and measurement types, sample-to-data relationships) so that the resulting data and discoveries are reproducible and reusable.
While the ISA Sample schemas are widely used to capture biological samples in databases such as MetaboLights and GeneLab, we (nmrXiv) would like to extend them and create a few mock-ups of synthetic samples (from Chemotion and NMRShiftDB) mapped to the ISA schema.
We aim to develop sample configurations (synthetic and biological) that are derived from or compliant with the ISA specifications, and to use them to capture rich, human- and machine-readable metadata in nmrXiv. Please find below more details about ISA samples and a few mock-ups of sample metadata (JSON). Let us know your ideas and views, or comment on potential pitfalls or blockers.
ISA Sample schema details:
In a Study object, ISA records the provenance of biological samples, from source material through a collection process to sample material, represented as directed acyclic graphs (directed graphs with no cycles). The usual pattern is a source material node, followed by a sample collection process node, followed by a sample material node.
(source material)->(sample collection)->(sample material)
These study graphs MAY split and pool depending on how the samples are collected.
In a splitting example, multiple samples might be derived from the same source:
(source material 1)->(sample collection)->(sample material 1)
(source material 1)->(sample collection)->(sample material 2)
In a pooling example, multiple sources may be used to create a single sample:
(source material 1)->(sample collection)->(sample material 1)
(source material 2)->(sample collection)->(sample material 1)
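The splitting and pooling patterns above can be sketched as small directed graphs. This is only an illustration: the helper names (`build_graph`, `is_acyclic`, `upstream`) are hypothetical and not part of any ISA tooling, and it assumes that each example shares a single sample collection process node.

```python
def build_graph(edges):
    """Build an adjacency list from (from_node, to_node) pairs."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    return graph


def is_acyclic(graph):
    """Kahn's algorithm: a graph is acyclic if every node can be removed
    in an order where its incoming edges are already gone."""
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] += 1
    ready = [node for node, degree in indegree.items() if degree == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for target in graph[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    return removed == len(graph)


def upstream(graph, node):
    """Return every node from which `node` can be reached (its provenance)."""
    parents = {n: [] for n in graph}
    for src, targets in graph.items():
        for dst in targets:
            parents[dst].append(src)
    seen, stack = set(), [node]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Splitting: one source, one collection process, two samples.
splitting = build_graph([
    ("source material 1", "sample collection"),
    ("sample collection", "sample material 1"),
    ("sample collection", "sample material 2"),
])

# Pooling: two sources, one collection process, one sample.
pooling = build_graph([
    ("source material 1", "sample collection"),
    ("source material 2", "sample collection"),
    ("sample collection", "sample material 1"),
])
```

Calling `upstream(pooling, "sample material 1")` returns both source materials and the collection process, which is the provenance a study graph is meant to preserve.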
However, sample collection applies only to biological samples, where the source comes from a certain organism. With synthetic samples, we instead have a compound that is dissolved in a solvent to create the sample. We are trying to capture this concept in ISA using mock-ups derived from actual samples and molecules found in non-biological repositories such as Chemotion and NMRShiftDB, and to compare them with a biological sample from the biological repository MetaboLights. Please find the mentioned mock-ups here. The mock-ups are still under development, mainly because the relevant ISA configurations do not exist yet and because they are updated continuously as the discussion progresses.
Our approach to representing a synthetic sample is to take the compound found in Chemotion or NMRShiftDB as the source and apply an NMR sample protocol in which a solvent is added to generate the sample, with the solvent becoming one of the sample characteristics. This NMR sample then undergoes an NMR spectroscopy protocol with parameters such as the instrument, magnetic field strength, pulse sequence name, temperature and others. This protocol results in a data file, which in turn undergoes an NMR assay protocol with some NMR processing software.
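As a rough sketch of that chain, the three protocols can be written as linked process steps and serialised to JSON. Everything here is an assumption made for illustration: the keys (`kind`, `protocol`, `parameters`, ...) are not ISA-JSON property names, the material names are placeholders, all values are left unset, and the processed-data output of the assay step is hypothetical.

```python
import json


def material(kind, name, **characteristics):
    """A node in the chain: a source, a sample, or a data file."""
    return {"kind": kind, "name": name, "characteristics": characteristics}


def process(protocol, inputs, outputs, **parameters):
    """A protocol application linking input nodes to output nodes by name."""
    return {
        "protocol": protocol,
        "inputs": [m["name"] for m in inputs],
        "outputs": [m["name"] for m in outputs],
        "parameters": parameters,
    }


# Placeholder nodes; None marks a value that a real record would fill in.
compound = material("source", "compound-1", repository="Chemotion")
nmr_sample = material("sample", "nmr-sample-1", solvent=None)
raw_data = material("data file", "raw-data-1")
processed = material("data file", "processed-data-1")  # hypothetical output

steps = [
    # The solvent is recorded as a characteristic of the resulting sample.
    process("NMR sample", [compound], [nmr_sample]),
    process(
        "NMR spectroscopy", [nmr_sample], [raw_data],
        instrument=None,
        magnetic_field_strength=None,
        pulse_sequence_name=None,
        temperature=None,
    ),
    process("NMR assay", [raw_data], [processed], processing_software=None),
]


def is_linked(steps):
    """True if each step consumes exactly what the previous step produced."""
    return all(a["outputs"] == b["inputs"] for a, b in zip(steps, steps[1:]))


mockup = {
    "materials": [compound, nmr_sample, raw_data, processed],
    "processes": steps,
}
print(json.dumps(mockup, indent=2))
```

Keeping the solvent on the sample rather than on the protocol mirrors the approach described above; whether it would fit better as a protocol parameter is one of the points open for discussion.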
However, we are still facing issues regarding: