planet_microbe_paper_3
Use tarql for TSV to RDF, and instead of Blazegraph use TDB2 with the most recent Fuseki as a front end for data management. It's easy, and you can include Shiro for security if we want to go public.
Prokaryotic WGS only
Amazon continuum 100 total (plume 50 samples, river 50 samples)
BATS 63 (smaller files too, might not want to use)
CDEBI 20
GOS 70 (very low quality, might not want to use)
HOT Chisholm 70 (smaller files too, might not want to use)
HOT DeLong 40 (smaller files too, might not want to use)
HOT Timeseries 460 samples
OSD 162 (smaller files)
Tara's GO term csv files are ~180 KB each. Tara's APY (prokaryote shotgun sequencing) has only 136 samples (Tara Polar is virus only).
Thus in total we have 1121 prokaryotic fraction samples.
Multiply this by the over-estimate of 180 KB for each GO file, which gives ~200 MB.
The metadata file is on the order of 2 KB, definitely less than 10.
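The sample total and size estimate above can be re-checked with a short sketch. The counts are copied from the list above; treating 1 MB as 1000 KB is an assumption made here for the conversion.

```python
# Sample counts per project, copied from the list in these notes.
SAMPLES = {
    "Amazon continuum": 100,
    "BATS": 63,
    "CDEBI": 20,
    "GOS": 70,
    "HOT Chisholm": 70,
    "HOT DeLong": 40,
    "HOT Timeseries": 460,
    "OSD": 162,
    "Tara APY": 136,
}

GO_FILE_KB = 180  # over-estimate of the size of one GO term file

total_samples = sum(SAMPLES.values())
# Assumption: decimal units, 1 MB = 1000 KB.
total_mb = total_samples * GO_FILE_KB / 1000

print(total_samples, "samples")
print(round(total_mb), "MB of GO term files")
```

Dropping the projects flagged as "might not want to use" only requires deleting their entries from `SAMPLES` and re-running.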
He set up some scripts for running/loading Blazegraph here if you decide to go that route: https://github.com/hurwitzlab/planet-microbe-scripts/tree/master/blazegraph
And here it is hosted: https://www.planetmicrobe.org/blazegraph/ (console) https://www.planetmicrobe.org/blazegraph/sparql (SPARQL endpoint)
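A minimal sketch of how a query could be sent to the SPARQL endpoint listed above, using only the Python standard library and the standard SPARQL 1.1 Protocol GET form. The query is a generic placeholder, not one from the project, and `build_sparql_request` is a hypothetical helper name. The request is built but not sent, so nothing here depends on the endpoint being online.

```python
import urllib.parse
import urllib.request

# Endpoint URL taken from the notes above.
ENDPOINT = "https://www.planetmicrobe.org/blazegraph/sparql"


def build_sparql_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL 1.1 Protocol GET request.

    Hypothetical helper for illustration only.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for the standard JSON results format.
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


# Placeholder query: any five triples in the store.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"

request = build_sparql_request(QUERY)
print(request.full_url)

# To actually run it (needs network access):
#   import json
#   with urllib.request.urlopen(request) as response:
#       results = json.load(response)
```

Because the request follows the standard protocol, the same helper should work against a Fuseki endpoint by passing a different `endpoint` value.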
Using GO/NCBI taxon ontologies to "learn more from omics data"
Frontiers special edition: Microbial Response to a Rapidly Changing Marine Environment: Global Warming and Ocean Acidification.
Abstract due in June, manuscript in October. Could be a place to put the 3rd PM paper, as they deal with:
• Physiological responding and metabolism.
• Identification, community structure, and biodiversity.
• Quantification of microbial biomass and productivity.
• Microbial-mediated biogeochemical cycling.
• Biological response and feedback.
Scalable Bioinformatics: Methods, Software Tools, and Hardware Architectures: June/October deadlines. Could be cool for paper 1 or 3 maybe?
CQ1 “Does the new method give comparable results to those published in the literature when re-analyzing published studies analyzed with different methods?”
Re-analyze phylum-level taxonomic relative abundance from Figure 2 or 8 of (Sunagawa et al. 2015), and possibly the high-level functional relative abundance from Figure 8; the functional categories seem similar to GO terms and could perhaps be mapped.
Shortcoming: we only look at what's known in the cent database, unlike with OTUs, where you can get at the unknown. Same with GO: it only gets at the known. Make a question about Thaumarchaea: give Thaumarchaea as an example of a known aquatic ammonia oxidizer, where we expect to see effects of nitrate, nitrite, and ammonium, but test everything against their distribution.
rcca
comparing functional potential of samples from different ecosystems.
Although not as commonly discussed in microbial ecology, macroecology studies have referred to the potential for studying functional beta diversity (Swenson et al. 2012). One study directly explored the concept of functional beta diversity to compare the carbon degradation capacity of bacterioplankton in various freshwater lakes (Dickerson and Williams 2014). Similarly, the addition of hydrolytic enzymes has long been used to test and compare the functional metabolism of particular hydrolytic activities (Burns and Dick 2002; Boetius 1995).
Add this: it only covers what you can annotate, so it's not true diversity, only known functional diversity.
Energy metabolism is one of the most crucial functional metabolic processes in which living cells engage. ... REDOX metabolism. ... . This new method can help expedite this process. Compare across available datasets made interoperable by annotating them with ENVO terms (search the ENVO hierarchy).
The architecture here would allow us to more easily answer these questions, and here's another method to investigate them.
New: insights from using multiple datasets, questions about specific metabolism. ... maybe qualities too?
Method PCoA is a common ordination technique for beta diversity in microbial ecology (Knight et al. 2018). This new method (USING GO) enables us to study the functional beta-diversity by comparing the relative abundances of GO-specified gene families from various ecosystems.
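As a rough sketch of the first step of that comparison: per-sample GO term counts are converted to relative abundances, then a pairwise dissimilarity matrix is computed. Bray-Curtis is an assumption here (the notes do not fix a metric for this method), and the sample names, term names, and counts are made up for illustration. The PCoA itself is not implemented; the resulting matrix is the kind of input a PCoA routine takes.

```python
def relative_abundance(counts):
    """Convert raw counts per GO term into fractions that sum to 1."""
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles (dicts)."""
    terms = set(a) | set(b)
    numerator = sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in terms)
    denominator = sum(a.get(t, 0.0) + b.get(t, 0.0) for t in terms)
    return numerator / denominator


# Hypothetical counts of reads assigned to three placeholder GO terms.
samples = {
    "sample_1": {"term_A": 30, "term_B": 10, "term_C": 10},
    "sample_2": {"term_A": 10, "term_B": 30, "term_C": 10},
    "sample_3": {"term_A": 30, "term_B": 10, "term_C": 10},
}

profiles = {name: relative_abundance(c) for name, c in samples.items()}
names = sorted(profiles)

# Symmetric matrix of pairwise dissimilarities, rows/columns ordered as `names`.
matrix = [[bray_curtis(profiles[i], profiles[j]) for j in names] for i in names]

for name, row in zip(names, matrix):
    print(name, [round(value, 3) for value in row])
```

Identical profiles give 0 and profiles with no shared terms give 1, so the matrix can be read directly as "how functionally different are these samples" before any ordination is applied.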
see mixomics for rcca
This paper is a gold mine for AOA
from wiki: AOA dominate in both soils and marine environments, suggesting that Thaumarchaeota may be greater contributors to ammonia oxidation in these environments (this paper).
Positive correlations between the abundance of Crenarchaeota and nitrite were observed in the Arabian Sea (11) and the Santa Barbara Channel time series (12) and with particulate organic nitrogen in Arctic waters (13)
Here we report oligotrophic ammonia oxidation kinetics and cellular characteristics of the mesophilic crenarchaeon ‘Candidatus Nitrosopumilus maritimus’ strain SCM1. Unlike characterized ammonia-oxidizing bacteria, SCM1 is adapted to life under extreme nutrient limitation, sustaining high specific oxidation rates at ammonium concentrations found in open oceans.
Another common analysis approach is to look at differentially abundant microorganisms or functional elements (for example, genes and pathways) in the comparison groups of interest (that is, treatment versus control). Identifying microbial taxa that explain differences between communities is particularly challenging because microbiome data sets are high-dimensional (that is, they include thousands of taxa), sparse and compositional.
Perhaps cite this and talk about how this can automate the identification of microbial taxa that explain differences between environments.
Also has:
For visualizing beta diversity data, ordination techniques, such as principal coordinates analysis (PCoA) or principal component analysis (PCA), are commonly used.
use for CQ2
blog post: on diversity and diversity indices The new synthesis of diversity indices and similarity measures
Phylogenetic and functional alpha and beta diversity in temperate and tropical tree communities (Swenson et al. 2012)
Analyses of functional beta diversity have also become more common with a large sum of work focusing on the development of functional beta diversity metrics that are often implemented in relatively species-poor temperate systems (e.g., Ricotta and Burrascano 2009), with only one study, to our knowledge, being conducted in a highly diverse tropical system (Swenson et al. 2011).
Can reuse this idea: ordination methods reduce dimensionality for large sparse datasets.
Functional Diversity of Bacterioplankton in Three North Florida Freshwater Lakes over an Annual Cycle (Dickerson and Williams 2014)
Study examining the functional beta diversity of lake bacterioplankton. They used DGGE, but get at the question of functional beta diversity for bacteria.
Overall, Biolog analysis was useful in identifying differences in the functional diversity of bacterial communities between lakes of different trophic statuses and can be used as a tool to assess ecosystem health.
Ideas for potential questions to try and ask:
from this MPI mol-ecol paper they talk about various polysaccharides: laminarin, xylan, chondroitin sulfate, arabinogalactan, and carrageenan.
GO has a cellular polysaccharide catabolic process hierarchy, with some depth/breadth; could try using this, or xylan metabolic process, or polysaccharide catabolic process //try this
cool idea but probably can't get this info from GO.
signal transduction: many subclasses, each not super deep; has osmosensory signaling pathway
but not super deep.
Not looking super promising.
try inorganic anion transport
lots of redox complexes could be cool to try and answer redox metabolic questions.
NCBI Taxon
From the Tara structure and function paper we can drill down much further into the taxonomic structure.
Also check against Global patterns of bacterial beta-diversity in seafloor and seawater ecosystems, Figure 2, which compares the high-level phylogenetic differentiation between benthic and pelagic samples at the coast, surface, deep, etc.
Referring to the two papers above, we could ask questions such as "What Alphaproteobacteria differentiate sea surfaces from deep chlorophyll maxima?" or "What Gammaproteobacteria differentiate deep chlorophyll maxima from mesopelagic samples?" or "What is the effect of depth on Deltaproteobacteria community structure?" or "What is the effect of temperature on Cyanobacteria community structure?" or "What are the effects of nitrate, nitrite and ammonium concentrations on the known aquatic ammonia oxidizer Thaumarchaea?"
From Comprehensive Meta-analysis of Ontology Annotated 16S rRNA Profiles Identifies Beta Diversity Clusters of Environmental Bacterial Communities: they have a section "Salinity as the major driving factor for community assembly?"
Could be cool to recap some of the hypotheses here for AIM3 and approach the issue the way they do in this EMP paper, and drill down into questions about what features affect community (such as Proteobacteria) composition.
This perspective paper from DeLong and Karl in 2005
has a section about A genomic glimpse into coastal bacterial lifestyles
ask some questions like: Do specific biological properties of coastal bacterioplankton differentiate them from their open-ocean relatives? Can the genomic and physiological properties of bacterioplankton explain in part their observed distribution? Can these different biological features tell us about potential regional differences in the microbial cycling of matter and energy? ... has some examples in the paragraph last one about transporters for uptake of amino acids ammonium urea etc. Could be really cool to try and answer some of these questions.
Goes on about questions of ecotypes, with an example about different Prochlorococcus strains isolated from different depths. Would probably need to trace back genes associated with reads that map to the taxa to be able to answer this. That's why it would be cool if we could keep track of both, but maybe too much to ask.
Check out Comparative Metagenomics of Microbial Communities maybe can get some ideas from this? or re-do some analyses or cite for this section?
Bacterioplankton community variation in Bohai Bay (China) is explained by joint effects of environmental and spatial factors
2020 recent paper check out section 3.5 Investigation of the relationship between bacterial alpha diversity and environmental/spatial factors
Table 2 gives Pearson's correlation coefficients (r) between alpha diversity of the bacterial community and biogeochemical characteristics of seawater samples; could be a cool thing to recreate with Planet Microbe.
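A minimal sketch of what recreating that kind of table could involve: compute an alpha diversity value per sample, then Pearson's r against one environmental variable. The Shannon index is an assumption (these notes do not say which alpha diversity measure the paper uses), and the counts and temperatures below are made-up illustration values, not Planet Microbe data.

```python
import math


def shannon(counts):
    """Shannon index (natural log) from a list of taxon counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical taxon counts for four samples (one list per sample).
communities = [
    [25, 25, 25, 25],
    [40, 30, 20, 10],
    [70, 10, 10, 10],
    [97, 1, 1, 1],
]
# Hypothetical temperature (degrees C) measured for the same four samples.
temperature = [12.0, 15.5, 19.0, 24.0]

alpha = [shannon(c) for c in communities]
r = pearson_r(alpha, temperature)
print("Shannon:", [round(h, 3) for h in alpha])
print("Pearson r vs temperature:", round(r, 3))
```

Repeating `pearson_r` over every environmental variable column would fill in one row of such a table per diversity measure.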
Distributions and relationships of virio- and picoplankton in the epi-, meso- and bathypelagic zones of the Western Pacific Ocean
A bit interesting too in that it discusses (see Table 1) the effects of (amongst other things) heterotrophic prokaryotes; Fluo, in situ fluorescence; Proc, Prochlorococcus; Syn, Synechococcus; Temp, temperature; Density, potential density. Could be cool to add this stuff to the Prochlorococcus/Synechococcus story: the effects of xyz environmental vars on Prochlorococcus/Synechococcus distributions. In NCBI taxon there are more named Synechococcus and more levels of hierarchy than for Prochlorococcus.
Older 2006 HOT paper, but a start to try and ask some questions; see Fig 2, taxonomic distributions of various microbial groups at depth. Could follow this up with questions like what Alphas are differentiated by depth.
Pier cites this in his PhD thesis (might be his mentor?); can use it to cite the idea of multivariate stats being used in microbial ecology.
Relationships between bacterial diversity and environmental variables in a tropical marine environment, Rio de Janeiro
Check out Fig 6: it does correspondence analyses of the microbial diversity and environmental variables, with high-level taxonomic groups against temp, chl, sal, NO2, NO3, etc. Def cite and "redo this at a larger scale".
Microbial planktonic communities in the Red Sea: high levels of spatial and temporal variability shaped by nutrient availability and turbulence
Fig 4 does CCA for eukaryotes against the environmental variables nitrate, nitrite, phosphate, silicate, temperature and salinity.
Temporal distribution of bacterial community structure in the Changjiang Estuary hypoxia area and the adjacent East China Sea
Fig 5 also does canonical correspondence analysis (CCA) of bacterial communities associated with environmental variables: oxygen, depth, pH, DOC, NH4, NO2.
Distribution, Community Composition, and Potential Metabolic Activity of Bacterioplankton in an Urbanized Mediterranean Sea Coastal Zone
2017 study, Figure 5 CCA ordination plot depicting the relationship between environmental parameters and bacterial community structure, as represented by 16S rRNA gene sequence data.
Also has CCA (Fig 4) of bacterial communities against several nutrients. Deals with bloom events.
Spatial Variations of Prokaryotic Communities in Surface Water from India Ocean to Chinese Marginal Seas and their Underlining Environmental Determinants
2017. They also do a similar thing: NMDS based on Bray-Curtis community distances, with arrows showing vector fitting of the environmental variables. They also try correlations of Prochlorococcus/Synechococcus vs latitude, and give an overview of abundant taxa. Could go along with the Global patterns of bacterial beta-diversity in seafloor and seawater ecosystems paper to describe overall taxonomic structure globally and in different regions, and compare the taxonomic structure of coastal regions vs open seas; could link this to different ENVO terms.
Core Microbial Functional Activities in Ocean Environments Revealed by Global Metagenomic Profiling Analyses
Uses HOT and BATS plus other sites; tries to do functional analysis with COGs, but it's pretty messy. Still, could maybe serve as some background functional analysis.
Has a section on motility where they talk about how motility might enable bacteria to achieve spatial coupling with a DOM source. Could maybe try filtering for DOM and looking at it as a gradient and searching some GO motility-related genes. Might not work but could try it. Definitely read this during Marmic courses.
Stal, Lucas J., and Mariana Silvia Cretoiu. 2016. The Marine Microbiome: An Untapped Source of Biodiversity and Biotechnological Potential. Springer.
//Could TAKE something from this for the intro: The marine microbiome is not just interesting from a scientific point of view. Certainly, with 70 % of the Earth’s surface covered by the ocean and the ocean probably being the largest continuous habitat, the marine microbiome plays a prominent role in the biogeochemical cycling of elements, is at the basis of the marine foodweb, critical for the ecology of the sea, and essential for climate regulation and counteracting the effects of global change.
Arrigo, Kevin R. 2005. “Marine Microorganisms and Global Nutrient Cycles.” Nature 437 (7057): 349–55.
//It would be cool to cite something from this for the intro maybe: **On a global scale, cycling of nutrients also affects the concentration of atmospheric carbon dioxide. Because of their capacity for rapid growth, marine microorganisms are a major component of global nutrient cycles. Understanding what controls their distributions and their diverse suite of nutrient transformations is a major challenge facing contemporary biological oceanographers.** (Arrigo 2005). Also talks about and cites info on the long-standing question of the Redfield ratio and microbial metabolism affecting it; cites marcel krypers :(. Could cite if I want to try and dig in on the Redfield ratio question.
https://www.nature.com/articles/nrmicro1762 and https://www.nature.com/articles/nrmicro1749
IJSEM db https://figshare.com/articles/International_Journal_of_Systematic_and_Evolutionary_Microbiology_IJSEM_phenotypic_database/4272392 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5541158/
A synthesis of bacterial and archaeal phenotypic trait data https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7275036/
BacDive https://academic.oup.com/nar/article/47/D1/D631/5106998