fall_2019_log
August 2019 continued
September 2019
Sep.05, Sep.06, Sep.09, Sep.10, Sep.12, Sep.24, Sep.25, Sep.26, Sep.27, Sep.30,
October 2019
Oct.01, Oct.15, Oct.20, Oct.21, Oct.22, Oct.23, Oct.24, Oct.27,
November 2019
Nov.06, Nov.12, Nov.15, Nov.19, Nov.20, Nov.21,
Link to BE529 wiki page
Graduate advising says it's not necessary for the PI to be last author on every paper counting toward the thesis.
From Matt Bomhoff: script to test if a data package is valid according to FD
From conversation with Alise about having multiple OBO ontology search hierarchies for PM. ATM we have biome (not hierarchically displayed, but I'd like to get there). It'd be nice to have ENVO material entity and environmental feature as well, then extend into others such as GO (a search bar for all three hierarchies: mf, cc, bp) and/or the NCBI taxon ontology. It would have to be different than just the good old Tara Ocean Gene Atlas, which enables users to BLAST search against Tara Oceans.
Could get the InterPro terms from the EBI's functional analysis pipeline, which gives the annotation at the read level, for example SRR1182511.1003639_HWI-EAS165_0077_FC70822AAXX:1:33:12777:2046_1_1_150_-
SRR1182511 is the accession number and the - at the end is the reverse read etc. Use interpro2go to get back to the GO terms. It's a little silly because the EBI pipeline uses InterPro matches with genome coordinates and uses the interpro2go mapping to get the GO terms. Hence we'd probably precompute both. Then when the user uses the GO search bar, it would have to dynamically get the relevant purls (like with all of the search bars), then take those purls and run them against the interpro2go mapping to get the InterPro annotations which match those GO terms. Then it would take that list of terms (for each sample) and look up the rows in the table which have those InterPro terms, from which it would get the pre-computed abundances as well as the genomic coordinates, and then get the DNA sequences.
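A rough sketch of that lookup chain in Python (GO purl -> GO ID -> InterPro IDs via interpro2go -> pre-computed rows). The mapping lines below only imitate the interpro2go layout with placeholder IDs, and the table rows, column names and abundance values are made up for illustration; nothing here is PM's actual schema.

```python
from collections import defaultdict

# Placeholder lines in the interpro2go layout (IDs are NOT real mappings).
SAMPLE_MAPPING = """\
!comment lines start with an exclamation mark
InterPro:IPR999001 placeholder entry A > GO:placeholder term one ; GO:9990001
InterPro:IPR999002 placeholder entry B > GO:placeholder term one ; GO:9990001
InterPro:IPR999002 placeholder entry B > GO:placeholder term two ; GO:9990002
"""

# Hypothetical pre-computed table: one row per (sample, InterPro hit).
ROWS = [
    {"sample": "S1", "interpro": "IPR999001", "abundance": 12,
     "read": "SRR1182511.1003639_HWI-EAS165_0077_FC70822AAXX:1:33:12777:2046_1_1_150_-"},
    {"sample": "S1", "interpro": "IPR999002", "abundance": 3,
     "read": "SRR1182511.1003639_HWI-EAS165_0077_FC70822AAXX:1:33:12777:2046_1_1_150_-"},
]


def parse_interpro2go(text):
    """Return a dict mapping GO ID -> set of InterPro IDs."""
    go_to_ipr = defaultdict(set)
    for line in text.splitlines():
        if not line or line.startswith("!"):
            continue  # skip blank and comment lines
        left, _, right = line.partition(" > ")
        ipr_id = left.split()[0].replace("InterPro:", "")
        go_id = right.rsplit(";", 1)[-1].strip()
        go_to_ipr[go_id].add(ipr_id)
    return go_to_ipr


def purl_to_go_id(purl):
    """http://purl.obolibrary.org/obo/GO_0003677 -> GO:0003677"""
    return purl.rsplit("/", 1)[-1].replace("_", ":")


def read_accession(read_id):
    """The run accession is the part of the read ID before the first dot."""
    return read_id.split(".", 1)[0]


def rows_for_go_purls(purls, go_to_ipr, rows):
    """Select the pre-computed rows whose InterPro ID maps to any of the GO purls."""
    wanted = set()
    for purl in purls:
        wanted |= go_to_ipr.get(purl_to_go_id(purl), set())
    return [row for row in rows if row["interpro"] in wanted]


go_to_ipr = parse_interpro2go(SAMPLE_MAPPING)
hits = rows_for_go_purls(["http://purl.obolibrary.org/obo/GO_9990002"], go_to_ipr, ROWS)
```

With the placeholder data, only the IPR999002 row is returned for the second GO term, and read_accession on its read ID gives SRR1182511.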
Could potentially get the NCBI taxon ontology purls by running everything with centrifuge, which returns NCBI taxon terms such as 9606 Homo sapiens, which should be easily matched to the purl with the same ID, e.g. http://purl.obolibrary.org/obo/NCBITaxon_9606.
Alternatively, if we were to use the EBI's taxonomic pipeline which uses SILVA, we could maybe use this paper about Analysis of 16S rRNA environmental sequences using MEGAN, which created a mapping file that maps SILVA accession numbers to corresponding NCBI taxon IDs. It's pretty old, however (2011), so probably more out of date than centrifuge.
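The taxon ID to purl matching is simple enough to sketch. This assumes the taxon IDs have already been pulled out of the classifier output as integers or digit strings; the function name is just for illustration.

```python
def ncbitaxon_purl(tax_id):
    """Build the OBO purl for an NCBI taxon ID, e.g. 9606 -> .../NCBITaxon_9606."""
    tax_id = int(tax_id)  # accepts 9606 or "9606"; raises ValueError on junk
    if tax_id <= 0:
        raise ValueError("NCBI taxon IDs are positive integers")
    return f"http://purl.obolibrary.org/obo/NCBITaxon_{tax_id}"


purls = [ncbitaxon_purl(t) for t in (9606, "9606")]
```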
Dynamic reports with knitr R tutorial from jcoliver
OBI issue 1051 connected to PMO.
email with scholarship applications
I'm not eligible for the National Science Foundation Graduate Research Fellowships (NSF GRFP) as I have a master's degree.
Possibilities:
The Carson Scholars Program is dedicated to training the next generation of environmental researchers in the art of public communication. In addition to this training, scholars will also receive a $5,000 scholarship for their participation and completion of the program. The program is open to UA students in PhD, JD or terminal master’s creative programs in areas concerning the environment and/or its intersection with social justice. Questions? Please email Liz Marsalla at [email protected].
Deadline: September 27, 2019
Citizenship: Unrestricted
Website: https://carson.arizona.edu/
I'm applying now to try and cover 2nd year summer. I emailed Liz to get the process started.
The American-Australian Education Fund provides scholarship awards of up to $40,000 to support graduate level study or post-doctoral research in two directions: Australian Citizens travelling to the US and US Citizens traveling to Australia. Studying or conducting research within the fields of science, technology, engineering, mathematics, sustainability, medicine and health sciences (scholarships also available in Indigenous Education, the arts, and for Veterans)
Deadline: October 15, 2019
Citizenship: US or Australian
Website: https://www.americanaustralian.org/page/EduGradRounds1and2
Could maybe do something with Simon at CSIRO in the future? 4th year or some post-doc?
The SCGSR program supports supplemental awards to outstanding U.S. graduate students to conduct part of their graduate thesis research at a DOE national laboratory/facility in collaboration with a DOE laboratory scientist for a period of 3 to 12 consecutive months—with the goal of preparing graduate students for scientific and technical careers critically important to the DOE Office of Science mission. Graduate students currently pursuing Ph.D. degrees in areas of physics, chemistry, material sciences, biology (non-medical), mathematics, engineering, computer or computational sciences, or specific areas of environmental sciences that are aligned with the mission of the Office of Science are eligible to apply.
Deadline: November 14, 2019
Citizenship: U.S. citizen or permanent resident
Website: https://science.energy.gov/wdts/scgsr/
Could do this potentially later (like if my 4th year isn't funded) with Chris' group or maybe even Elisha? This was what Elisha had forwarded to me last year. Chris was interested and I said I'd like to but would maybe delay it as I need to be in Tucson taking courses. This could definitely fund my 4th year. Deadline is November this year but last year it was May, so keep an eye on it. Need to be a PhD candidate to apply. So doing this in my 4th year would make sense (assume Bonnie won't have money for me then).
DAAD provides a variety of educational exchange scholarships in Germany. The Research Grant is specifically for PhD students wishing to pursue research in Germany. The Study Scholarship provides resources for independent study in Germany. The University of Arizona has been invited to nominate one graduate student for a DAAD Research Grant. If interested, please contact Shelley Hawthorne Smith ([email protected]).
Deadline: Varies
Citizenship: Unspecified
Website: www.daad.org
Could maybe do this and get back to Bremen if need be (not my first choice), or maybe Pier Bork's group, which could be amazing, if Bonnie can't pay me. I think I remember there being a restriction on not having been in Germany for x years, so have to check if/when I'd be eligible.
Research Grants - One Year Grant (7-12 months): Applicants may not have been living in Germany for more than 15 consecutive months by the time of the application deadline. So I think I'd be ok, unless that means a total of 15 months. Not sure.
Research Grants - Cotutelle Doctoral Programs: Dual supervision in Germany/home country. Could maybe try to do this and also be supervised by Pier or someone back in Germany and get the money/travel grants. Same 15 months thing as above, would need to clarify. I've emailed for clarification. This could be cool, to be officially co-supervised and get DAAD money for it.
Short-Term Research Grants: same eligibility stuff.
The BAEF encourages applications for fellowships for advanced study or research during one academic year, at a Belgian University or institution of higher learning. The B.A.E.F. will award up to ten fellowships as outright non-renewable grants carrying a stipend of $28,000 for Master's or Ph.D. students and $32,000 for Post-doctoral Fellows.
Deadline: October 31, 2019
Citizenship: U.S. citizen or permanent resident
Good to know for post-docs. Don't know anyone currently collaborating in Belgium.
The Foundation supports graduate students working towards the Ph.D. degree in the applied physical, biological and engineering sciences. These fields include applied mathematics, statistics, and quantitative aspects of modern biology. Applicants must be willing to morally commit to make their skills available to the United States in time of national emergency. TROLL of course; worth looking into though, may be eligible.
Deadline: October 23, 2019
Citizenship: U.S. citizen or permanent residents
Website: http://www.hertzfoundation.org
Looks like I'm ineligible as I'm in the 2nd year of my PhD.
Genomic Education Alliance (GEA): students learning about genomics and gene annotation. Contribute manually curated gene models.
Train students and apply bio-inf tools to genomic research.
NCBI BLAST gets easily throttled and can't search the unpublished datasets.
Parameters: 20 students, 5 different searches, want results to come back at approx. the same time. Caching BLAST results.
Training scenario: 100 students concurrently running BLAST.
Use the sequence server code base, in Ruby ... yikes. Have to find out where in their Ruby back-end to play around, change where necessary the call to NCBI's BLAST, and add the necessary steps to make it run more smoothly (scale to the user's specs) or be sent/run somewhere else.
wurmlab/sequenceserver github page and sequenceserver page
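The caching idea for the classroom scenario, sketched in Python rather than in SequenceServer's Ruby: identical (sequence, database, options) submissions share one stored result, so 20 students running the same 5 searches trigger only 5 real BLAST runs. The class and the run_search callable are hypothetical, not part of SequenceServer, and the sketch is not thread-safe as written.

```python
import hashlib


class BlastCache:
    """Cache search results keyed by a hash of the normalized query."""

    def __init__(self, run_search):
        # run_search(sequence, database, options) does the real (slow) search;
        # here it is an injected callable so the sketch runs without BLAST.
        self._run_search = run_search
        self._results = {}

    @staticmethod
    def key(sequence, database, options=""):
        # Strip whitespace and case so trivially different pastes hit the same key.
        normalized = "".join(sequence.split()).upper()
        payload = "\n".join([normalized, database, options])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def search(self, sequence, database, options=""):
        k = self.key(sequence, database, options)
        if k not in self._results:
            self._results[k] = self._run_search(sequence, database, options)
        return self._results[k]


calls = []


def fake_search(sequence, database, options):
    """Stand-in for the real BLAST call; just records that it was invoked."""
    calls.append((sequence, database))
    return f"hits for {len(sequence)} residues in {database}"


cache = BlastCache(fake_search)
first = cache.search("acgt acgt", "demo_db")
second = cache.search("ACGTACGT", "demo_db")  # same query after normalization
```

A real deployment would also need locking (or in-flight request sharing) so that 100 concurrent identical requests wait on one running job instead of each starting their own.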
A PMO issue prevents me from making releases. I've preliminarily dealt with OBI:comment on investigation in this issue, but we're waiting for that to get released. During the Make process for ontologies created by @cmungall's ontology-development-kit (such as PMO), it runs the line:
robot convert -I http://purl.obolibrary.org/obo/obi.owl -o mirror/obi.owl
pulling from the latest. As the changes haven't been released yet (I did check they are in the OBI editors version), ENVO will keep pulling the latest OBI, which PMO then pulls from, adding in that unsatisfiable equivalence axiom of OBI:comment on investigation.
Similarly, OBI is injecting the subclass of disposition axiom into PATO fluorescence, which is causing more reasoning errors for its subclasses. When you just manually delete the problematic axioms from PMO edit, they'll get re-imported during the make process, i.e. calling ./run.sh make prepare_release.
What I did in the past to circumvent this was to comment out the following lines when running the make file:
#imports/%_import.owl: mirror/%.owl imports/%_terms_combined.txt
annotate --ontology-iri $(ONTBASE)/$@ --version-iri $(ONTBASE)/releases/$(TODAY)/$@ --output $@
#.PRECIOUS: imports/%_import.owl
Removing the axiom from OBI:comment on investigation, commenting out those lines and running ./run.sh make all_imports_owl (that just gives make: Nothing to be done for 'all_imports_owl'.)
Doing this and running ./run.sh make prepare_release: success, the comment on investigation issue is resolved!!!
Plan of action, keep the imports commented out for now, as these OBI injected axioms are just causing trouble for now. Will uncomment later once the comment and fluorescence issues are resolved in the release versions. For now I can delete the problematic axioms and just stop re-importing them. No better yet:
Plan of action. Every time I want to make a new PMO release I run ./run.sh make all_imports_owl, then comment out the lines from the make file, then manually delete the problematic axioms (the equivalence axiom in OBI:comment on investigation and the subclass of disposition axiom on PATO fluorescence), then run ./run.sh make prepare_release.
Not an ideal solution, but it'll work for now and allow me to keep making releases while waiting for the OBI issues to sort out. Can track the status of the PATO fluorescence issue here.
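The "comment out the lines from the make file" step could be scripted so it is repeatable per release. A minimal sketch, assuming the import rule looks like an ordinary pattern rule with tab-indented recipe lines; the sample Makefile text is a made-up stand-in, not the real ODK Makefile, and the function name is hypothetical.

```python
def comment_out_rule(makefile_text, target):
    """Comment out the rule for `target`, its recipe lines, and its .PRECIOUS line."""
    out = []
    in_rule = False
    for line in makefile_text.splitlines():
        if line.startswith(target + ":"):
            in_rule = True
            out.append("#" + line)
        elif in_rule and line.startswith(("\t", " ")):
            # recipe lines and backslash-continued lines of the rule
            out.append("#" + line)
        elif line.startswith(".PRECIOUS:") and target in line:
            in_rule = False
            out.append("#" + line)
        else:
            in_rule = False
            out.append(line)
    return "\n".join(out) + "\n"


# Made-up stand-in for the relevant part of the Makefile.
SAMPLE = (
    "mirror/%.owl: other\n"
    "\techo mirror\n"
    "imports/%_import.owl: mirror/%.owl imports/%_terms_combined.txt\n"
    "\techo first step \\\n"
    "    echo continued\n"
    ".PRECIOUS: imports/%_import.owl\n"
    "release: imports\n"
    "\techo release\n"
)

patched = comment_out_rule(SAMPLE, "imports/%_import.owl")
```

Only the import rule and its .PRECIOUS line get a leading #; the other rules are left untouched, so the result can be written to a temporary copy of the Makefile before running prepare_release.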
INFO class:
get a chameleon cloud account
ingress egress (in out on the cloud)
ipv4 is running out of space and is mostly privately owned.
OpenRefine: neat Google application for cleaning up messy data.
Working on Rachel Carson Scholarship application
At the Digital Scholarship & Data Science Fellowship information session. Jeff Oliver, I think it's this guy. Training in pedagogy (learning and teaching).
Goals
- pedagogical best practices how ppl teach and learn.
- workshop series which fellows create and run.
- create open source material for others to learn.
Library staff not providing technical resources for the scientific problem, but will give workshops on pedagogy.
Application Due Oct 15th: statement of intent: skills to pursue, application to my research, interest in broader community and what perspectives I'd bring to the program.
Only 1 hour weekly Friday (time TBD) meetings required; the other 3 hours are self-directed learning (should double dip with thesis objectives). 4 pedagogy workshops guaranteed. Rest of sessions are check-ins, then in fall we give a workshop.
Cohort of 5 fellows.
INFO 523 github page for Matt M's and my hw.
Bonnie wants an agenda 24 hr before 1:1 and PM meetings. Wants more todos on Basecamp. She will post a grad template for graduation, including example proposals. For the proposal oral exam: only a 15 minute presentation, but it lasts an hour with questions and interruptions. In hour 2 they can ask anything they want, broadly; answer any number of questions. Metagenomics, contigs, whatever they want. How are communities assembled, etc.
For BE_529:
trying to launch Ubuntu 18_04 NoGUI NoDesktop Base from CyVerse Atmosphere: it worked, takes a bit to deploy.
LOG IN TO VM (From Command line):
ssh (cyverse username)@(ip of vm)
sudo apt-get install gem
sudo apt-get install ruby-full
install sequenceserver:
sudo gem install sequenceserver
------------------------------------------------------------------------
Thank you for installing SequenceServer :)!
To launch SequenceServer execute 'sequenceserver' from command line.
$ sequenceserver
Visit http://sequenceserver.com for more.
------------------------------------------------------------------------
press enter
mkdir blastdb
pwd gives /home/kblumberg/blastdb
found an example fasta from here: http://prodata.swmed.edu/promals/info/fasta_format_file_example.htm
From Sateesh:
so once you launch seqserver.....ipaddress:port# should get u the interface
Navigate to: 128.196.142.140:4567
(while seq serve is running on that), and it works!
suspend the image when finished with it.
Continuing the BE_529 hw midterm project
get a blast DB from https://ftp.ncbi.nlm.nih.gov/blast/db/ or ftp://ftp.ncbi.nlm.nih.gov/blast/db/, coordinate with the other groups, and set this as the blast db for seq serve. Then go get some fasta files and trim them to the benchmark sizes.
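Trimming a FASTA file to a fixed number of records can be sketched like this. The benchmark sizes themselves aren't decided here, so the record count is a placeholder argument, and the function name is just for illustration.

```python
def trim_fasta(lines, max_records):
    """Keep only the first `max_records` FASTA records from an iterable of lines."""
    kept = []
    seen = 0
    for line in lines:
        if line.startswith(">"):
            seen += 1
            if seen > max_records:
                break  # stop reading once the limit is passed
        if seen >= 1:
            kept.append(line)  # header or sequence line of a kept record
    return kept


example = [">s1\n", "ACGT\n", ">s2\n", "GG\n", "CC\n", ">s3\n", "TT\n"]
first_two = trim_fasta(example, 2)
```

Because it takes any iterable of lines, it can be applied to an open file handle, writing the result with writelines, without loading the full file into memory first.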
From Chris Mungall invitation to OBO-Tools mailing list
Talk (Karthik Ram, UC Berkeley) on getting Dryad datasets linked to computational resources. my binder.com (something like that), R open Sci (something like this).
Talk by Ashley Asmus https://nutnet.umn.edu/, suggested this to Alise.
Met/discussed with (look into) Hilmar Lapp, who does some ontology work.
Talk by Danie Kincade (operational co-PI with Adam Shepard), BCO-DMO (NSF arranged marriage). The heavy lift for them is I (Interoperability, from FAIR). Researchers can find the big picture of data, maybe link to NCBI. They take data in proprietary formats but output c/tsv. PAR: public access to research.
http://environmentontology.org/annotation-guidelines
NMDC Ontology Workshop Agenda (final) Oct 2019
NMDC video from Pier explains envo being a machine readable semantic resource quite nicely.
NMDC google drive see Chris' Mungall-NMDC slide deck.
My ongoing EBI_MGnify_Biome_ENVO drive link.
New link NMDC_workshop_ENVO_MIXS_GOLD_environmental_examples for the example 3-slot MIXS annotation of select ENVO terms.
More maybe relevant links tax_e page with links to issues I've encountered, termite gut envo issue, gold page
DataONE Make Data Count: track citations for data publications
http://bit.ly/NMDCwrkshp1_agenda
http://bit.ly/NMDCwrkshp1_notes
every 4th monday GSC meetings hosted by Ramona Walls
Lorna Richardson EBI MGNIFY contact (check out new paper on newest version of tools use silva remapped to NCBItaxID)
get the contact from gentleman from GOLD. [email protected]
Qiita (Knight lab): neat tool
MicrobiomeDB Jie Zheng.
Part 2: Ontology usage in MIxS standards (facilitator guide)
GenomicsStandardsConsortium/mixs-legacy
Meeting with EBI/JGI folks. Action items for the EBI/JGI project: make a SKOS-type representation of the spreadsheets to make them computable and see how that renders; email everyone to coordinate.
TODO:
follow up emails:
- email Adam Arkin/Chris/Elisha...everyone about DOE funding opportunity. EMAIL SENT
- email EBI/JGI folks Lorna and Reddi [email protected]. EMAIL SENT
- Get Jie Zheng's email: they have a profile; tried their email contact system but not sure if it worked, follow up.
Hello Dr. Zheng,
It's Kai from ENVO, thank you for the feedback these last couple days, I would like to stay in touch in order for us to collaborate on ENVO and OBI. You had mentioned that OBI needs some in-door environments, materials or similar. I had also mentioned that I would like to make some OBI contributions, especially regarding the measurement device hierarchy.
You had also mentioned you have a publication which talks about managing imports within an application ontology. Would you mind sending me the title of this publication?
Thanks again for your suggestions and advice! Looking forward to collaborating further.
Cheers,
Kai
- Lynn Schriml [email protected] EMAIL SENT
MIxS ontology idea: making purls, or importing relevant ontology branches, to have MIxS fully semanticized. Follow up. Also ask her the history of how they got to the MIxS ENVO triad for annotation; it will help with the NMDC GOLD paper (the knowledge representation issue of how best to represent such data).
OBOE: The Extensible Observation Ontology
ACIC class:
From Tanner
submit makeflow jobs: the seq serve github page has the 2.0 beta, which has a native pooling and queuing process. Instead of queuing blast, queue the makeflow command and let the sequence serve queuing system handle queuing of jobs.
change the sequence serve code to make makeflow inside the source directory, then ... some path; there's something like pool.rb. Find where the job is added to the pool; instead of the command, add makeflow to the pool. See the job.rb @command method; elsewhere in the pooling, send the command to makeflow, then send makeflow to the pool.
Don't have to install blast on worker nodes, but maybe hard code the path to blast. Don't tell work queue about the DBs: work queue wants to work on relative paths, so don't specify db files as input to work queue; the DBs have to be in the same place on all machines. The command submitted to work queue has to be path strict, with the i/o files given without a path. Where does work queue expect stuff to be?
From Sateesh
For the work queue script, Sateesh mentioned building it off the script covered in class: the tasks and queue.submit parts can be taken from the class material and modified for what we need.
Come up with a way of starting the work queue master when seq serve starts, to keep the work queue master running. The way the work queue master is set up, that script has to submit the tasks.
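To make the path advice concrete, here is a sketch that writes one Makeflow rule per query file: the query and result are bare file names (declared as the rule's source and target so they get shipped to workers), while the BLAST binary and database are absolute paths that are deliberately not declared as inputs. The paths, file names and the choice of blastn with tabular output are assumptions for illustration, not the project's actual setup.

```python
def makeflow_rule(query_file, blast_bin, db_path):
    """Return one Makeflow rule: 'target: source' then a tab-indented command."""
    out_file = query_file.rsplit(".", 1)[0] + ".tsv"
    command = (
        f"{blast_bin} -query {query_file} -db {db_path} "
        f"-outfmt 6 -out {out_file}"
    )
    # Only the query file is listed as a source; the DB is assumed to sit at
    # the same absolute path on every worker, so it is not shipped around.
    return f"{out_file}: {query_file}\n\t{command}\n"


def makeflow_text(query_files, blast_bin, db_path):
    """Join the rules for all query files into one Makeflow file body."""
    return "\n".join(makeflow_rule(q, blast_bin, db_path) for q in query_files)


# Hypothetical paths and file names.
text = makeflow_text(["q1.fa", "q2.fa"], "/usr/bin/blastn", "/data/blastdb/demo")
```

The resulting text can be written to a file and handed to makeflow, which is the command that would then be added to the SequenceServer pool instead of the plain BLAST command.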
ACIC Midterm Project Collab google doc
pbs_submit_workers, Tutorial: Building Scalable Applications with Makeflow and Work Queue, workqueue-tutorial and makeflow-tutorial, workqueue manual
EBI pipelines:
geopackage and OGC GeoPackage Encoding Standard and wiki, which are similar to the OBO frictionless idea, need to show how it's different. Having the multiple csvs inside the json object...? Different semantic layers. Theirs uploads into an sql database.
BE 529 Final: maricopa crops mentioned crop ontology, and I think Agronomy Ontology, could add making an application ontology importing from AGRO/ENVO/Plant Trait Ontology(TO) for their metadata as a supplemental to the project.
Discussion with Kitt and Dava:
section 7.2 of grad manual confused me with at least three faculty members who represent the major subject area and one or more faculty members who represent the minor subject area.
They clarified only 2 of them need to be BE professors 1 can be some other dept but relevant to the work such as Dr. Cui. Plus minor advisor Dr. Morrison.
TA positions:
- 0.25 FTE TA opening next year (Fall 2020/Spring 2021) for BE 170A1. They prefer to have a TA who is committed to doing both semesters (not just one or the other). Backup if DOE internship isn't funded.
- MIC 205A - General Microbiology runs Spring, Summer and Fall semesters and is always looking for TAs; I would be qualified with my master's. Patricia Stock is the ACBS director, which this class falls under (MIC in CALS).
- 3rd potential option suggested by Dr. Carini: talk to CALS dept head David Baltrus; there's an undergrad genetics class which usually needs more TAs.
Potential elective credits: BE 592 Directed Research or BE 593 Internship; can be 1-6 units. The latter's description: Specialized work on an individual basis, consisting of training and practice in actual service in a technical, business, or governmental establishment.
Could use this as last elective credits while doing the DOE internship.
Meeting with Ramona: tuition after quals?
CGIR MIAPPE (minimum information about a plant phenotyping experiment): ask Ramona; similar to the 3-part idea, get as citation for the concept.
try to import BCO, let Ramona know if it doesn't work. I need classes like observing process.
look into data properties
min/max depth in meters from Darwin core, can get it from BCO.
Meeting with Ramona and Ontology reading group: metagenomics microbial community omics review paper
MIxS (pronounce mix S) metadata properties (like darwin core) roots in sequence of microbes and metagenomes.
OBA ontology for biological attributes derived from UBERON for metazoa. TO is Plant traits. Maybe read a paper on this.
GOcams next step in GO, graph of GO classes and associations. Ramona will send link. Read paper on this.
semantic similarity done originally with GO. Touch on this as well. Paper?
Design patterns from Chris. Read some of these.
Discussion of BCO PAPER
- observation: just information artifact
- specimen collection -> generate material thing.
Darwin semantic web: how the classes in Darwin Core relate, but it doesn't relate to other ontologies, hence linked it to BFO for BCO. Research Coordination Network (RCN) to coordinate Darwin Core and GSC. ABCD and Darwin Core are both TDWG standards.
Ramona needs a more updated BCO paper on how to use it to model data. PPO (plant phenology ontology) pipeline to make graph db. Integrate biodiversity data at scale. Futures: vertebrate traits.
Ramona built Identifier services: proof of concept managing distributed data using semantics. Built a simple data model: process, input or output, material entity or data; can describe any biological research project. Process is the central key. If the org structure is the process workflow, it gives a generalizable data model.
phenopackets paper: similar idea to datapackages, data about phenotypes, exchange info as a packet; they infer facts.
VertNet: aggregator of vertebrate data, sends to GBIF; Darwin Core archive, XML style directory with csv.
OGC GeoPackages, netCDF, phenopackets. Marine Laporte: plant traits (need to have name of trait, value, method, unit); my idea is like that but with 3 for environment.
The proposal with Chris would presumably focus on bringing the FAIR principles, specifically Interoperability, to action for 'omics metadata within the context of the NMDC AIM1: Designing NMDC-compliant metadata standards, leveraging existing ontology mapping software and curation resources to enable automated annotation of standardized metadata. Objectives (in progress, feel free to interject) would include:
- Harmonizing the GOLD/EBI-MGnify hierarchical controlled vocabulary with a MIxS style ENVO representation.
- Working toward the creation of an RDF version of MIxS by extending ENVO/OBO ontologies as necessary to provide IRIs for MIxS fields, as well as creating ENVO subsets for individual packages and bringing in other controlled vocabularies as necessary.
- Finalizing my work on the Planet Microbe (PM) project, an experimental cyberinfrastructure system which uses ENVO and other OBO ontologies to standardize marine 'omics project metadata. PM currently enables semantic search on metadata, and ongoing work involves leveraging knowledge contained within genomic ontologies (such as GO and NCBITaxon) to guide users to specific information relevant to their biological questions. Planet Microbe could serve as a model for how future systems (NMDC or other) could enable deeper insights to be derived from unified meta 'omics data, with a future system leveraging the semantic harmonization from objectives 1 and 2.
2020 Community Fellow Telecon Notes
register for full esip meeting
Participate in Community Coins Program slide about ESIP CF program in regular talk university.
ACIC class final project github repo
1:1 with Bonnie:
Beginning of semester try to get a committee meeting.
Get proposal ready march 1st, also try to get individual time with them to ask questions.
ready to start written exam try to schedule date
committee will give ok, give them draft they have 2 weeks to review
have 2 weeks to address comments/make revisions
resubmit final proposal which gets graded
committee get 2 more weeks reviewing final copy and they make decision on written exam
Once passed written, can request dates for oral exam. usually 1-2 weeks after written.
Schedule 3 hour time window for exam (usually 2 hours)
Potential questions from Dr. U'Ren: Moran's I?, equations for stats techniques
Bonnie will ask q's on metagenomics, metatranscriptomes, role of semantics
Dec.06
CHEBI help email: [email protected]
Dec.12
terms missing from CHEBI preventing us from having envo/pmo purls for TARA:
Phaeophorbide a
Prasinoxanthin
Diadinoxanthin
Phaeophytin a
Diatoxanthin
Chlorophyll c3
Dec.13
library for doing CLI operations at the RDF level: https://graphy.link/cli
Jan.03
Have a chapter of my thesis be the kmer-based retrieval issue. Compromise, as Bonnie wants me to do it but I don't have funding from her to do it, so I'll have it be a small chapter in my thesis. What I ran this summer is an ok start, but I was doing it on the login nodes of TACC; instead I should either properly do it on TACC, or perhaps on a CyVerse Atmosphere Ubuntu cloud image. The last version used various different parameter sizes which I didn't select for as systematically as I should have. Next time just use the default settings and vary the kmer sketch size. I prepared the majority of the pipeline steps as individual components without stitching them together. The next version should be a solid product start to finish. A single github repo with everything required: the data/metadata file with the accessions to get plus their labels; scripts to parse that file and download/unpack everything, install simka/mash/centrifuge (whichever I use), then run the jobs with the various kmer sizes, writing out the outputs; then a section to ingest/analyze the data (compute the triangular matrix), then get the retrieval times and accuracies. Maybe also some runtime, mem usage etc., like what Tanner did in the CI class. Finally (at the beginning), documentation on the type of CyVerse cloud used and the install notes, so that the whole thing could easily be reproduced by launching the CyVerse/Jetstream/whatever cloud and just ./run the whole process.
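A toy sketch of the analysis end of that pipeline (triangular distance matrix, then retrieval accuracy). It uses exact Jaccard distance over full k-mer sets as a stand-in for the sketch-based distances that mash or simka would output, and the sequences and labels are made up.

```python
from itertools import combinations


def kmers(seq, k):
    """Set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|, with two empty sets treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


def triangular_matrix(samples, k):
    """Distances for each unordered pair, keyed by the sorted (name, name) tuple."""
    sets = {name: kmers(seq, k) for name, seq in samples.items()}
    return {
        (a, b): jaccard_distance(sets[a], sets[b])
        for a, b in combinations(sorted(sets), 2)
    }


def top1_accuracy(distances, labels):
    """Fraction of samples whose nearest other sample shares their label."""
    names = sorted(labels)
    hits = 0
    for name in names:
        others = [o for o in names if o != name]
        nearest = min(others, key=lambda o: distances[tuple(sorted((name, o)))])
        hits += labels[nearest] == labels[name]
    return hits / len(names)


# Made-up toy data: two "environments" with two samples each.
SAMPLES = {"a1": "ACGTACGTAC", "a2": "ACGTACGTTC", "b1": "TTGGCCAATT", "b2": "TTGGCCAATG"}
LABELS = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

dist = triangular_matrix(SAMPLES, k=4)
accuracy = top1_accuracy(dist, LABELS)
```

In the real pipeline the distances would be read from the tool output for each kmer or sketch size, and the same accuracy function could be run once per setting alongside the recorded retrieval times.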
ESIP:
schedule, CF schedule, WM20 ESIP Meeting Take Aways
WM20 ESIP Meeting Community Fellow Guidance
Plenary thoughts:
Nadine Alameh: developers moving from XML to JSON, to more general APIs rather than just geospatial apps. Having to deal with legacy systems while technology rapidly changes: old aviation tracking systems for 1000s of airplanes vs the new systems for tracking millions of drones. Sounds like they just built the new one from scratch.
Checked out OGC GeoPackages again, 25 years old now. Data deploys into an SQL-type database. See quickstart here and specs.
Paco Nathan: Evidence based decision and policy making -> research data centres. Knowledge graph of metadata about dataset usage. For a given dataset: who's used it, what methods did they use, what results did they get? What is the impact of datasets on policy? Trying to infer metadata, then give it back to researchers (via ResearchGate) to validate. JupyterLab Metadata Explorer to browse knowledge graphs, check this out. AI optimizing gradients. Gap between haves and have-nots: companies not investing in AI. Math is funky: not always finding local minima using deep learning methods (strange). Results not always reproducible (maybe hitting diminishing returns). Testing and training large models have large carbon footprints. Semi-supervised and active learning. Many companies making knowledge graphs about dataset usage. Now hardware is in a renaissance, moving faster than software (faster than process). Evolution of cloud patterns (paper to read) from Berkeley.
"Earth observations are no longer the purview of governments": meant in a positive way, others are getting into the space, not just that neoliberalism has won.
FAIR Metadata Recommendations session
Report on work doing FAIR in: DataCite, ISO, EML. Checks and implementation in the dialects. Google doc for notes: , git repo: NCEAS metadig checks. Fundamental characteristics: what's needed to make data FAIR. Suite of FAIR checks: atomic checks to see if metadata elements are present or are sufficient. DataONE: 45 data repos, ~1 million datasets.
Reusability is typically low throughout the data repositories.
NCEAS is continuing to work on pinning down what are the fundamental characteristics for FAIR data.
Working toward DataOne plus based on community agreed set of fields.
Ran into Joan from ESS Dive, which is a DOE project. Perhaps making terms for this would be a good use-case to apply for the DOE fellowship with Chris.
From Stephen: Scott Peckham's Geoscience Ontology - Scientific Variables Ontology. SETLr: the Semantic Extract, Transform, and Load-r; like the design pattern to make semantic web/OWL from a JSON file. Analogous to the yaml design patterns.
From Chuck: R2RML for relational database tables; maps a relational table to URIs. CSVW: in theory frictionless data is a fork of, or draws from, this.
ESIP Geoscience Community Ontology Engineering Workshop (GCOEW) session:
Open forum for semantic harmonization. Identify future semantic harmonization activities. Open earth and spaces sciences similar to OBO foundry for earth systems.
Harmonizing SWEET with ENVO.
At AGU they added a convention for changes to ontologies. Class level annotation convention. Can now get textual defs from DBpedia for SWEET terms.
Aligning with W3C Time and PROV-O as well as the NERC vocab server are important future goals. Could be what I work on with Simon if I can get the Australian fellowship. PROV-O: @lewis thinks Simon is the one to lead that.
https://github.com/ESIPFed/sweet/wiki/SWEET-Class-Annotation-Convention
Jan.09
g2p2pop Research Coordination Network: can get $3000 to do an exchange between labs. Apply for this if going to BCO-DMO/NCEAS.
Global Change Master Directory (GCMD), Common Metadata Repository (CMR) | Earthdata