Kai Blumberg edited this page Jul 6, 2020 · 4 revisions

Links

SRC Variables

SRC-data-supplement drive folder

UA-SRC-data, srpdio

SRC Data Suppl. Meeting Notes

Gardenroots drive folder

Objectives

  1. Add SRC-relevant concentration classes into ENVO.

May also include some other medical terms following the same design pattern.

  2. Build up the SRC application ontology.

It will draw terms from other ontologies such as SDGIO to add resiliency, vulnerability, number of households with cellphone service, number of households with medical insurance, etc. See the Ramirez-IUC-DataScience-attributes spreadsheet for the terms needed.

  3. Work on BCO and add terms there.

Can maybe add some of the BCO-relevant PMO terms. Possibly a measure relating to enumeration over the aggregation-of-organisms hierarchy, maybe with a template pattern?

  4. Work on the review manuscript.

  5. Help with MIxS RDF.

Datasets/projects

Internal Use Case

  • Monica's data: (Make sure to get links/find it from Ken)

    • Gardenroots data also has metal concentrations in plants, (garden) soil, and (tap?) water. Gardenroots drive folder

      • Soil
      • Water
      • Plant
      • DUST: Ramona said I can ignore this. @Monica is that true?
      • TOASK Ken/Monica: Where is this in CyVerse or in the drive folder?
      • TODO KAI: make sheets in SRC Variables
    • EJ Screen

    • Census data

      • ACS data: Ken, Dorsey, and I all agreed to look at it to figure out which census variables to get. I'll help Dorsey learn some basic R so we can use the tidycensus package to explore the data.

      • tidycensus, example

      • SRC Variables: Vulnerability and resilience sheet

      • Can explore using terms from SDGIO, but it doesn't seem to have what we need, so we would need to add them. For the sake of time I'll start by adding the terms to srpdio. Ramona says it is OK to make them in SRPDIO for now.

    • USGS water data

      • Metal concentrations in water, for Monica, separate from the Gardenroots data (should be the same metals, maybe in another material). @Monica is this correct?
      • TOASK: Where is the CyVerse path, and the header info for these?
  • Priyanka's data:

    • Greenhouse Plant data

      • Data for plants in mine tailings with/without compost. Has bulk tissue samples of plant roots, (vascular) leaves, and shoots (Ramona to add). for
    • Greenhouse Soil data

      • Has metal concentrations in plain compost; might also have them in mine tailings/compost.
    • Also has RNA-seq gene expression data; might not map it to ontology terms.

External Use Case

  • Colorado School of Mines (CSM)
    • Chemical concentration data
      • Cyverse: /iplant/home/rwalls/ua-src-data/csm/water_chem/data
      • SRC Variables: csm_freshwater_chemistry sheet
    • Benthic organismal abundance data (taxa counts)
      • For now, create these taxa classes in SRPDIO with a design pattern using something like the BCO taxonomic inventory process and NCBITaxon PURLs. If successful, could later add it to a BCO module. Could use a design pattern similar to http://purl.obolibrary.org/obo/IDOMAL_0000428
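A minimal sketch of how template rows for such taxon abundance classes might be generated. The only fixed convention used here is the OBO PURL form for NCBITaxon terms; the `SRPDIO:` identifiers, the column names, and the "abundance of ..." label pattern are hypothetical placeholders, and the taxon IDs should be checked against NCBI Taxonomy before use.

```python
import csv
import io

OBO_BASE = "http://purl.obolibrary.org/obo/"


def ncbitaxon_purl(taxon_id: int) -> str:
    """Build the OBO PURL for an NCBI Taxonomy ID."""
    return f"{OBO_BASE}NCBITaxon_{taxon_id}"


def abundance_class_rows(taxa, id_prefix="SRPDIO", start=1):
    """Return one template row per (name, taxon_id) pair.

    The ID scheme and label pattern are assumptions for illustration only.
    """
    rows = []
    for offset, (name, taxon_id) in enumerate(taxa):
        rows.append({
            "id": f"{id_prefix}:{start + offset:07d}",   # hypothetical ID
            "label": f"abundance of {name}",             # hypothetical label
            "taxon": ncbitaxon_purl(taxon_id),
        })
    return rows


def rows_to_csv(rows) -> str:
    """Serialize the rows as CSV text suitable for a template file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["id", "label", "taxon"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Example taxa; verify IDs against NCBI Taxonomy before relying on them.
example_taxa = [("Insecta", 50557), ("Mollusca", 6447)]
example_rows = abundance_class_rows(example_taxa)
example_csv = rows_to_csv(example_rows)
```

Keeping the rows as plain dictionaries makes it easy to adjust the columns later if the pattern moves into a BCO module.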

geome

https://geome-db.org/

Examples from the FuTRES project:

  • Template that gets uploaded to geome and contains the Darwin Core (DC) fields one would specify as relevant/mandatory for a given project. People can then download it and use its required fields to map their dataset to these DC terms.

    • Meghan also sent me FuTRES Sample Project.xlsx, which is what users can download from geome; it contains the whole desired template.
  • fovt-data-mapping example projects, where datasets are individually mapped to the DC and ontology terms. This should produce mapping files like the Samples_output.csv file Meghan sent me.

    • An example script where Meghan preprocesses and maps several datasets. Steps: 1) Data cleaning, 2) Match column names to (DC) template, 3) Match traits to ontology, 4) Create long version (of dataset as dataframe), 5) Generate unique identifiers, 6) Generate new dataset (new clean dataset or dataset mapping?).

    • Also see the Mapping Files directory, which has template_mapping.csv and ontology_codeBook.csv; I guess these map the datasets to the DC terms and ontology terms respectively.

  • fovt (FuTRES ontology) has all the terms needed and gets uploaded by geome.
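The preprocessing steps listed for the example script (clean, match columns to the DC template, match traits to the ontology, reshape to long form, generate identifiers) can be sketched roughly as follows. This is not Meghan's script: the two dictionaries stand in for template_mapping.csv and ontology_codeBook.csv, and their contents, the input column names, and the choice of the Darwin Core terms `measurementType`, `measurementValue`, and `measurementID` for the long form are all assumptions for illustration.

```python
import uuid

# Hypothetical stand-in for template_mapping.csv: dataset column -> DC term.
TEMPLATE_MAPPING = {"species": "scientificName", "site": "locality"}

# Hypothetical stand-in for ontology_codeBook.csv: trait column -> ontology
# label (a real codebook would more likely hold fovt term identifiers).
ONTOLOGY_CODEBOOK = {"body_mass_g": "body mass", "femur_len_mm": "femur length"}


def clean(record):
    """Step 1: strip whitespace and drop empty or missing values."""
    return {
        key.strip(): value.strip()
        for key, value in record.items()
        if value is not None and value.strip() != ""
    }


def to_template(record):
    """Step 2: rename dataset columns to their template (DC) names."""
    return {TEMPLATE_MAPPING.get(key, key): value for key, value in record.items()}


def to_long(record):
    """Steps 3-5: one output row per trait, each with a unique identifier."""
    shared = {k: v for k, v in record.items() if k not in ONTOLOGY_CODEBOOK}
    rows = []
    for column, term in ONTOLOGY_CODEBOOK.items():
        if column in record:
            rows.append({
                **shared,
                "measurementType": term,
                "measurementValue": record[column],
                "measurementID": str(uuid.uuid4()),
            })
    return rows


def map_dataset(records):
    """Step 6: build the new long-form dataset from raw records."""
    long_rows = []
    for record in records:
        long_rows.extend(to_long(to_template(clean(record))))
    return long_rows


raw = [{" species ": "Puma concolor ", "site": "AZ",
        "body_mass_g": "52000", "femur_len_mm": ""}]
mapped = map_dataset(raw)
```

In this sketch an empty trait value is dropped during cleaning, so it produces no long-form row; whether that is the desired behavior depends on how missing measurements should be reported.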
