New Task: Connect ATN and MBON into IOOS DMAC #52

Open
2 tasks
MathewBiddle opened this issue May 1, 2024 · 5 comments
Labels: ATN (Issues relating to the Animal Telemetry Network), enhancement (New feature or request), MBON (Issues relating to the Marine Biodiversity Observation Network)

Comments

@MathewBiddle
Contributor

Who is requesting this?

@ioos/marine-life

What is being requested?

Connect ATN and MBON into IOOS DMAC. Coordinate with IOOS Catalog developers (POC: @mwengren) on how ATN and/or MBON portals could be harvested for data.ioos.us. Guidance for the process to add records is documented at https://ioos.github.io/catalog/

What is the requested deadline and why?

No response

What is the current status quo (i.e., what happens if this does not get done)?

ATN and MBON datasets won't show up in data.ioos.us.
Marine Life will not meet the IOOS DMAC requirement of being discoverable in data.ioos.us.

What indicates this is done (i.e., how do we know this is complete)?

  • ATN catalog is discoverable in data.ioos.us
  • MBON catalog is discoverable in data.ioos.us
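
One way to check the two completion criteria above is to query the Catalog's search API. The following is a minimal sketch, assuming data.ioos.us exposes the standard CKAN action API (`package_search`) at the path shown; the base URL and the search terms are assumptions, not something confirmed in this issue.

```python
from urllib.parse import urlencode

# Assumption: the IOOS Catalog exposes the standard CKAN action API at this URL.
CATALOG_API = "https://data.ioos.us/api/3/action/package_search"


def catalog_search_url(term: str, rows: int = 10) -> str:
    """Build a CKAN package_search URL for a free-text search term."""
    return f"{CATALOG_API}?{urlencode({'q': term, 'rows': rows})}"


def is_discoverable(response: dict) -> bool:
    """Return True if a parsed package_search response reports at least one match."""
    succeeded = bool(response.get("success"))
    count = response.get("result", {}).get("count", 0)
    return succeeded and count > 0
```

Fetching `catalog_search_url("MBON")` with any HTTP client and passing the parsed JSON to `is_discoverable` would give a rough yes/no answer; a real check would likely filter on organization or tags rather than free text.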

Provide a description or any other important information.

xref:

@mwengren
Member

mwengren commented Jun 5, 2024

Copying my comments from ioos/ckanext-ioos-theme#237 (comment) below:

AFAIK IOOS is required to furnish ISO XML metadata (or perhaps DCAT JSON, not 100% sure on that alternative) to NOAA for inclusion in NOAA's enterprise data inventories for all of our publicly-available data/services.

For all of IOOS' non-bio data, it's been fairly straightforward to do this, as most of the software we use has been developed to be able to output an ISO XML metadata representation of the datasets it serves. Since that isn't the case for OBIS, MBON, or ATN (I believe), that's something we'll need to address both for including those data in IOOS Catalog given its current capabilities, and for sending up the chain to NOAA to meet requirements.

It may be that leveraging IOOS Catalog and converting the various bio data formats to ISO XML format isn't the best approach to meeting NOAA data inventory requirements. If there are better, simpler ways to furnish these metadata to NOAA that I'm not aware of, we should consider those options. Catalog has been our solution to date, but primarily because of the pre-existing metadata format support and compatibility.

Ideally, we can have a comprehensive inventory of 100% of IOOS' data in Catalog, and I think we should still aim for that goal, but we need to understand better what the challenges for that might be wrt ATN, MBON, or other bio/Marine Life data.

@MathewBiddle
Contributor Author

Thanks @mwengren.

For ATN, at some point, we hope to add non-embargoed data to an ATN ERDDAP which could be an easy pathway for that observing method. See #44
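
Part of why ERDDAP is an easy pathway is that it can generate an ISO 19115 XML record for each dataset, which a catalog can then harvest. Below is a minimal sketch of building that record's URL, assuming the server has ISO 19115 generation enabled and uses ERDDAP's usual `metadata/iso19115/xml/` layout; the server URL and dataset ID are hypothetical placeholders.

```python
def erddap_iso_url(erddap_base: str, dataset_id: str) -> str:
    """Return the URL of the ISO 19115 XML record ERDDAP generates for a dataset.

    erddap_base is the server root ending in /erddap (hypothetical here);
    dataset_id is the ERDDAP datasetID (also hypothetical).
    """
    base = erddap_base.rstrip("/")
    return f"{base}/metadata/iso19115/xml/{dataset_id}_iso19115.xml"


# Hypothetical server and dataset, for illustration only.
print(erddap_iso_url("https://erddap.example.org/erddap", "atn_example_deployment"))
```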

For MBON, we are encouraging the MBON projects to work with RAs to host the raw data on an RA ERDDAP (or other web service as applicable). Most of the RA ERDDAPs are already being harvested, hence the push for that collaboration. Below is an example:

Another wrinkle in the whole pipeline is that OBIS-USA is being archived at NCEI on a quarterly basis. Part of our guidance is to submit data to OBIS-USA. While that metadata record is not available through the IOOS Catalog, it is available through the various NOAA and higher Catalogs. So, does that meet our NOAA data inventory requirements?? See links below:

The data flow diagram might help illustrate all the nuances https://ioos.github.io/mbon-docs/mbon-data-flow.html

@mwengren
Member

@MathewBiddle That makes sense on the data flow and connection in with the RA ERDDAPs, I recall that plan now... thanks for adding the example.

I think the OBIS-USA/NCEI archive probably does meet the NOAA data publishing/open data requirements for those data - at least from what I understand.

I think our goal should be to include both access points (NCEI archive and RA ERDDAP) at the NOAA Catalog level (i.e. OneStop). The IOOS Catalog should include all data access services provided by the RAs, or other IOOS DACs, that are funded and supported by IOOS.

Having two separate metadata records for the same dataset should be OK as well as they'll be describing different endpoints to access the same data, presumably. Ideally there would be a way to relate each metadata record to the other within the NOAA Catalog, but I'm not sure that is technically possible at present. That might be a good requirement to share with the OneStop team though.

I guess the one scenario that seems to be a potential gap where IOOS-funded bio data might not be represented in IOOS Catalog is if a provider is not serving their data via RA ERDDAP, but are aligning them to Darwin Core and submitting to NCEI.

Ideally, we could also represent those raw data access points, whatever they might be, in IOOS Catalog as well, even if they would be technically meeting the NOAA open data publishing guidelines via OBIS/NCEI archive pathway.

I don't know how much of a priority or how common this is... maybe would provide justification to encourage those providers to work with an RA to publish to ERDDAP, however.
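
To gauge how common that gap is, one could compare the datasets archived through the OBIS-USA/NCEI pathway against those served from an RA ERDDAP. The sketch below assumes both inventories can be reduced to comparable dataset identifiers, which may not hold in practice; the function name and the identifiers are hypothetical.

```python
def catalog_gap(archived_ids, erddap_ids):
    """Return IDs archived via OBIS-USA/NCEI that have no RA ERDDAP endpoint.

    Both arguments are iterables of dataset identifiers that are assumed
    to be directly comparable across the two inventories.
    """
    return sorted(set(archived_ids) - set(erddap_ids))


# Hypothetical identifiers, for illustration only.
archived = ["mbon-fish-survey", "mbon-benthic-cover", "atn-tag-summary"]
on_erddap = ["mbon-fish-survey"]
print(catalog_gap(archived, on_erddap))
```

A long list here would support encouraging those providers to work with an RA; a short one would suggest the gap is low priority.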

@laurabrenskelle
Contributor

@mwengren Is there a reason you couldn't share the RA ERDDAP link as another data access link in the collection metadata record at NCEI? It doesn't seem ideal to have two collection records for the same dataset in OneStop. Here is an example: https://data.noaa.gov/onestop/collections/details/573b7dc1-7d06-4fdc-a134-056c112c2260

@MathewBiddle
Contributor Author

> I guess the one scenario that seems to be a potential gap where IOOS-funded bio data might not be represented in IOOS Catalog is if a provider is not serving their data via RA ERDDAP, but are aligning them to Darwin Core and submitting to NCEI.

I think this might be more common with cross-funded efforts, like MBON. Some projects use EDI and the Arctic Data Center as their repositories (maybe BCO-DMO too).

Projects
Status: ToDo
Development

No branches or pull requests

3 participants