NIH AIM:3 YR:2 TASK:2 | 2.3.2 | Research and discovery phase for containers and research objects support #15

Closed
mreekie opened this issue Feb 8, 2023 · 7 comments
Labels
NIH OTA: 2.3.2 pm.GREI https://docs.google.com/document/d/1RdifpHJDFqx8Y8-Dsv_VnnTgezjNHKpSyRei4cw3C-k/edit?usp=sharing pm.GREI-d-2.3.2 NIH, yr2, aim3, task2: Research and discovery phase for containers and research objects support Project: NIH GREI Tasks related to the NIH GREI project

Comments

@mreekie
Collaborator

mreekie commented Feb 8, 2023

From discussions with Mahmood

The workflow work for the year 1 deliverables was released in 5.12.
What is needed now is to add support for additional use cases specific to the biomedical fields.

  • Odum Institute computational workflow was reviewed.
  • Codemeta workflow was reviewed.
  • BioSchema was reviewed.

The terms for the MVP were chosen.

  • This is the Computation Workflow Metadata Planning Sheet.
  • On this sheet, everything marked MVP has been implemented in 5.12.
  • We can use this same sheet to go back and choose additional biomedical terms to implement.

The container work is also open.

  • We can engage with Mahmood to do this.
  • The next step is to touch base with the person from the community who is already working on this (@pdurbin mentioned the connection today).

Part of this sequence of deliverables:
1.3.1 | 3 | Support software metadata | 5
1.3.2 | 3 | Research and discovery phase for biomedical workflows support | 5
2.3.1 | 3 | Support biomedical workflows | 5
2.3.2 | 3 | Research and discovery phase for containers and research objects support | 5
3.3.1 | 3 | Support containers and research objects  | 10
4.3.1 | 3 | Apply container, RO, workflows support to a few NIH-funded projects | 10

┆Issue is synchronized with this Smartsheet row by Unito

@mreekie mreekie changed the title from "2.3.2 3 Research and discovery phase for containers and research objects support" to "3 | 2.3.2 Research and discovery phase for containers and research objects support" Feb 8, 2023
@mreekie mreekie changed the title from "3 | 2.3.2 Research and discovery phase for containers and research objects support" to "3 | 2.3.2 | Research and discovery phase for containers and research objects support" Feb 8, 2023
@mreekie
Collaborator Author

mreekie commented Feb 8, 2023

This issue represents a deliverable funded by the NIH.
This deliverable supports the NIH Initiative to Improve Access to NIH-funded Data.

Aim 3: Support standards for sharing code, workflows, and containers

The Harvard Dataverse currently supports depositing any type of file, including code/software and documentation files that accompany data, or files within a research replication package. In this project, we plan to facilitate researchers’ efforts to share and publish their entire workflows or containers that describe the main transformations and analysis of the data, following the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. As a result, the research findings will be portable and reproducible (ideally) with a single command. Though the services will be available to any researcher, special attention will be given to the NIH-funded work. The Dataverse project has already undertaken the development of Codemeta metadata (based on the standard schema.org) within the software. The project will assess the use of Codemeta for research software code and incorporate RO-Crate (for research objects metadata), which allows high flexibility in replication package content. Further, we will explore container metadata and the use of standardized container images for research. Containerization services, including software security scanning, exist for the Harvard Medical School (HMS) O2 high-performance computing cluster, are in use by a number of laboratories, and are being developed by BioGrids, an HMS partner that specializes in creating replicable biomedical software packages and containers. As part of this project, we will explore the integration of these containerization services with the Harvard Dataverse repository to support sharing, discovery, and archival of replicable biomedical research.

1.3.1 | 3 | Support software metadata | 5
1.3.2 | 3 | Research and discovery phase for biomedical workflows support | 5
2.3.1 | 3 | Support biomedical workflows | 5
2.3.2 | 3 | Research and discovery phase for containers and research objects support | 5
3.3.1 | 3 | Support containers and research objects  | 10
4.3.1 | 3 | Apply container, RO, workflows support to a few NIH-funded projects | 10

@mreekie
Collaborator Author

mreekie commented Feb 8, 2023

Monthly update, January 2023

(2.3.2) We have implemented an intermediate solution that provides the ability to run data analysis in an external container using Binder.

@mreekie mreekie changed the title from "3 | 2.3.2 | Research and discovery phase for containers and research objects support" to "NIH AIM:3 YR:2 TASK:2 | 2.3.2 | Research and discovery phase for containers and research objects support" Mar 3, 2023
@mreekie mreekie transferred this issue from IQSS/dataverse Mar 3, 2023
@mreekie mreekie added the pm.GREI https://docs.google.com/document/d/1RdifpHJDFqx8Y8-Dsv_VnnTgezjNHKpSyRei4cw3C-k/edit?usp=sharing label Mar 3, 2023
@mreekie mreekie added the pm.GREI-d-2.3.2 NIH, yr2, aim3, task2: Research and discovery phase for containers and research objects support label Mar 18, 2023
@mreekie mreekie changed the title from "NIH AIM:3 YR:2 TASK:2 | 2.3.2 | Research and discovery phase for containers and research objects support" to "NIH AIM:3 YR:2 TASK:1 & 2 | 2.3.1 & 2 | Containers and research objects support & Support for biomedical workflows" Apr 10, 2023
@mreekie mreekie added the pm.GREI-d-2.3.1 NIH, yr2, aim3, task1: Support biomedical workflows label Apr 10, 2023
@mreekie
Collaborator Author

mreekie commented Apr 10, 2023

February 2023 update

  • (2.3.1) Initial meeting was held. Work will focus on a new metadata block
    which was added as part of (1.3.2). The next step is finalizing the metadata
    terms. This activity will continue in year 2 as planned.
  • (2.3.2) We recently implemented an intermediate solution that provides the
    ability to run data analysis in an external container using Binder. A few
    minor fixes were made this month.

@mreekie mreekie changed the title from "NIH AIM:3 YR:2 TASK:1 & 2 | 2.3.1 & 2 | Containers and research objects support & Support for biomedical workflows" to "NIH AIM:3 YR:2 TASK:1 | 2.3.1 | Support biomedical workflows" Apr 10, 2023
@mreekie mreekie removed the pm.GREI-d-2.3.2 NIH, yr2, aim3, task2: Research and discovery phase for containers and research objects support label Apr 10, 2023
@mreekie mreekie changed the title from "NIH AIM:3 YR:2 TASK:1 | 2.3.1 | Support biomedical workflows" to "NIH AIM:3 YR:2 TASK:1 | 2.3.2 | Research and discovery phase for containers and research objects support" Apr 10, 2023
@mreekie mreekie changed the title from "NIH AIM:3 YR:2 TASK:1 | 2.3.2 | Research and discovery phase for containers and research objects support" to "NIH AIM:3 YR:2 TASK:2 | 2.3.2 | Research and discovery phase for containers and research objects support" Apr 10, 2023
@mreekie mreekie added pm.GREI-d-2.3.2 NIH, yr2, aim3, task2: Research and discovery phase for containers and research objects support and removed pm.GREI-d-2.3.1 NIH, yr2, aim3, task1: Support biomedical workflows labels Apr 10, 2023
@mreekie
Collaborator Author

mreekie commented Apr 10, 2023

March Update

We recently implemented a solution that provides the ability to run data analysis in an external container using Binder. We improved the code underlying Binder (repo2docker) so that it can download tabular data from Dataverse in its original format (e.g. Stata) rather than in the archive-friendly tab-separated values format.
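
For readers unfamiliar with the original-format download mentioned here: the Dataverse Data Access API accepts a `format=original` query parameter on file downloads, which returns the file as uploaded (for example Stata) instead of the archival tab-separated version. The sketch below only shows how such a URL can be built. The helper name `datafile_url`, the server address, and the file ID 42 are placeholders for illustration, and this is not the actual repo2docker code.

```python
from urllib.parse import urlencode


def datafile_url(base_url: str, file_id: int, original: bool = True) -> str:
    """Build a Dataverse Data Access API download URL for one file.

    With original=True the `format=original` parameter is appended, which
    asks Dataverse for the file as it was uploaded rather than the
    archival tab-separated version of a tabular file.
    """
    url = f"{base_url.rstrip('/')}/api/access/datafile/{file_id}"
    if original:
        url += "?" + urlencode({"format": "original"})
    return url


# Hypothetical server and file ID, used only to show the URL shape.
print(datafile_url("https://demo.dataverse.org", 42))
```

Passing `original=False` leaves the parameter off, which for an ingested tabular file yields the archival tab-separated version.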

@pdurbin
Member

pdurbin commented Apr 11, 2023

February 2023 update

* (2.3.1) Initial meeting was held.

@mreekie any notes from the initial meeting?

@cmbz
Contributor

cmbz commented Jun 1, 2023

May 2023 Update: An initial meeting was held with interested stakeholders on 2023/05/05 to discuss requirements for supporting research objects beyond datasets, along with preliminary design approaches, including an option for creating a Dataverse database entry for object types. Follow-up discussions will be planned.

@cmbz cmbz added the Project: NIH GREI Tasks related to the NIH GREI project label Jan 3, 2024
@cmbz
Contributor

cmbz commented Jan 3, 2024

2024/01/03: Closing, work will be tracked here: #146

@cmbz cmbz closed this as completed Jan 3, 2024