Built site for gh-pages
Quarto GHA Workflow Runner committed Oct 11, 2024
1 parent 6b93ea6 commit 49ba701
Showing 20 changed files with 88 additions and 32 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
-eb1f2ff1
+01dbea7f
3 changes: 1 addition & 2 deletions _tex/elsarticle.cls
@@ -1011,8 +1011,7 @@
\ifx\@elsarticlemyfooteralign\@elsarticlemyfooteralignright%
{}\hfill\@elsarticlemyfooter%
\else%
-Preprint submitted to \ifx\@journal\@empty%
-Elsevier%
+\ifx\@journal\@empty%
\else\@journal\fi\hfill\@date\fi%
\fi%
\fi%
52 changes: 43 additions & 9 deletions _tex/index.tex
@@ -4,8 +4,7 @@
\PassOptionsToPackage{dvipsnames,svgnames,x11names}{xcolor}
%
\documentclass[
-number,
-preprint]{elsarticle}
+number]{elsarticle}

\usepackage{amsmath,amssymb}
\usepackage{iftex}
@@ -157,28 +156,39 @@

\begin{frontmatter}
\title{Towards an open-source model for data and metadata standards}
-\author[1]{Ariel Rokem%
+\author[1,2]{Ariel Rokem%
\corref{cor1}%
}

-\author[1]{Vani Mandava%
+\ead{[email protected]}
+\author[3,2]{Vani Mandava%
%
}

-\author[1]{Nicoleta Cristea%
+\author[4,2]{Nicoleta Cristea%
%
}

-\author[1]{Anshul Tambay%
+\author[3,2]{Anshul Tambay%
%
}

-\author[1]{Andrew J. Connolly%
+\author[5,2]{Andrew J. Connolly%
%
}


-\affiliation[1]{organization={University of Washington},,postcodesep={}}
+\affiliation[1]{organization={University of Washington, Department of
+Psychology},city={Seattle},country={USA},countrysep={,},postcodesep={}}
+\affiliation[2]{organization={University of Washington, eScience
+Institute},city={Seattle},country={USA},countrysep={,},postcodesep={}}
+\affiliation[3]{organization={University of Washington, Scientific
+Software Engineering
+Center},city={Seattle},country={USA},countrysep={,},postcodesep={}}
+\affiliation[4]{organization={University of Washington, Department of
+Civil and Environmental
+Engineering},city={Seattle},country={USA},countrysep={,},postcodesep={}}
+\affiliation[5]{organization={University of Washington, Department of
+Astronomy},city={Seattle},country={USA},countrysep={,},postcodesep={}}

\cortext[cor1]{Corresponding author}

@@ -187,6 +197,30 @@



+\begin{abstract}
+Progress in machine learning and artificial intelligence promises to
+advance research and understanding across a wide range of fields and
+activities. In tandem, increased awareness of the importance of open
+data for reproducibility and scientific transparency is making inroads
+in fields that have not traditionally produced large publicly available
+datasets. Data sharing requirements from publishers and funders, as well
+as from other stakeholders, have also created pressure to make datasets
+with research and/or public interest value available through digital
+repositories. However, to make the best use of existing data, and
+facilitate the creation of useful future datasets, robust, interoperable
+and usable standards need to evolve and adapt over time. The open-source
+development model provides significant potential benefits to the process
+of standard creation and adaptation. In particular, data and meta-data
+standards can use long-standing technical and socio-technical processes
+that have been key to managing the development of software, and which
+allow incorporating broad community input into the formulation of these
+standards. On the other hand, open-source models carry unique risks that
+need to be considered. This report surveys existing open-source
+standards development, addressing these benefits and risks. It outlines
+recommendations for standards developers, funders and other stakeholders
+on the path to robust, interoperable and usable open-source data and
+metadata standards.
+\end{abstract}



Binary file modified index.docx
Binary file not shown.
27 changes: 25 additions & 2 deletions index.html

Large diffs are not rendered by default.

Binary file modified index.pdf
Binary file not shown.
2 changes: 1 addition & 1 deletion sections/01-introduction.embed.ipynb
@@ -14,7 +14,7 @@
"\n",
"Data and metadata standards that use tools and practices of OSS (“open-source standards” henceforth) reap many of the benefits that the OSS model has provided in the development of other technologies. The present report explores how OSS processes and tools have affected the development of data and metadata standards. The report will survey common features of a variety of use cases; it will identify some of the challenges and pitfalls of this mode of standards development, with a particular focus on cross-sector interactions; and it will make recommendations for future developments and policies that can help this mode of standards development thrive and reach its full potential."
],
-"id": "68bde417-020e-4d9e-a92f-f840cd71e28e"
+"id": "d02bb291-b4b8-4602-a2a8-23d2dd510c62"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/01-introduction.out.ipynb
@@ -18,7 +18,7 @@
"\n",
"Wilkinson, Mark D, Michel Dumontier, I Jsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” *Sci Data* 3 (March): 160018."
],
-"id": "7da720b3-7fa5-403a-af57-3bf15d92134b"
+"id": "48c2a4bb-4105-41c4-8095-10d9e6097de2"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/02-use-cases.embed.ipynb
@@ -30,7 +30,7 @@
"\n",
"Another interesting use case for open-source standards is community/citizen science. An early example of this approach is OpenStreetMap <https://www.openstreetmap.org>, which allows users to contribute to the project development with code and data and freely use the maps and other related geospatial datasets. But this example is not unique. Overall, this approach has grown in the last 20 years and has been adopted in many different fields. It has many benefits for both the research field that harnesses the energy of non-scientist members of the community to engage with scientific data, as well as to the community members themselves who can draw both knowledge and pride in their participation in the scientific endeavor. It is also recognized that unique broader benefits are accrued from this mode of scientific research, through the inclusion of perspectives and data that would not otherwise be included. To make data accessible to community scientists, and to make the data collected by community scientists accessible to professional scientists, it needs to be provided in a manner that can be created and accessed without specialized instruments or specialized knowledge. Here, standards are needed to facilitate interactions between an in-group of expert researchers who generate and curate data and a broader set of out-group enthusiasts who would like to make meaningful contributions to the science. This creates a particularly stringent constraint on transparency and simplicity of standards. Creating these standards in a manner that addresses these unique constraints can benefit from OSS tools, with the caveat that some of these tools require additional expertise. For example, if the standard is developed using git/GitHub for versioning, this would require learning the complex and obscure technical aspects of these systems that are far from easy to adopt, even for many professional scientists."
],
-"id": "ef5062bb-b877-4616-af91-700689bf08f5"
+"id": "ff672cd8-8dce-4c1b-9aac-08cc40350ede"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/02-use-cases.out.ipynb
@@ -46,7 +46,7 @@
"\n",
"Wells, Donald Carson, and Eric W Greisen. 1979. “FITS-a Flexible Image Transport System.” In *Image Processing in Astronomy*, 445."
],
-"id": "2c933bd2-5896-4057-b68c-bf2f3c8b7c4e"
+"id": "9b5dc5b7-dbc4-4772-9858-2881bd004aea"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/03-challenges.embed.ipynb
@@ -36,7 +36,7 @@
"\n",
"The development of open-source standards faces similar sustainability challenges to those faced by open-source software that is developed for research. Standards typically develop organically through sustained and persistent efforts from dedicated groups of data practitioners. These include scientists and the broader ecosystem of data curators and users. However, there is no playbook on the structure and components of a data standard, or the pathway that moves the implementation of a specific data architecture (e.g., a particular file format) to become a data standard. As a result, data standardization lacks formal avenues for success and recognition, for example through dedicated research grants (and see @sec-cross-sector). This hampers the long-term trajectory that is needed to inculcate a standard into the day-to-day practice of researchers."
],
-"id": "7d52d5df-1bff-4f33-953f-19961bc39f4f"
+"id": "38d90182-3c94-44b8-8717-52db26fa239f"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/03-challenges.out.ipynb
@@ -50,7 +50,7 @@
"\n",
"Scroggins, Michael, and Bernadette M Boscoe. 2020. “Once FITS, Always FITS? Astronomical Infrastructure in Transition.” *IEEE Ann. Hist. Comput.* 42 (2): 42–54."
],
-"id": "fd61dd54-746b-4707-a6be-52949ef8fc38"
+"id": "f5bee5d8-62e8-4a34-a57b-742e348bf718"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/04-cross-sector.embed.ipynb
@@ -26,7 +26,7 @@
"\n",
"Interactions of data and meta-data standards with commercial interests may provide specific sources of friction. This is because proprietary/closed formats of data can create difficulty at various transition points: from one instrument vendor to another, from data producer to downstream recipient/user, etc. On the other hand, in some cases, cross-sector collaborations with commercial entities may pave the way to robust and useful standards. For example, imaging measurements in human subjects (e.g., in brain imaging experiments) significantly interact with standards for medical imaging, and chiefly the Digital Imaging and Communications in Medicine (DICOM) standard, which is widely used in a range of medical imaging applications, including in clinical settings \\[@Larobina2023-vq, @Mustra2008-xk\\]. The standard emerged from the demands of the clinical practice in the 1980s, as digital technologies came into widespread use in medical imaging, through joint work of industry organizations: the American College of Radiology and the National Electrical Manufacturers Association. One of the defining features of the DICOM standard is that it allows manufacturers of instruments to define “private fields” that are compliant with the standard, but which may include idiosyncratically organized data and/or metadata. This provides significant flexibility, but can also easily lead to the loss of important information. Nevertheless, the human brain imaging case is exemplary of a case in which industry standards and research standards coexist and need to communicate with each other effectively to advance research use-cases, while keeping up with the rapid development of the technologies."
],
-"id": "0bebd3ba-5622-4f33-9807-7960ad37c350"
+"id": "e94ab12b-95c1-4d7d-8192-aba2657707bf"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/04-cross-sector.out.ipynb
@@ -36,7 +36,7 @@
"\n",
"The National Science and Technology Council. 2022. “Desirable Characteristics of Data Repositories for Federally Funded Research.” *Executive Office of the President of the United States, Tech. Rep*."
],
-"id": "d6f8e2f9-560a-42df-bdb6-cf3905d5f6e9"
+"id": "6ca2495f-f0dd-49b2-be58-a8c7963cf025"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/05-recommendations.embed.ipynb
@@ -48,7 +48,7 @@
"\n",
"Encourage cross-sector and cross-domain alliances that can impact successful standards creation. Invest in robust program management of these alliances to align pace and create incentives (for instance via Open Source Program Offices at Universities or other research organizations). Similar to program officers at funding agencies, standards evolution needs sustained PM efforts. Multi-party partnerships should include strategic initiatives for standard establishment such as the Pistoia Alliance (<https://www.pistoiaalliance.org/>)."
],
-"id": "3e49d810-d49a-4958-bcf7-55426adb57e7"
+"id": "50ce2ff0-f8a1-4217-86eb-719284b7497f"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/05-recommendations.out.ipynb
@@ -56,7 +56,7 @@
"\n",
"Van Tuyl, Steve, ed. 2023. “Hiring, Managing, and Retaining Data Scientists and Research Software Engineers in Academia: A Career Guidebook from ADSA and US-RSE.” https://doi.org/<https://doi.org/10.5281/zenodo.8329337>."
],
-"id": "eb33bfc5-b60f-4b10-8bda-29232ec74ea8"
+"id": "9ab74357-d558-4d82-8e42-0ea2d9744e94"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/06-acknowledgments.embed.ipynb
@@ -12,7 +12,7 @@
"\n",
"The workshop and this report were funded through [NSF grant #2334483](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2334483&HistoricalAwards=false) from the NSF [Pathways to Enable Open-Source Ecosystems (POSE)](https://new.nsf.gov/funding/opportunities/pathways-enable-open-source-ecosystems-pose) program. The opinions expressed in this report do not necessarily reflect those of the National Science Foundation."
],
-"id": "5a18aaf7-37e3-4e6a-a304-c2d751ff10ec"
+"id": "3fc54093-97fa-47c3-b4ef-7b24d6fc05ef"
}
],
"nbformat": 4,
2 changes: 1 addition & 1 deletion sections/06-acknowledgments.out.ipynb
@@ -12,7 +12,7 @@
"\n",
"The workshop and this report were funded through [NSF grant #2334483](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2334483&HistoricalAwards=false) from the NSF [Pathways to Enable Open-Source Ecosystems (POSE)](https://new.nsf.gov/funding/opportunities/pathways-enable-open-source-ecosystems-pose) program. The opinions expressed in this report do not necessarily reflect those of the National Science Foundation."
],
-"id": "984dce61-ec9d-4dde-9892-f301b193aea0"
+"id": "ba5572ae-0800-44e1-bbc1-c1b6ed88796c"
}
],
"nbformat": 4,
6 changes: 3 additions & 3 deletions sections/07-participants.embed.ipynb
@@ -6,7 +6,7 @@
"source": [
"#"
],
-"id": "57fde416-2560-43ab-8de7-fefcc0e58fe4"
+"id": "4153eeae-c6e8-4454-aa23-75e01d7e6c3c"
},
{
"cell_type": "raw",
@@ -16,7 +16,7 @@
"source": [
"\\newpage"
],
-"id": "e8974446-5b62-449f-8597-a51dd2e3f857"
+"id": "d901c59b-f4d3-432f-a824-4c0ed4d9fa65"
},
{
"cell_type": "markdown",
@@ -56,7 +56,7 @@
"| Yaroslav Halchenko | Dartmouth University |\n",
"| Ziheng Sun | George Mason University |"
],
-"id": "aedfbf1a-c4e4-4278-9098-503e5491c72b"
+"id": "2371d360-6936-4bcb-ac12-421c5d52c27b"
}
],
"nbformat": 4,
6 changes: 3 additions & 3 deletions sections/07-participants.out.ipynb
@@ -6,7 +6,7 @@
"source": [
"#"
],
-"id": "e360e076-4c8a-4ace-b97b-a2fcb2c6e424"
+"id": "ec366097-577e-4a8a-ac3e-61b1168bf13d"
},
{
"cell_type": "raw",
@@ -16,7 +16,7 @@
"source": [
"\\newpage"
],
-"id": "ffe29adb-59ae-44b3-a68d-6c97a386e373"
+"id": "1f205ac9-5c6d-4b1f-8716-10a7cc4f2f8e"
},
{
"cell_type": "markdown",
@@ -56,7 +56,7 @@
"| Yaroslav Halchenko | Dartmouth University |\n",
"| Ziheng Sun | George Mason University |"
],
-"id": "b589ce0b-ff24-4057-915b-215a91e7ab02"
+"id": "77ad6633-9631-4c37-b07b-18e219d45e40"
}
],
"nbformat": 4,