DCN July 2020 Hackathon updates to Jupyter Notebook Primer #19

Open: wants to merge 9 commits into base: main

@kekoziar (Contributor) commented Jul 17, 2020

Separate commits detail proposed changes.
To summarize the proposed changes:

  • Expanded and clarified sections that use computer science terms.
    • Clarified kernels and how cells run
    • Clarified and added examples of dependency and citation files
    • Expanded the first key curatorial question
  • Added to and clarified the examples of notebooks archived in repositories
  • Minor corrections: updated the version number, fixed how a resource was referenced, repaired broken links, renumbered endnotes after the additions, and added title/alt text to images

PR made on behalf of our team:
@kozlowwe
@gjanee
@srerickson
@cincyamyK
@kekoziar
@gdntmoon

kekoziar added 9 commits July 16, 2020 14:10
Update Jupyter Notebook version number in the format overview table
The guidance is provided by the Software Sustainability Institute (1), and funded by Jisc (2).
Clarified for curators unfamiliar with computer science terminology the relation between a kernel and programming language.

Elaborated on the cell order and expectations of users (those who download a notebook)
expand dependencies section to include other types of dependencies file.
Annotate citation.cff
Clarify that a container metafile is appropriate to request if used.
Added annotations and clarifications.
Add clarifying question to help curator unfamiliar with code. 
Add examples of ipynb archived in data repositories.
add/renumber associated end-notes.
Add title and alt text for decision tree images.
@@ -100,10 +100,10 @@ The following elements outline recommendations for repositories accepting Jupyte
- Additional files to request:
- PDF of the Jupyter Notebook (export from Jupyter web application or [nbviewer](https://nbviewer.jupyter.org/))
- reST export of the Jupyter Notebook (export from Jupyter web application)
- CodeMeta.json
- CITATION.cff
- CodeMeta.json, requirements.txt, or environment.yml (dependencies)

I would recommend at least listing CodeMeta.json as the preferred option, since it supports more extensive structured metadata using a controlled vocabulary.
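For curators who have not seen one, a minimal sketch of what a CodeMeta.json file might contain is given below. It assumes the CodeMeta 2.0 JSON-LD context; every value (name, author, requirements) is a hypothetical placeholder rather than anything taken from the primer, and `render_codemeta` is a helper name made up for this illustration.

```python
import json

# Hypothetical metadata for an illustrative notebook deposit.
# All values are placeholders; only the property names come from CodeMeta 2.0.
codemeta = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "Example analysis notebook",
    "description": "Jupyter Notebook that reproduces the figures for an example study.",
    "programmingLanguage": "Python",
    "license": "https://spdx.org/licenses/MIT",
    "author": [
        {"@type": "Person", "givenName": "Ada", "familyName": "Example"}
    ],
    # Dependencies are recorded as structured metadata alongside the rest.
    "softwareRequirements": ["numpy", "pandas"],
}


def render_codemeta(metadata):
    """Serialize the metadata dictionary as indented JSON text."""
    return json.dumps(metadata, indent=2) + "\n"


# The returned text could be saved next to the notebook as codemeta.json.
print(render_codemeta(codemeta))
```

Because the file is plain JSON, a depositor can write it by hand or generate it from a script like this one; which properties a repository actually requires is left open here.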

- CodeMeta.json
- CITATION.cff
- CodeMeta.json, requirements.txt, or environment.yml (dependencies)
- CITATION.cff (a software citation file appropriate if not depositing in a repository)

Adding a citation file is always appropriate: many repositories do not have the fields necessary to automatically generate a proper software citation.
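To make the request concrete, here is a minimal sketch of how a CITATION.cff file could be produced for a notebook. It assumes Citation File Format version 1.2.0 (an earlier version was current when this discussion took place), and the function name, title, and author are hypothetical placeholders.

```python
import json


def render_citation_cff(title, authors, version=None, date_released=None):
    """Return the text of a minimal CITATION.cff file.

    `authors` is a list of (family_name, given_name) tuples.
    Strings are quoted with json.dumps, which yields valid YAML
    double-quoted scalars.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this notebook, please cite it as below."',
        f"title: {json.dumps(title)}",
        "authors:",
    ]
    for family, given in authors:
        lines.append(f"  - family-names: {json.dumps(family)}")
        lines.append(f"    given-names: {json.dumps(given)}")
    # Optional fields are written only when supplied.
    if version:
        lines.append(f"version: {json.dumps(version)}")
    if date_released:
        lines.append(f"date-released: {date_released}")  # expected as YYYY-MM-DD
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
print(render_citation_cff(
    "Example analysis notebook",
    [("Example", "Ada")],
    version="1.0.0",
    date_released="2020-07-17",
))
```

The output is short enough to write by hand, which may matter for depositors who are not comfortable running scripts.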

- Documents what the Jupyter Notebook is for
- Request that this file include citation(s) to third-party algorithms and analyses
- Recommend code comments within the Notebook file itself in addition to the README file
- Documents what the Jupyter Notebook is for (but recommendation is that the Notebook utilize code comments)

Code comments should not be seen as a replacement or alternative to providing a README file. The code comments are used to describe what specific sets of cells do, but the notebook itself can have a much broader description and context.

- CITATION.cff for the Notebook
- Preferred citation; should enable native software citation
- Relevant if the Notebook is not being submitted to a repository

Always relevant

@kekoziar (Contributor, Author) commented Sep 9, 2020

@dbouquin IIRC, we're not saying that dependencies or citation information should be left out; the concern was about recommending very specific file types (CITATION.cff and CodeMeta.json) without adequate explanation or help in creating them.

I think it would help new curators who aren't familiar with Python notebooks and these files if we included a link to an example dataset that contains them. Can you link one?

@dbouquin

Do you think something like this would work? I'm not sure what you mean by dataset here. https://doi.org/10.5281/zenodo.3953146 (This is code that generates CodeMeta files for R packages; a codemeta.json file is included.)
Here's another random example from Zenodo: https://doi.org/10.5281/zenodo.2610844

@kekoziar (Contributor, Author)

While "dataset" can be used broadly, here I mean a dataset specific to this primer: a Python notebook deposit that is an example of the recommended curation level.
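One way to picture "the recommended curation level" is as a checklist of companion files. The sketch below checks a list of file names against the files named in the diff under review; the grouping, the README name variants, and the function names are assumptions of this illustration, not part of the primer.

```python
from pathlib import Path

# File names follow the diff under discussion; grouping them this way
# is an assumption made for this sketch.
RECOMMENDED = {
    "dependencies": ["codemeta.json", "requirements.txt", "environment.yml"],
    "citation": ["CITATION.cff"],
    "readme": ["README", "README.md", "README.txt"],
}


def check_deposit(filenames):
    """Map each group to the recommended files found (case-insensitive)."""
    present = {name.lower() for name in filenames}
    return {
        group: [c for c in candidates if c.lower() in present]
        for group, candidates in RECOMMENDED.items()
    }


def missing_groups(filenames):
    """Return the groups for which no recommended file was found."""
    return [group for group, found in check_deposit(filenames).items() if not found]


def list_deposit(directory):
    """List the file names directly inside a deposit directory."""
    return [p.name for p in Path(directory).iterdir() if p.is_file()]


# Hypothetical deposit contents for illustration.
example = ["analysis.ipynb", "README.md", "requirements.txt"]
print(missing_groups(example))
```

A curator could pass `list_deposit("path/to/deposit")` instead of a hand-written list; the check only looks at file names and says nothing about whether the files' contents are adequate.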

@dbouquin commented Sep 16, 2020 via email

2 participants