Author guide - when is a package "good enough" for review #101

Open · lwasser opened this issue Oct 19, 2021 · 8 comments

@lwasser
Member

lwasser commented Oct 19, 2021

This came up in the potential review of pygmt. This is a great package with excellent infrastructure and community around it. However, right now the README disclaimer states that the package is under heavy development, which may significantly impact the API.

This is actually the point where a review could be extremely helpful. But we don't want to review a package that will then change significantly after it's accepted. Alternatively, we could review a package now and wait to accept it until those changes happen.

Here is some language that rOpenSci uses, brought up in a discussion with @NickleDave:


If we were to adopt language like this:
At the submission stage, all major functions should be stable enough to be fully documented and tested; the README should make a strong case for the package (the editors will read it to e.g. evaluate this as in scope or not.).
We strongly suggest submitting your package for review before publishing on CRAN <PYPI> or submitting a software paper describing the package to a journal. Review feedback may result in major improvements and updates to your package, including renaming and breaking changes to functions.
Your package will continue to evolve after review, this book provides guidance about the topic.

Tagging @leouieda in case he has thoughts! Anyone else is welcome to add thoughts here as well. I need to work on the author guide.

@NickleDave
Contributor

NickleDave commented Oct 19, 2021

+1 for adopting similar language to rOpenSci

Sharing the link that you provided, @lwasser, just in case it helps provide context:
https://devguide.ropensci.org/guide-for-authors.html

They have it as a set of bullet points:

  • Please consider the best time in your package’s development to submit. Your package should be sufficiently mature so that reviewers are able to review all essential aspects, but keep in mind that review may result in major changes.
    • For any submission or pre-submission inquiry the README of your package should provide enough information about your package (goals, usage, similar packages) for the editors to assess its scope without having to install the package. Even better, set up a pkgdown website for allowing more detailed assessment of functionality online.
    • If you use repostatus.org badges (which we recommend), submit when you’re ready to get an Active instead of WIP badge. Similarly, if you use lifecycle badges, submission should happen if the package is at least Maturing.
    • At the submission stage, all major functions should be stable enough to be fully documented and tested; the README should make a strong case for the package (the editors will read it to e.g. evaluate this as in scope or not.).
    • We strongly suggest submitting your package for review before publishing on CRAN or submitting a software paper describing the package to a journal. Review feedback may result in major improvements and updates to your package, including renaming and breaking changes to functions.
    • Your package will continue to evolve after review, this book provides guidance about the topic.

@lwasser
Member Author

lwasser commented Oct 19, 2021

Thanks @NickleDave!! This is the language I'm considering for this review. Again, this package is perfectly in scope, so the question remains: WHEN is it appropriate to review? And do we consider a two-stage review if it's useful now, during heavy development? Or is that harder on our community if we know big changes are coming to a package?

At the submission stage, all major functions should be stable enough to be fully documented and tested; the README should make a strong case for the package (the editors will read it to e.g. evaluate this as in scope or not.).

@NickleDave
Contributor

Right, my initial reaction is that if we were to adopt similar criteria, then pyGMT would be appropriate to review at this time, because:

  • At the submission stage, all major functions should be stable enough to be fully documented and tested;

A quick look at metrics suggests this is the case, e.g. their codecov is at 99%:
https://app.codecov.io/gh/GenericMappingTools/pygmt

  • the README should make a strong case for the package

"It provides a Pythonic interface for the Generic Mapping Tools (GMT), a command-line program widely used in the Earth Sciences."

seems like a very clear case for the package to exist

And re: the disclaimer on the README about "rapid development", it seems like one constructive way to move towards a more stable version would be to go through review now, as this point from the rOpenSci docs suggests:

  • We strongly suggest submitting your package for review before publishing on CRAN or submitting a software paper describing the package to a journal. Review feedback may result in major improvements and updates to your package, including renaming and breaking changes to functions.

@lwasser
Member Author

lwasser commented Oct 19, 2021

I like this, @NickleDave, so perhaps I just ask them about this statement:

All of the API (functions/classes/interfaces) is subject to change until we reach v1.0.0 as per the semantic versioning specification. There may be non-backward compatible changes as we experiment with new design ideas and implement new features. This is not a finished product, use with caution.

If they don't anticipate massive API changes at this point, then we review. I do want to see this reviewed! So perhaps I'll respond in the issue and ask that specific question about API changes.

@NickleDave
Contributor

Yes exactly!
That's what I was just about to write :)

@bocklund

(Take my comment with a grain of salt: I've participated in the JOSS community but not in the pyOpenSci community).

Part of the Guide for Reviewers suggests that the UI and API are part of the review process:

  • Are there user interface improvements that could be made?
  • Were functions and arguments named to work together to form a common, logical programming API that is easy to read, and autocomplete?
  • If you have your own relevant data/problem, work through it with the package. You may find rough edges and use-cases the author didn’t think about.

To me, this suggests that the API is a fundamental part of the package and the review, to be considered on the same level as the functionality of the package. Using that comparison, expecting to make large changes in the API would be like expecting to make large changes in the functionality. If large changes in the functionality are expected, is the package ready for review? Is that different than the API?

An alternative perspective from my experience with JOSS: the point of something like JOSS is to have a cite-able artifact for the work so the authors can get credit. The JOSS review criteria don't mention API design or UI explicitly in the criteria and focus more on functionality. That makes sense for the scope of packages published in JOSS because, presumably, someone who would cite the JOSS paper is using the work primarily for the functionality. One could argue that a usable API plays a role in whether a package is used (and therefore cited), but I'm probably going to use the package if it's the only one implementing a specific functionality I need, even if the API is not polished or stable. Of course, the scope and goals of PyOpenSci are different from JOSS, but I think it makes for a good contrasting case on why a (mostly) complete, designed API might not need to be a part of a PyOpenSci review.

@leouieda
Member

Thanks for starting this discussion @lwasser (and pinging me)! My 2-cents:

I see a bit of a contradiction in the requirements mentioned above:

  1. "all major functions should be stable enough to be fully documented and tested" seems to be interpreted here as "the API is stable and won't have major breaking changes", which is what is causing the issue with the PyGMT disclaimer.
  2. "but keep in mind that review may result in major changes" suggests that for the review to pass, you may have to break API backwards compatibility and make major changes. This is in contradiction to the first point ("your package needs to be stable" vs "review may break your stability").

The suggestion to go through review before publishing to PyPI is really not realistic, since most software will go on PyPI in an alpha/beta stage (there is even a metadata flag for this) and needs to start being used in order to reach stability. I would advise removing that and just mentioning "before publication in a journal" instead. As an example, PyGMT would not be where it is now if we hadn't made it available very early on to attract other developers to the community. As another example, scikit-learn would probably not have been ready for review until the recent 1.0 release based on this criterion, since they were doing two-release deprecation cycles that broke the API.
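
For context on the metadata flag mentioned above: PyPI's "Development Status" trove classifiers are how a package declares its alpha/beta status. A minimal sketch in a hypothetical setup.py (the package name and version are placeholders, not PyGMT's actual metadata):

```python
# Hypothetical setup.py sketch: the "Development Status" trove classifiers
# signal alpha/beta status to PyPI and to users browsing the package page.
from setuptools import setup

setup(
    name="example-package",  # placeholder name
    version="0.2.0",         # placeholder pre-1.0 version
    classifiers=[
        # Tells users that the API may still change:
        "Development Status :: 3 - Alpha",
        # Once the API settles, this could be bumped to:
        # "Development Status :: 5 - Production/Stable",
    ],
)
```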

There is also the consideration that many software packages undergo major breaking changes from time to time, like complete redesigns to account for new technology (e.g. the redesign of PyMC4 based on a new backend, which was eventually stopped). I'm guessing pyOpenSci wouldn't want to introduce language that discourages this, since it's a natural process for software.

I understand the intention of these criteria, but expressing them in a way that is general enough is tricky. Maybe someone who is better with words can summarise the following:

  • Functions/classes (the API) are stable or close to reaching stability after the review is complete.
  • "Stable" means that the scope of the package is settled and well defined and that backwards-incompatible changes to the API will be managed carefully.
  • Major changes and refactoring can still be carried out after review but should not be a frequent occurrence.
  • Most functionality should be clearly documented and tested.
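
As a minimal sketch of what "managed carefully" could look like in practice (the function names gridplot and plot_grid are made up for illustration, not real PyGMT API), a deprecated alias can warn users for a release or two before removal:

```python
import warnings


def gridplot(data):
    """New, stable name for the plotting function."""
    ...  # real plotting logic would go here


def plot_grid(data):
    """Deprecated alias kept for a deprecation period before removal."""
    warnings.warn(
        "plot_grid is deprecated and will be removed in v1.0.0; "
        "use gridplot instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return gridplot(data)
```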

@lwasser
Member Author

lwasser commented Oct 20, 2021

@bocklund @leouieda thank you so much for the above. To provide some reference here, the language we have grabbed is language used by rOpenSci. I'm finding a lot of differences between the R and Python communities as we move forward. @leouieda the PyPI one is prominent. We have had one package that wasn't on PyPI (and I want to add that as a check for the end of review), but most are. And some of the other language suggestions above make a lot of sense. I will read your comments above again and post a take at summarizing them here; that text will then go into our review guides!

NOTE: in the future, I plan to create advisory boards that can provide insight/guidance into what these guides say, so our guidelines are truly Python community driven. Perhaps you can be involved in directing the criteria when that happens! For now I'm just trying to improve our language to ensure our reviews go smoothly, and this discussion is extremely helpful and much appreciated. I will reply with some draft text in the next few days!!

Again thank you both.
