Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: updated single_source_version with a much simpler page. #1578

Closed

Conversation

ChrisBarker-NOAA
Copy link
Contributor

@ChrisBarker-NOAA ChrisBarker-NOAA commented Jul 25, 2024

Initial comment edited as of Jul 30 2024

This was quick draft I whipped out -- turns out there's interest,

As per discussion on Discourse, and #1276, here is a single source page that is far simpler, and essentially refers users to their build systems for instructions.

NOTE: There is disagreement about what should be done re versioning and a __version__ attribute. This PR is NOT the place to resolve those disagreements.

Rather, I'm hoping this page can capture the state of the practice without specification of every detail of what should be done.

I am hoping to at least capture what I think is the consensus:

The very short version:

  • for a given distribution, the version string(s) should be fully consistent, and be specified in only one place.

Some detailed points:

  1. When a distribution is built, it should have proper PEP-conforming version info, which is consistent within that entire distribution.

  2. The user, when setting up their distribution build, should specify the version string in only one place (or only one way) - and the build system should be responsible for making sure it is consistent everywhere else (e.g. the filenames, the metadata, or attributes.

  3. If a distribution maintainer wants to put an attribute in the installed importable module it should be consistent with the metadata of the installed package.

  4. It's best for the community that an importable version attribute uses a consistent name: __version__ is a defacto standard for that name.

If you disagree with these points, please make that argument specifically, rather than suggesting edits that don't conform to these points. That is, let's get a consensus for what this doc should say first, and then figure out how to say it.

I think (4) is the perhaps the most controversial -- though I don't understand why. I can't see how it benefits ANYONE for different distributions to use a different name for the version string. Frankly, the only explanation I can think of for why people advocate for other names, is that that way it's harder for anyone to think that having a version string is a standard -- and they really don't want having a version string to be a standard.
But I ask that if you feel that way, please make sure that the text is clear that providing a version string is OPTIONAL, rather than arguing against us using the most common name in the text.

Anyway -- my goal with this doc is to capture those points, and not be prescriptive about what should be done about having or not having a version string in the installed package.

NOTE: the preview doesn't seem to be working -- I get a 404 maybe because of the merge conflict? (which I thought I'd wait for something closer to final before worrying about)


📚 Documentation preview 📚: https://python-packaging-user-guide--1578.org.readthedocs.build/en/1578/

refering folks to their build system of choice.
source/single_source_version.rst Outdated Show resolved Hide resolved
source/single_source_version.rst Outdated Show resolved Hide resolved
source/single_source_version.rst Outdated Show resolved Hide resolved
@ChrisBarker-NOAA
Copy link
Contributor Author

By the way, when putting in links to the build system's docs, I noticed that they all support __version__, and at least flit does it by default.

So it is absolutely a common practice.

source/single_source_version.rst Outdated Show resolved Hide resolved
source/single_source_version.rst Outdated Show resolved Hide resolved
#. Set the value to a ``__version__`` global variable in a dedicated module in
your project (e.g. ``version.py``), then have ``setup.py`` read and ``exec`` the
value into a variable.
.. _how_to_set_version_links:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like other references use spaces. Perhaps, follow the same style?


#. Read the file in ``setup.py`` and parse the version with a regex. Example (
from `pip setup.py <https://github.com/pypa/pip/blob/1.5.6/setup.py#L33>`_)::
* A package may set a top level ``__version__`` attribute to provide runtime access to the version of the installed package. If this is done, the value of ``__version__`` attribute and that used by the build system to set the distribution's version should be kept in sync in :ref:`the build systems's recommended way <how_to_set_version_links>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this introduced a lot of controversy in the past. The way it's written here, it reads as though the “__version__” name specifically must be used and nothing else. The reality is that the variable name can be arbitrary. And it doesn't have to be top-level to work, either. Additionally, I see that here and in the previous point, the word “package” is used, but it means different things. Here, you talk about an importable, while there it's about a distribution. Can we do anything about this?

Could you try rephrasing this to say that an in-project importable module may have a hard-coded variable that can be wired into packaging metadata by name? We can mention that many projects use __version__ for the variable name, but there is no standard mandating it to be that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested an edit in response to your comment.

encoding=kwargs.get("encoding", "utf8")
) as fp:
return fp.read()
In the cases where a package does not set a top level ``__version__`` attribute, the version may still be accessible using ``importlib.metadata.version("distribution_name")``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, we should maybe mention that __version__ may be populated from importlib.metadata.version(__name__) to provide the version string from installed packaging metadata as an importable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As near as I can tell, the push to get rid of __version__ came from people protesting to this pattern (as a waste of runtime on import).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm starting to understand this better now, having read another thread: #1578 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also comes from the fact that __version__ is just a conventional place to duplicate the information in that already has a canonical place: not very pythonic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also comes from the fact that version is just a conventional place to duplicate the information in that already has a canonical place: not very pythonic

I respectfully, but completely disagree with this characterization. __version__ is the (defacto) standard which predates the existence of pip by at least 7 years (in Matplotlib I have a commit from 2004 using __version__) let alone the current push to bootstrap another (Python -specific) packaging ecosystem.

From my perspective the "duplicate" here is dist-info and as I noted on discourse I am not worried about this. Package managers (apt/dpkg, yum, pacman, conda, pip, ....) all need their own copy of the version metadata (and dependencies) to do their jobs. This is OK and expected as each of these systems, and runtime, have different needs and hence (may) need their own copy of the information.

It comes across as very hostile to invent a new place to put the version information and then declare the existing location an upythonic duplicate!

As a thought experiment, if I said "We can scrap all of dist-info, we can query pacman to get all of it when we need it" I don't think it would be taken seriously. "just use the metadata for pip" reads exactly the same to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@flying-sheep I think that it's a bit more complicated than that. Duplicating the hardcoded values is bad. But if only one place is the actual source of the information and another just reflects the same information though a different API, that should be okay.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"just use the metadata for pip" reads exactly the same to me.

@tacaswell TBF, from the Python packaging perspective, that is not “metadata for pip” by a standardized machine-readable information. That said, if a project is meant to be installed and the installed dist metadata contains a version, it's reasonable to threat it as the source for the most accurate information (even if it originated from some importables that could be accessed separately). From that perspective, it might be the source of truth in many contexts.

But it's rather tricky to explain/document this in an accessible manner. Hence, my suggestion to attempt visualizing these relationships with my diagram #1276 (comment) (or something close to it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if only one place is the actual source of the information and another just reflects the same information though a different API, that should be okay.

there should be one – and preferably only one – way to do it

the preferable way is the standard way

I respectfully, but completely disagree with this characterization.

__version__ is older, but it’s not standardized in a PEP, so it’s not the standard.

also putting that info into an import package doesn’t make sense, as the versioned unit is the distribution package – i.e. the thing that has the dist-info directory in it that contains the standard metadata. most distribution packages might come with one main import package (e.g. Pillow contains the import package PIL), but that’s not a rule, so the only place that will reliably match 1:1 with versioned units is the distribution package, complete with its metadata.

@webknjaz
Copy link
Member

@ChrisBarker-NOAA could you try working my diagram from #1276 (comment) into this?

P.S. Looks like there's a merge conflict to be solved.

ChrisBarker-NOAA and others added 8 commits July 30, 2024 11:14
Co-authored-by: Sviatoslav Sydorenko (Святослав Сидоренко) <[email protected]>
Co-authored-by: Sviatoslav Sydorenko (Святослав Сидоренко) <[email protected]>
Co-authored-by: Sviatoslav Sydorenko (Святослав Сидоренко) <[email protected]>
Co-authored-by: Sviatoslav Sydorenko (Святослав Сидоренко) <[email protected]>
Co-authored-by: Sviatoslav Sydorenko (Святослав Сидоренко) <[email protected]>
@ChrisBarker-NOAA
Copy link
Contributor Author

OK all -- I think I've addressed everyone's comments.

Also -- please see my updated initial comment -- trying to keep the scope of this very narrow.

source/single_source_version.rst Outdated Show resolved Hide resolved
source/single_source_version.rst Show resolved Hide resolved
source/single_source_version.rst Outdated Show resolved Hide resolved
@ChrisBarker-NOAA
Copy link
Contributor Author

@ChrisBarker-NOAA could you try working my diagram from #1276 (comment) into this?

Hmm -- I'm trying to keep this page as simple as possible and not get into too many tech details.

But if the Doc Editors (who are they?) think it should be included, then sure.

@ChrisBarker-NOAA
Copy link
Contributor Author

By the way, I have not idea what to do with this:

The required .readthedocs.yaml configuration file was not found at repository's root. Learn how to use this file in our configuration file tutorial.

But I suppose that explains the lack of a preview.

@webknjaz
Copy link
Member

By the way, I have not idea what to do with this:

The required .readthedocs.yaml configuration file was not found at repository's root. Learn how to use this file in our configuration file tutorial.

But I suppose that explains the lack of a preview.

I believe RTD would build against the merge commit of your PR branch into main. Since there's merge conflicts, there's no merge commit, hence no place to check the config from.

Action item: address the merge conflicts (I personally like rebasing) and it should work then.

@ChrisBarker-NOAA ChrisBarker-NOAA deleted the simplify_single_source branch August 1, 2024 00:00
@ChrisBarker-NOAA
Copy link
Contributor Author

Turns out I had not synced my fork before doing this, and it was a serious mess, due to the restructuring of the docs. So I wan't able to fix it in place.

But the discussion had gotten pretty convoluted anyway, so a new PR has its benifits:

#1580

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants