Support PEP 625 (Filename of a Source Distribution) #12245

Open
di opened this issue Sep 21, 2022 · 29 comments
Labels
blocked (Issues we can't or shouldn't get to yet), feature request

Comments

@di
Member

di commented Sep 21, 2022

What's the problem this feature will solve?
PEP 625 has been accepted, so PyPI should be updated to support it.

Describe the solution you'd like
PyPI needs to implement some changes to support the PEP:

  • a restriction on the filenames considered valid for a source distribution (and a corresponding deprecation/notification)
  • validation of the distribution and version sections of the filename, including normalization.
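
As a rough illustration of the two checks listed above, here is a minimal Python sketch. It assumes the filename form given in PEP 625, `{distribution}-{version}.tar.gz`, with the distribution name normalized by collapsing runs of `-`, `_` and `.` into a single `_` and lowercasing, and it assumes the caller has already normalized `version` per PEP 440. `normalize_distribution` and `is_pep625_filename` are hypothetical helpers for illustration, not Warehouse's actual implementation.

```python
import re


def normalize_distribution(name: str) -> str:
    """Collapse runs of '-', '_' and '.' into one '_' and lowercase."""
    return re.sub(r"[-_.]+", "_", name).lower()


def is_pep625_filename(filename: str, project_name: str, version: str) -> bool:
    """Hypothetical check: does `filename` have the PEP 625 form?

    Assumes `version` is already in normalized PEP 440 form.
    """
    suffix = ".tar.gz"
    if not filename.endswith(suffix):
        return False
    stem = filename[: -len(suffix)]
    # Neither a normalized name nor a normalized version contains '-',
    # so a compliant stem has exactly one hyphen separating the two.
    if stem.count("-") != 1:
        return False
    distribution, _, file_version = stem.partition("-")
    return (
        distribution == normalize_distribution(project_name)
        and file_version == version
    )


print(is_pep625_filename("django-5.1.4.tar.gz", "Django", "5.1.4"))  # True
print(is_pep625_filename("Django-5.1.4.tar.gz", "Django", "5.1.4"))  # False
```

A real implementation would presumably rely on a library such as `packaging` to parse and normalize the version rather than comparing strings directly.
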
@di
Member Author

di commented Sep 21, 2022

This is likely blocked on pypa/packaging#527.

@di
Member Author

di commented Jun 26, 2023

This is probably also blocked on finding a good migration path that isn't too obtrusive for users. Right now, there are a lot of files being uploaded that would fail to upload once this change is implemented, and IMO this would break too many users for us to enable this right now:

warehouse=> SELECT DATE_TRUNC('day', upload_time) AS day, COUNT(filename)
FROM release_files
WHERE packagetype = 'sdist'
    AND filename ILIKE '%-%-%'
GROUP BY DATE_TRUNC('day', upload_time)
ORDER BY day DESC
LIMIT 30;
         day         | count
---------------------+-------
 2023-06-26 00:00:00 |  1028
 2023-06-25 00:00:00 |   381
 2023-06-24 00:00:00 |   687
 2023-06-23 00:00:00 |  1093
 2023-06-22 00:00:00 |  1453
 2023-06-21 00:00:00 |  1486
 2023-06-20 00:00:00 |  1606
 2023-06-19 00:00:00 |  1200
 2023-06-18 00:00:00 |   354
 2023-06-17 00:00:00 |   723
 2023-06-16 00:00:00 |  1455
 2023-06-15 00:00:00 |  1161
 2023-06-14 00:00:00 |  1567
 2023-06-13 00:00:00 |  1557
 2023-06-12 00:00:00 |  1157
 2023-06-11 00:00:00 |   358
 2023-06-10 00:00:00 |   693
 2023-06-09 00:00:00 |  1327
 2023-06-08 00:00:00 |  1958
 2023-06-07 00:00:00 |  1631
 2023-06-06 00:00:00 |  1430
 2023-06-05 00:00:00 |  1116
 2023-06-04 00:00:00 |   325
 2023-06-03 00:00:00 |   783
 2023-06-02 00:00:00 |  1327
 2023-06-01 00:00:00 |  1710
 2023-05-31 00:00:00 |  1693
 2023-05-30 00:00:00 |  1109
 2023-05-29 00:00:00 |   959
 2023-05-28 00:00:00 |   387
(30 rows)
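
For reference, the `'%-%-%'` pattern in the query matches any filename containing at least two hyphens. A PEP 625 filename has exactly one hyphen (between the normalized name and the normalized version), so every file counted here is non-compliant, although files that are non-compliant for other reasons (uppercase letters, for example) are not counted. A minimal Python equivalent of the filter, where `has_extra_hyphens` is a hypothetical name used only for illustration:

```python
def has_extra_hyphens(filename: str) -> bool:
    """Mirror the SQL pattern '%-%-%': true when there are two or more hyphens."""
    return filename.count("-") >= 2


print(has_extra_hyphens("transcribe-anything-2.8.0.tar.gz"))  # True
print(has_extra_hyphens("transcribe_anything-2.8.0.tar.gz"))  # False
print(has_extra_hyphens("Django-5.1.4.tar.gz"))               # False (not caught by this filter)
```
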

I think a good migration path would be:

  • ensuring the most popular build tools have supported outputting PEP 625-compliant filenames for some sufficiently long period of time
  • perhaps making upload tools like twine silently normalize this at upload time, possibly with a warning?

@dstufft
Member

dstufft commented Jun 26, 2023

The other thing we could do is forcibly normalize the filenames ourselves, though we haven't done that in the past, and I know it would break at least twine's check for whether a file has already been uploaded.

@di
Member Author

di commented Jul 18, 2023

Blocked on #14156 as well.

@di added the blocked label (Issues we can't or shouldn't get to yet) on Jul 18, 2023
@stiankri

Until the version in the sdist filename is verified as described in this issue, it's possible to create multiple sdists per release (as seen from the filename point of view).

Example: upload foo-1.tar.gz with release 1 and foo-1.zip with release 1.1. From the metadata point of view it's still just one sdist per release, but from the Simple API point of view (which the package managers use) there are two sdists for release 1.

This does not seem to be in the spirit of PEP 527:

[T]his PEP proposes to allow one, and only one, sdist per release of a project.

Which is currently verified on upload.
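
To make the example concrete, here is a small Python sketch of how a client that only sees filenames might group sdists by release. It assumes the name and version are split at the last hyphen and that `.tar.gz` and `.zip` are the sdist extensions in play; `parse_sdist_filename` and `duplicate_sdists` are hypothetical helpers, not part of any installer or of Warehouse.

```python
from collections import defaultdict

SDIST_SUFFIXES = (".tar.gz", ".zip")  # assumption: the extensions in play here


def parse_sdist_filename(filename: str) -> tuple[str, str]:
    """Split '<name>-<version><suffix>' at the last hyphen."""
    for suffix in SDIST_SUFFIXES:
        if filename.endswith(suffix):
            stem = filename[: -len(suffix)]
            name, sep, version = stem.rpartition("-")
            if not sep:
                raise ValueError(f"no version in {filename!r}")
            return name, version
    raise ValueError(f"not an sdist filename: {filename!r}")


def duplicate_sdists(filenames: list[str]) -> dict[tuple[str, str], list[str]]:
    """Return the (name, version) pairs that more than one filename maps to."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for filename in filenames:
        groups[parse_sdist_filename(filename)].append(filename)
    return {key: files for key, files in groups.items() if len(files) > 1}


# foo-1.zip was uploaded with release 1.1, but its filename says release 1.
print(duplicate_sdists(["foo-1.tar.gz", "foo-1.zip"]))
# {('foo', '1'): ['foo-1.tar.gz', 'foo-1.zip']}
```
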

@di
Member Author

di commented Mar 20, 2024

Probably also blocked on pypa/setuptools#3593 as the predominant builder of source distributions.

@di
Member Author

di commented Mar 20, 2024

There doesn't seem to be any real progress here towards builders producing normalized source distribution filenames:

warehouse=> SELECT DATE_TRUNC('month', upload_time) AS month, COUNT(filename)
FROM release_files
WHERE packagetype = 'sdist'
    AND filename ILIKE '%-%-%'
    AND upload_time >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '30 months'
GROUP BY DATE_TRUNC('month', upload_time)
ORDER BY month DESC;
        month        | count
---------------------+-------
 2024-03-01 00:00:00 | 23683
 2024-02-01 00:00:00 | 31742
 2024-01-01 00:00:00 | 33589
 2023-12-01 00:00:00 | 33818
 2023-11-01 00:00:00 | 35584
 2023-10-01 00:00:00 | 32092
 2023-09-01 00:00:00 | 33117
 2023-08-01 00:00:00 | 38100
 2023-07-01 00:00:00 | 34178
 2023-06-01 00:00:00 | 35241
 2023-05-01 00:00:00 | 35136
 2023-04-01 00:00:00 | 32816
 2023-03-01 00:00:00 | 39726
 2023-02-01 00:00:00 | 34714
 2023-01-01 00:00:00 | 32340
 2022-12-01 00:00:00 | 26588
 2022-11-01 00:00:00 | 29160
 2022-10-01 00:00:00 | 27748
 2022-09-01 00:00:00 | 30693
 2022-08-01 00:00:00 | 35739
 2022-07-01 00:00:00 | 30297
 2022-06-01 00:00:00 | 31412
 2022-05-01 00:00:00 | 35092
 2022-04-01 00:00:00 | 29901
 2022-03-01 00:00:00 | 33199
 2022-02-01 00:00:00 | 27257
 2022-01-01 00:00:00 | 28129
 2021-12-01 00:00:00 | 27028
 2021-11-01 00:00:00 | 30112
 2021-10-01 00:00:00 | 30402
 2021-09-01 00:00:00 | 28612
(31 rows)

[chart of the monthly counts above]

@dimbleby

pypa/setuptools#3593 has been closed (implemented) for a little while now; it would be interesting to see whether it is making a dent in that graph yet.

@di
Member Author

di commented Aug 7, 2024

Indeed, quite a nice drop:

[chart: updated monthly counts]

At this rate, we should be low enough in another month or two to start emitting warnings about a deprecation, and probably by EOY we could fully support PEP 625.

@di
Member Author

di commented Nov 6, 2024

And here's the latest:

[chart: latest monthly counts]

Less than 10K uploads in October.

@di
Member Author

di commented Nov 18, 2024

PR to start warning uploaders of non-PEP 625 compliant filenames is here: #17110

@bmispelon

Hi,

This change is creating some confusion in the Django project (unsurprisingly, considering the non-standard capitalization of our package and build files historically), especially around the timeline and extent of the deprecation: https://code.djangoproject.com/ticket/35980.
Our current solution has been to pin setuptools but that's not a solution we're keen to keep in the long term.

Could you clarify what you meant in the comment above with "[...] probably by EOY we could fully support PEP 625."? Does that mean that non-pep625 build files could start being rejected, or is that just a prediction about the progress towards the goal of all newly uploaded build files being pep625-compliant?

The warning email from PyPI that Django received with our latest release also mentions that "In the future, PyPI will require all newly uploaded source distribution filenames to comply with PEP 625.". Is there a clearer timeline for this? Will there be a mechanism where maintainers will be given advance notice of that change, or is there a specific GH issue one could subscribe to for example?
Django has a major version set to be released in April, and it would be helpful for our maintainers to know whether we need to audit/fix our various release scripts and documentation ahead of that date.

Thanks for all the work you do on PyPI, we're big fans 🎉

@di
Member Author

di commented Dec 9, 2024

Hi @bmispelon, thanks for your comment here, sorry this is causing confusion for the Django project.

Could you clarify what you meant in the #12245 (comment) with "[...] probably by EOY we could fully support PEP 625."? Does that mean that non-pep625 build files could start being rejected, or is that just a prediction about the progress towards the goal of all newly uploaded build files being pep625-compliant?

By "fully support PEP 625" I meant that PyPI would be in compliance with PEP 625, and reject uploads of filenames that are invalid per PEP 625. I think it's unlikely that we actually will do this by EOY, though.

Is there a clearer timeline for this? Will there be a mechanism where maintainers will be given advance notice of that change, or is there a specific GH issue one could subscribe to for example?

We don't currently have a timeline other than "eventually". If it's helpful, I can commit to having the warning emails include a clear deadline at a minimum of 6 months before said deadline, and share that deadline on this issue as well. Would that suffice? That also means this wouldn't be enforced before Django's release in April.

FWIW, I think this change should be unnecessary -- the actual project name shouldn't need to change, just the underlying build tooling that is producing the source distribution. In this case, upgrading setuptools should be all that is necessary (aside from project-specific changes to handle the new filename, of course).

@konstin
Contributor

konstin commented Dec 9, 2024

To add more context to this, the project name is used in different places with different normalizations: There is the human-readable name, and there is the dist-info name (re.sub(r"[-_.]+", "_", name).lower()). Note that the dist-info normalization is different from the regular package name normalization (re.sub(r"[-_.]+", "-", name).lower()) to avoid a dash in a name that is delimited by a dash.

  • pyproject.toml project.name: human-readable name ("Tools SHOULD normalize this name, as soon as it is read for internal consistency.")
  • METADATA Name: Same as pyproject.toml project.name
  • Source distribution filenames: dist-info name
  • Wheel filename: dist-info name
  • The top level <name>-<version> directory inside a source distribution: dist-info name. This isn't explicitly called out in the spec, but it uses {name}-{version} for both filename and top level directory. It doesn't impact tooling as there is only one top level directory anyway.
  • <name>-<version>.dist-info and <name>-<version>.data directories inside the wheel: dist-info name. This isn't explicitly called out in the spec, but it uses {distribution}-{version} for both filename and .dist-info names and unnormalized names would cause ambiguity after installation.
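
The two normalizations can be compared side by side in a short Python sketch. The regular expressions are the ones quoted above; the function names and the example project name are made up for illustration, and the version is assumed to be normalized already.

```python
import re


def normalize_name(name: str) -> str:
    """Regular package name normalization (dash-separated)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def normalize_filename_component(name: str) -> str:
    """Normalization for filenames and dist-info names (underscore-separated)."""
    return re.sub(r"[-_.]+", "_", name).lower()


name, version = "My.Example-Project", "1.0"  # hypothetical project

print(normalize_name(name))
# my-example-project
print(f"{normalize_filename_component(name)}-{version}.tar.gz")
# my_example_project-1.0.tar.gz
print(f"{normalize_filename_component(name)}-{version}.dist-info")
# my_example_project-1.0.dist-info
```

Because the filename form never contains a dash inside the name, the single remaining dash unambiguously separates name from version.
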

Afaik the only blocker here is pypa/setuptools#3777

@dimbleby

I do not think that even pypa/setuptools#3777 is a blocker here, though it might take a slightly pedantic reading to agree with this.

As I understand it, this issue is specifically about PEP 625 and source distribution names: so though it seems to be true that setuptools is producing wheels whose naming is not PEP-491 compliant - that would be out of scope.

There probably should be an analogous issue here for wheels: "support [enforce] PEP491". But I think that is not what this issue says, and I guess it is unlikely that warehouse would insist on PEP-491 while setuptools produces non-compliant wheels.

@nessita

nessita commented Dec 10, 2024

I do not think that even pypa/setuptools#3777 is a blocker here, though it might take a slightly pedantic reading to agree with this.

As I understand it, this issue is specifically about PEP 625 and source distribution names: so though it seems to be true that setuptools is producing wheels whose naming is not PEP-491 compliant - that would be out of scope.

There probably should be an analogous issue here for wheels: "support [enforce] PEP491". But I think that is not what this issue says, and I guess it is unlikely that warehouse would insist on PEP-491 while setuptools produces non-compliant wheels.

Hi! This is Natalia, one of the current Django Fellows. Thank you David for your feedback!

For Django, the release process uses a unified workflow that relies on setuptools to generate both the tarball and the wheel. However, the issue with setuptools is that it produces a lowercase django tarball and a capitalized Django wheel. This inconsistency complicates our tooling, which must handle both naming formats. Ideally, we would have a consistent naming convention across both formats to avoid this problem.

As a workaround, we’ve been pinning setuptools>=61.0.0,<69.3.0 to produce consistently named Django tarballs and wheels. However, this triggers the current deprecation warning when uploading to PyPI, and for Django, the inconsistent wheel naming is a blocker to releasing tarballs with the expected naming format.

@dimbleby

Ah, then I guess pypa/setuptools#3777 is a blocker specifically for django - but still not, strictly speaking, a blocker here...

setuptools sounds open to pull requests fixing this and I assume that it must be not very difficult once the relevant bit of code is identified, perhaps you are motivated to make that happen.

@nessita

nessita commented Dec 10, 2024

Ah, then I guess pypa/setuptools#3777 is a blocker specifically for django - but still not, strictly speaking, a blocker here...

Ish :-)

I believe this inconsistency (which goes beyond just name casing) could affect more packages than just Django, for similar reasons. It’s problematic and likely complicates tooling to have release artifacts with varying casing or punctuation.

setuptools sounds open to pull requests fixing this and I assume that it must be not very difficult once the relevant bit of code is identified, perhaps you are motivated to make that happen.

👍

@zackees

zackees commented Jan 31, 2025

I swear to god. Every 6 months every single one of my python packages breaks for publishing.

Do you even run tests on your publishing pipeline? Every single publishing error I get is unbelievably mystifying. AI has no comprehension of what's going wrong.

I really hope the astral-sh guys come in here and fix this for you.

This is the 3rd breakage for a legacy package this week of a completely different.

Get some unit tests going. It's like you want the python ecosystem to just break constantly.

@di
Member Author

di commented Jan 31, 2025

Hey @zackees, sorry you're having trouble here.

Our support for PEP 625 here shouldn't have resulted in any failure to publish, as we're currently only sending advisory emails when a user is attempting to publish a non-compliant source filename, as part of a long deprecation period. A non-compliant file should still upload with no issue, and if that's not the case, it's a bug and not intended.

If publishing currently isn't working for you, it'd be helpful to know the exact issue you're seeing that's preventing you from uploading.

@zackees

zackees commented Jan 31, 2025

It's not just a specific error; most of my projects, which used previously acceptable methods, now fail.

This is what unit tests are for. Just take a package and see if you can publish it. It's the simplest github action ever.

One of my most popular packages, transcribe-anything, seems to have multiple issues now. The tool is complaining that I have two sdists. But here's the thing: it still publishes. It just doesn't show up on your PyPI index. It just fails on the next publish because the file is already there.

What is the number of unit tests that pypi is running?

This is the library I manage:

https://github.com/fastled/fastled

Please, just add more tests for package installability. I volunteer transcribe-anything, which used to install just fine. If you fix this for me, you fix it for many other people.

@zackees

zackees commented Jan 31, 2025

Another issue is the entire repo being included in the tar.gz.

I've explicitly declared the project files. Doesn't matter. The root is getting archived in the tar.gz file, including venv directories.

@miketheman
Member

@zackees PyPI runs over 4000 tests as part of the test suite, you are more than welcome to examine any of them - see the tests/ subdirectory.

You still have yet to provide concrete context of your failure scenario, which makes providing any kind of support harder.

The tone you have adopted in your messages is also not very kind to others, and while recognizing that it can be frustrating to experience confusing failures when attempting to accomplish a task, there's no call for taking that out on others.

@zackees

zackees commented Jan 31, 2025

Repro:

  • git clone https://github.com/zackees/transcribe-anything
  • cd transcribe-anything
  • ./install
  • ./upload_package.sh

@dimbleby

setup.py upload has been unavailable since setuptools 42.0, which was released in 2019.

This is a strange time and place to complain about it!

@zackees

zackees commented Jan 31, 2025

It continued to work until Jan 15th of this year. It still works, in fact. My tar.gz uploads and is stored on the PyPI server, and prevents any other uploads with the same version from being uploaded.

However, the version never appears on the pypi server.

@di
Member Author

di commented Jan 31, 2025

@zackees I see that version 2.3.0 was released on Jan 22nd: https://pypi.org/project/transcribe-anything/2.3.0/, is this the missing version you are talking about?

I also see that your project is currently set to version 2.3.7, which you already released on Oct 19, 2023: https://pypi.org/project/transcribe-anything/2.3.7/. Is this the missing version you are talking about?

I still don't see what this has to do with PEP 625 but please let us know how that's related here.

@zackees

zackees commented Jan 31, 2025

Interesting, looks like a 2 got swapped for a 7 in a previous version. That may have been the reason why what I thought were the new versions weren't showing up.

I rewrote the project to use pyproject.toml and corrected the version; now the sdist issue isn't happening anymore.

Undesired directories, like everything in tests, .github and other directories outside of src, are still being uploaded with the distribution despite explicit inclusion and exclusion attributes, but that's a lesser problem:

Image

pyproject.toml

[build-system]
requires = ["setuptools>=65.5.1", "setuptools-scm", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "transcribe-anything"
version = "2.8.0"  # Update this manually or configure setuptools-scm for automatic versioning
readme = "README.md"
description = "Uses Whisper AI to transcribe speech from video and audio files. Also accepts URLs for YouTube, Rumble, BitChute, clear file links, etc."
requires-python = ">=3.10"
keywords = ["transcribe", "openai", "whisper"]
license = { text = "BSD-3-Clause" }
dependencies = [
    "static-ffmpeg>=2.7",
    "yt-dlp>=2025.1.15",
    "appdirs>=1.4.4",
    "disklru>=1.0.7",
    "FileLock",
    "webvtt-py==0.4.6",
    "uv-iso-env>=1.0.33",
]

maintainers = [{ name = "Zachary Vorhies", email = "[email protected]" }]

[project.urls]
homepage = "https://github.com/zackees/transcribe-anything"

[tool.setuptools]
package-dir = {"" = "src"}

[tool.setuptools.packages.find]
where = ["src"]
include = ["transcribe_anything*"]
exclude = ["tests*", "docs*", "examples*"]
namespaces = false  # Prevent setuptools from looking elsewhere


[project.scripts]
transcribe_anything = "transcribe_anything._cmd:main"
transcribe-anything = "transcribe_anything._cmd:main"

[tool.ruff]
line-length = 200

[tool.pylint."MESSAGES CONTROL"]
good-names = ["c", "i", "ok", "id", "e", "f"]
disable = [
    "missing-function-docstring",
    "missing-module-docstring"
]

[tool.isort]
profile = "black"

[tool.mypy]
ignore_missing_imports = true
disable_error_code = ["import-untyped"]

[tool.black]
line-length = 200
target-version = ['py310']

@miketheman
Member

@zackees You show usage of setuptools-scm, whose documentation notes that excluding SCM-tracked files has to be done via another mechanism.

None of the problems you've shared appear to be problems with PyPI or uploading files - these are all within your build tools and control.

Let's leave this issue for the folks that are discussing the topic of PEP 625 support on PyPI.


9 participants