PEP 753: Add suggested human-readable labels (#3974)
Signed-off-by: William Woodruff <[email protected]>
Co-authored-by: Adam Turner <[email protected]>
woodruffw and AA-Turner authored Sep 23, 2024
1 parent c1310b7 commit 0b91fc8
Showing 1 changed file with 68 additions and 39 deletions.
107 changes: 68 additions & 39 deletions peps/pep-0753.rst
@@ -153,85 +153,114 @@ informal relationship between ``Home-page``, ``Download-URL``, and their

This formalization has two parts:

1. A set of rules for canonicalizing ``Project-URL`` labels;
2. A set of "well-known" canonical label values that indices may specialize
1. A set of rules for normalizing ``Project-URL`` labels;
2. A set of "well-known" normalized label values that indices may specialize
URL presentation for.

Label canonicalization
----------------------
Label normalization
-------------------

The core metadata specification stipulates that ``Project-URL`` labels are
free text, limited to 32 characters.

This PEP proposes adding the concept of a "canonicalized" label to the core
metadata specification. Label canonicalization is defined via the following
This PEP proposes adding the concept of a "normalized" label to the core
metadata specification. Label normalization is defined via the following
Python function:

.. code-block:: python

    import string

    def canonicalize_label(label: str) -> str:
    def normalize_label(label: str) -> str:
        chars_to_remove = string.punctuation + string.whitespace
        removal_map = str.maketrans("", "", chars_to_remove)
        return label.translate(removal_map).lower()

In plain language: a label is *canonicalized* by deleting all ASCII punctuation and
In plain language: a label is *normalized* by deleting all ASCII punctuation and
whitespace, and then converting the result to lowercase.

The following table shows examples of labels before (raw) and after
canonicalization:
normalization:

.. csv-table::
    :header: "Raw", "Canonicalized"
    :header: "Raw", "Normalized"

    "``Homepage``", "``homepage``"
    "``Home-page``", "``homepage``"
    "``Home page``", "``homepage``"
    "``Change_Log``", "``changelog``"
    "``What's New?``", "``whatsnew``"

Metadata producers **SHOULD** emit the canonicalized form of a user
specified label, but **MAY** choose to emit the un-canonicalized form so
Metadata producers **SHOULD** emit the normalized form of a user
specified label, but **MAY** choose to emit the un-normalized form so
long as it adheres to the existing 32 character constraint.
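
As a quick sanity check, the normalization function can be run against the rows of the example table. This is only an illustrative sketch: ``normalize_label`` is copied from the definition above, while the ``EXAMPLES`` mapping is a hypothetical name introduced here for the check.

```python
import string


def normalize_label(label: str) -> str:
    """Delete ASCII punctuation and whitespace, then lowercase the result."""
    chars_to_remove = string.punctuation + string.whitespace
    removal_map = str.maketrans("", "", chars_to_remove)
    return label.translate(removal_map).lower()


# Raw labels and their expected normalized forms, taken from the example table.
# EXAMPLES is a hypothetical name used only for this check.
EXAMPLES = {
    "Homepage": "homepage",
    "Home-page": "homepage",
    "Home page": "homepage",
    "Change_Log": "changelog",
    "What's New?": "whatsnew",
}

for raw, expected in EXAMPLES.items():
    assert normalize_label(raw) == expected, (raw, expected)
```

Because normalization only removes or lowercases characters, a normalized label is never longer than its raw form, so a raw label within the 32 character limit stays within it after normalization.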

Package indices **SHOULD NOT** use the canonicalized labels belonging to the set
Package indices **SHOULD NOT** use the normalized labels belonging to the set
of well-known labels directly as UI elements (instead replacing them with
appropriately capitalized text labels). Labels not belonging to the well-known
set **MAY** be used directly as UI elements.

Well-known labels
-----------------

In addition to the canonicalization rules above, this PEP proposes a
In addition to the normalization rules above, this PEP proposes a
fixed (but extensible) set of "well-known" ``Project-URL`` labels,
as well as equivalent aliases.

The following table lists these labels, in canonical form:

.. csv-table::
    :header: "Label", "Description", "Aliases"
    :widths: 20, 50, 30

    "``homepage``", "The project's home page", "*(none)*"
    "``download``", "A download URL for the current distribution, equivalent to ``Download-URL``", "*(none)*"
    "``changelog``", "The project's changelog", "``changes``, ``releasenotes``, ``whatsnew``, ``history``"
    "``documentation``", "The project's online documentation", "``docs``"
    "``issues``", "The project's bug tracker", "``bugs``, ``issue``, ``bug``, ``tracker``, ``report``"
    "``sponsor``", "Sponsoring information", "``funding``, ``donate``, ``donation``"
as well as aliases and human-readable equivalents.

The following table lists these labels, in normalized form:

.. list-table::
   :header-rows: 1

   * - Label (Human-readable equivalent)
     - Description
     - Aliases
   * - ``homepage`` (Homepage)
     - The project's home page
     - *(none)*
   * - ``source`` (Source Code)
     - The project's hosted source code or repository
     - ``repository``, ``sourcecode``, ``github``
   * - ``download`` (Download)
     - A download URL for the current distribution, equivalent to ``Download-URL``
     - *(none)*
   * - ``changelog`` (Changelog)
     - The project's comprehensive changelog
     - ``changes``, ``whatsnew``, ``history``
   * - ``releasenotes`` (Release Notes)
     - The project's curated release notes
     - *(none)*
   * - ``documentation`` (Documentation)
     - The project's online documentation
     - ``docs``
   * - ``issues`` (Issue Tracker)
     - The project's bug tracker
     - ``bugs``, ``issue``, ``tracker``, ``issuetracker``, ``bugtracker``
   * - ``funding`` (Funding)
     - Funding Information
     - ``sponsor``, ``donate``, ``donation``

Indices **MAY** choose to use the human-readable equivalents suggested above
in their UI elements, if appropriate. Alternatively, indices **MAY** choose
their own appropriate human-readable equivalents for UI elements.

Packagers and metadata producers **MAY** choose to use these well-known
labels to communicate specific URL intents to package indices and downstreams.
labels or their aliases to communicate specific URL intents to package indices
and downstreams.

Packagers and metadata producers **SHOULD** produce the canonicalized version
of the well-known labels in package metadata.
Packagers and metadata producers **SHOULD** produce the normalized version
of the well-known labels or their aliases in package metadata. Packaging tools
**MUST NOT** transform between equivalent aliases, i.e. they **SHOULD**
normalize ``GitHub`` to ``github`` but **MUST NOT** transform
``github`` to ``source``.
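
The split between normalizing a label and resolving its alias can be sketched as follows. This is a minimal illustration, not part of the PEP: the alias data is copied from the updated well-known labels table, while ``WELL_KNOWN_ALIASES`` and ``well_known_label`` are hypothetical names. The lookup is something an index or other consumer might do when deciding how to present a URL; a packaging tool would stop after ``normalize_label`` and emit ``github`` unchanged.

```python
import string
from typing import Optional

# Well-known labels and their aliases, copied from the updated table.
# The name WELL_KNOWN_ALIASES is an assumption made for this sketch.
WELL_KNOWN_ALIASES = {
    "homepage": [],
    "source": ["repository", "sourcecode", "github"],
    "download": [],
    "changelog": ["changes", "whatsnew", "history"],
    "releasenotes": [],
    "documentation": ["docs"],
    "issues": ["bugs", "issue", "tracker", "issuetracker", "bugtracker"],
    "funding": ["sponsor", "donate", "donation"],
}

# Reverse lookup: every well-known label and alias maps to its well-known label.
_ALIAS_TO_LABEL = {
    name: label
    for label, aliases in WELL_KNOWN_ALIASES.items()
    for name in [label, *aliases]
}


def normalize_label(label: str) -> str:
    """Normalization as defined earlier in the PEP."""
    chars_to_remove = string.punctuation + string.whitespace
    removal_map = str.maketrans("", "", chars_to_remove)
    return label.translate(removal_map).lower()


def well_known_label(raw_label: str) -> Optional[str]:
    """Hypothetical consumer-side helper: return the well-known label that
    a raw label or alias corresponds to, or None if it is not well-known."""
    return _ALIAS_TO_LABEL.get(normalize_label(raw_label))


# A packaging tool normalizes but does not resolve the alias.
assert normalize_label("GitHub") == "github"
# An index may resolve the alias when choosing an icon or display text.
assert well_known_label("GitHub") == "source"
assert well_known_label("Mastodon") is None
```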

Similarly, indices **MAY** choose to specialize their rendering or presentation
of URLs with these labels, e.g. by presenting an appropriate icon or tooltip
for each label.

Indices **MAY** also specialize the rendering or presentation of additional labels or URLs,
including (but not limited to), labels that start with a well-known label, and URLs that refer
to a known service provider domain (e.g. for documentation hosting or issue tracking).
Indices **MAY** also specialize the rendering or presentation of additional
labels or URLs, including (but not limited to), labels that start with a
well-known label, and URLs that refer to a known service provider domain (e.g.
for documentation hosting or issue tracking).

This PEP recognizes that the list of well-known labels is unlikely to remain
static, and that subsequent additions to it should not require the overhead
@@ -275,11 +304,11 @@ the core metadata standards:
next major core metadata version. If removed, package indices and consumers
**MUST** reject metadata containing these fields when said metadata is of
the new major version.
* Enforcement of label canonicalization. If enforced, package producers
**MUST** emit only canonicalized ``Project-URL`` labels when generating
* Enforcement of label normalization. If enforced, package producers
**MUST** emit only normalized ``Project-URL`` labels when generating
distribution metadata, and package indices and consumers **MUST** reject
distributions containing non-canonicalized labels. Note: requiring
canonicalization merely restricts labels to lowercase text, and excludes
distributions containing non-normalized labels. Note: requiring
normalization merely restricts labels to lowercase text, and excludes
whitespace and punctuation. It does NOT restrict project URLs solely to
the use of "well-known" labels.
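
If normalization were enforced in a future metadata version, a consumer-side check could be as simple as comparing a label against its own normalized form. The sketch below assumes this interpretation; ``is_normalized`` is a hypothetical helper name, not something the PEP defines.

```python
import string


def normalize_label(label: str) -> str:
    """Normalization as defined earlier in the PEP."""
    chars_to_remove = string.punctuation + string.whitespace
    removal_map = str.maketrans("", "", chars_to_remove)
    return label.translate(removal_map).lower()


def is_normalized(label: str) -> bool:
    """Hypothetical check: a label is already normalized if normalizing it
    again leaves it unchanged."""
    return label == normalize_label(label)


assert is_normalized("homepage")
assert not is_normalized("Home-page")
# A label outside the well-known set is still acceptable once normalized.
assert is_normalized("mastodon")
```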

@@ -292,7 +321,7 @@ Security Implications

This PEP does not identify any positive or negative security implications
associated with deprecating ``Home-page`` and ``Download-URL`` or with
label canonicalization.
label normalization.

How To Teach This
=================