Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixup of repository-relative URLs in READMEs #8434

Open
kornelski opened this issue Jul 1, 2020 · 2 comments
Open

Fixup of repository-relative URLs in READMEs #8434

kornelski opened this issue Jul 1, 2020 · 2 comments
Labels
A-readme Area: README file issues C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-publish S-propose-close Status: A team member has nominated this for closing, pending further input from the team S-triage Status: This issue is waiting on initial triage.

Comments

@kornelski
Copy link
Contributor

kornelski commented Jul 1, 2020

Technically, README files are not supposed to contain relative URLs, because the README file itself doesn't have its own base URL after being published.

In practice it's usually expected that relative URLs in READMEs will be rewritten the same way GitHub does it, but that: a) depends on an implementation detail of a proprietary service, b) is a complicated transformation:

https://users.rust-lang.org/t/psa-please-use-absolute-urls-in-crate-readmes/45136

crates-io already has some code to fix relative URLs in READMEs, but it has to assume the README lives at the root of the repository, and that the main branch is called master. It's very hard to do any better after the crate has been published.

The URLs are relative to the position of README inside the repository, but Cargo.toml doesn't contain that information. In monorepos with multiple crates the README may not be in the repository root. Getting that path after a crate is published requires cloning and searching the repository. Before publishing, Cargo could simply check the local git checkout.

I see a few of ways to improve this situation:

  • When packaging/publishing, scan README markdown for relative URLs and warn users that relative URLs may not be resolved properly. Users should make these URLs absolute. To reduce noise, the warning could be shown only in non-trivial configuration that crates-io doesn't support (e.g. when the README isn't in the root of the repository).

  • Rewrite README markdown and change relative URLs to absolute automatically. When publishing, Cargo has access to the local checkout, from which it can learn the commit hash, in-repo paths, and repo URLs to do rewriting well. An unmodified README could be kept as README.orig, similarly how Cargo.toml is rewritten. URL schemes are dependent on code hosting provider, so Cargo would have to have knowledge of GitHub/GitLab/etc. URL schemes. This is hard to avoid: if not Cargo, then crates-io, lib.rs, libraries.io and other places will have to do this. Cargo could still warn about unknown code hosts. I already have a crate that does this rewriting. I could make it usable in Cargo.

  • Define and enforce that relative URLs in the README must only refer to files included inside the crate tarball (so can't use something like ../../assets/logo.png), and make crates-io rewrite URLs to use files from crate tarball rather than github.com. That would be ideal in terms of independence from 3rd party proprietary services and longevity of crate tarballs. However, crates-io (and similar sites) would have to be careful to sanitize files from the crate to avoid XSS.

@kornelski kornelski added the C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` label Jul 1, 2020
@ehuss ehuss added A-readme Area: README file issues Command-publish labels Jul 1, 2020
@nipunn1313
Copy link
Contributor

nipunn1313 commented Aug 30, 2021

Hi. I implemented #9837 before coming across this

In #9837 - I proposed updating readme_file to be a repo-relative path (a breaking change) in tandem with
rust-lang/crates.io#3861 would get crates.io to rewrite the relative links w/o making assumptions of the README being at root.

I would note a few points in my research

  • There are (non-monorepo) use cases for README pointing out of a crate - when multiple related crates are in a repo. For example here futures-rs - allowing the repo README to match the crates.io readme for futures
  • There are monorepo use cases where you might want a separate README for each independent crate. I used one for a while at Dropbox.

One big question is whose responsibility it is to rewrite links in README

Currently, registries own url rewriting, and have some mild varying degree of knowledge of popular Github/Gitlab hosting services.

Move ownership to cargo

Moving that logic into Cargo would have the advantage of centralizing that logic in one place - however it would push more burden onto the publish step of cargo (client side software). Currently Cargo isn't in the business of poking around with contents of readmes. It seems like a sensible option, though would take a bit of effort.

Keep ownership of rewriting in registry

Cargo would need to provide more information to the registry to do the rewriting. Notably the path to the crate within the repo. Could be placed as extra upload metadata alongside readme_file, or in cargo_vcs_info.json.
We could have the registries all use the crate @kornelski has available - in order to centralize the logic at least.

@epage
Copy link
Contributor

epage commented Dec 20, 2024

crates.io has an issue on their side for this: rust-lang/crates.io#9939

As crates.io does the rewriting, I would prefer to leave any more rewriting to them though they may benefit from features from us to help.

Personally, I don't want to get into markdown parsing at this time.

I propose we close this in favor of crates.io's issue.

@epage epage added the S-propose-close Status: A team member has nominated this for closing, pending further input from the team label Dec 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-readme Area: README file issues C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-publish S-propose-close Status: A team member has nominated this for closing, pending further input from the team S-triage Status: This issue is waiting on initial triage.
Projects
None yet
Development

No branches or pull requests

4 participants