-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for consistent dependency package behavior across platforms #18728
Comments
I do not like renaming. It treats the symptom not the "disease". Renaming means that one has to chase To avoid data loss on case-insensitive systems like MacOS a conforming package should have no file or directory name collisions when compared case-insensitive. Package hashing tooling should refuse to generate hash if collisions are found. Otherwise hash is generated using canonical low-case converted filenames instead of real ones. The order of files/directories in hashing operation is lexicographical by low-case names. This guarantees that the package hash is reproducible on all file systems whether case-sensitive or not. Authors of packages with files/directories that differ only by case of letters have to manually rename the offenders which is a one-time hassle. And IMHO it also makes package structure more readable. A situation that two files override each other on MacOS is avoided. |
I have only a few cherry-picked usages to refer to. Mostly the upstream package being consumed is not maintained by the developer and there is little or no incentive for upstream to correct any issues encountered when a developer attempts to import the tarball directly from upstream. That said, I might be completely off about how Zig's build system consumes packages. Maybe the intention is never to consume from upstream and the developer should always repackage upstream into something with a zig build manifest. In that case I 100% agree with you, all packages should conform to avoid issues with naming conflicts and symlinks and inconsistent hashing. Concerning the situation where two files conflict with each other on MacOS and Windows, I'm fully expecting Zig to error in the near future. I am currently looking into making the change andrewrk described in #18089 (comment) while this proposal cooks:
|
Something I'm trying very hard not to do when I approach this problem is enforce the developer (or upstream) to do anything. I'm considering enforcement scope creep for my purpose: consistent behavior across platforms when referencing a package. That package might be another Zig project, a tarball, or a git repository. Rant warning... I'm keeping away from enforcement for a second reason though. I do not want to be the one that destroys what makes certain OSes great for the purpose of conformity. My opinion and bias:
Snapping to the least common denominator is a way to solve problems. Not usually the best way if perf is a top goal. I'm proposing to implement the least common denominator in But I also don't want to push my immature and limited view of the problem onto the Zig project. I do not know enough and very likely made some critical mistakes in how I'm seeing things. If I see areas I can contribute that push Zig in the direction it wants to go and also aligns with my perspective of what I want Zig to become, I think that's a win-win. But enforcement and reporting, specifically enforcement and reporting on symlinks, does not align with my current perspective. Linting and enforcing on naming conflicts is something that I can get behind and support. And maybe it deserves to be an item in this proposal. But its my proposal and I'm going to pull the defer card for being out of scope. I think its better saved for when its easier for a developer to repackage an offending upstream source with an assumption that upstream cannot (or refuses) to make the change. I don't like to be held hostage by upstream anymore than Zig likes to force itself on others. |
I've already outlined how this is to be solved:
|
Problem
Issues #17652 and #18089 where package contents causes failures or inconsistent hashes.
Summarizing the above linked issues: symlinks cause AccessDenied failures on Windows and the package hash on Linux may differ from MacOS and Windows because the latter are case-insensitive with their filenames.
Proposal
This proposal is to provide build.zig.zon functionality developers can use to resolve simple issues encountered when building on certain OSes. Current options include 1) forking the upstream package to resolve the issues or implicitly 2) for a package to limit its OS support.
It does not address the need for a package linter. Reporting and/or enforcing developers to avoid known issues (i.e. linting) is out of scope of this proposal.
New fields being proposed in build.zig.zon dependencies:
renames
as described in package hash differs between operating systems due to file system case sensitivity #18089 (comment) by marler8997.exclude_paths
is a list of paths likepaths
, but to exclude for the purpose of resolving conflicts.copy_symlink_target_max_depth
defaults to 0 to imply "fail on OSes that do not support symlinks instead of copying" (behavior today) and a positive integer for how deep to follow symlinks. Any symlinks still remaining after the max depth has been reached would be implicitly excluded. This operation would be applied to all OSes when a positive integer is provided to ensure package hashes match.Case for
renames
field: Despite a brief review of conflicts in tar files showing the contents are duplicate, it is very possible to have different contents and thus result in information loss when using just an include/exclude strategy. Support for renaming avoids the data loss.Case for
exclude_paths
field: Primarily to remove symlinks, but could be used to remove duplicate files that exist because upstream wrote the file to the package again but with different casing. Also could be used (and abused?) by developers as an optimization to remove unnecessary files. This is seen from the perspective that not all dependencies have package manifests. And those that do have package manifests might include symlinks and naming conflicts.Case for
copy_symlink_target_max_depth
field: Can be used to resolve simple symlinks issues for OSes like Windows that cannot be resolved withexclude_paths
.Misc Thoughts
Why
exclude_paths
instead of reusing the existingpaths
field?paths
is purely an inclusion list. When the developer only wants to exclude a file or two that's a symlink or causing a naming conflict, they have to specify all sibling files and directories. This could be dozens or hundreds of files they must specify.The gitignore pattern format was considered but might be over-engineering the issue
exclude_paths
is looking to solve. Packaging issues are expected to be rare. Ideally there's a small number of files to remove if any. Aside the complexity add the gitignore pattern format would introduce, any upstream package with many many issues may not benefit from the feature if the paths could not be globbed together.The text was updated successfully, but these errors were encountered: