-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Limit the size of cgu names when using the `-Zhuman-readable-cgu-name… #141558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
rustbot has assigned @matthewjasper. Use |
@@ -468,6 +469,20 @@ impl<'tcx> CodegenUnit<'tcx> { | |||
hash.as_u128().to_base_fixed_len(CASE_INSENSITIVE) | |||
} | |||
|
|||
pub fn shorten_name(human_readable_name: &str) -> Cow<'_, str> { | |||
// Set a limit a somewhat below the common platform limits for file names. | |||
const MAX_CGU_NAME_LENGTH: usize = 200; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something we can query from the OS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be less than the actual filename limit, since the extension and other bits are added later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even that aside, while on Windows we could use GetVolumeInformationByHandleW
to query the maximum component length for local filesystems, the answer for both NTFS and ReFs is always 255 so I don't really see it as worth it. And I think it's very unlikely that it'll change in the near future for API reasons (there are some APIs that have a fixed-sized buffer for component names).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Browsing the limits it looks like 255 is by far the most common amongst filesystems https://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A cgu name of 200 characters will still cause errors on Windows if the path to the directory it is in is 60 or more characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per the below conversation, verbatim paths can be used to workaround that. IIRC llvm handles long paths now and rustc does by virtue of using std APIs.
let mangled_name = Self::mangle_name(human_readable_name); | ||
let truncate_to = human_readable_name | ||
.floor_char_boundary(MAX_CGU_NAME_LENGTH - mangled_name.len() - 1); | ||
format!("{}-{}", &human_readable_name[..truncate_to], mangled_name).into() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fear that this can result in CGU collisions, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the mangled name is still included
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a Windows dev but wouldn't it make sense to "just" use extended-length paths on Windows (whose length limit is 216/2-1 = 32,767) — prefix is \\?\
. Docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extended paths seem like the better solution to me as well since that means the logic to name the file will be the same regardless of platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are confusing paths with filenames. The limit here is a filename limit and is a limitation of the file-system, not just the OS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's practical to determine the actual limit, since there are so many factors involved (file-system, OS, etc.) not to mention that when the object files are combined into an archive, the archive file format may have its own limit.
IMO it's much better to impose a simple limit that will work everywhere. Especially given that it only affects users of this niche feature, and only CGUs with abormally long filenames.
let mangled_name = Self::mangle_name(human_readable_name); | ||
let truncate_to = human_readable_name | ||
.floor_char_boundary(MAX_CGU_NAME_LENGTH - mangled_name.len() - 1); | ||
format!("{}-{}", &human_readable_name[..truncate_to], mangled_name).into() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Truncating the name without any indication that it is truncated would be confusing IMO. Maybe you can use -Zprint-mono-items
instead of -Zhuman-readable-cgu-names
for your use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bjorn3 Happy to add some indicator that truncation has occurred, but it doesn't make sense to just leave the feature broken... Most filesystems have a 256 character filename limit, so if you happen to get a CGU with a name longer than that it just won't compile with this option enabled.
I think you also need to consider just how long 200 characters is... It's very, very obvious when looking at the filenames that such a CGU has an unnaturally long filename compared to other CGUs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the PR to add an indicator when the filename was truncated.
…s` option Prior to this change, cgu names could be generated which would result in filenames longer than the limit imposed by the OS.
12d4fd7
to
fdb660e
Compare
…s` option
Prior to this change, cgu names could be generated which would result in filenames longer than the limit imposed by the OS.
Fixes #141547