-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Subtree sync for rustc_codegen_cranelift #141557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…Lapkin Rename `is_like_osx` to `is_like_darwin` Replace `is_like_osx` with `is_like_darwin`, which more closely describes reality (OS X is the pre-2016 name for macOS, and is by now quite outdated; Darwin is the overall name for the OS underlying Apple's macOS, iOS, etc.). ``@rustbot`` label O-apple r? compiler
…youxu Run coretests and alloctests with cg_clif in CI Part of rust-lang/rustc_codegen_cranelift#1290
- src\doc\nomicon\src\ffi.md should also have its ABI list updated
This will show a backtrace. Also added a reference to rust-lang/rustc_codegen_cranelift#171 in the unimplemented intrinsic error message.
…ntrinsic_error Replace trap_unimplemented calls with codegen_panic_nounwind
In preparation for future unwinding support. Part of rust-lang/rustc_codegen_cranelift#1567
Once writing the LSDA, it will need access to the Module to get a reference to the personality function and to define a data object for the LSDA. Part of rust-lang/rustc_codegen_cranelift#1567
It bugs me when variables of type `Ident` are called `name`. It leads to silly things like `name.name`. `Ident` variables should be called `ident`, and `name` should be used for variables of type `Symbol`. This commit improves things by by doing `s/name/ident/` on a bunch of `Ident` variables. Not all of them, but a decent chunk.
Remove the use of Rayon iterators This removes the use of Rayon iterators and the use of the `rustc-rayon` crate. `rustc-rayon-core` is still used however. In parallel loops, instead of a Rayon iterator a serial iterator are used to collect items into a `Vec` and we use a parallel loop over its elements using the new `par_slice` function which is built on `rustc-rayon-core`'s `join`. This change makes it easier to bring `rustc-rayon-core` in-tree. Tests using 7 threads: <table><tr><td rowspan="2">Benchmark</td><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th></tr><tr><td align="right">Time</td><td align="right">Time</td><td align="right">%</th><td align="right">Physical Memory</td><td align="right">Physical Memory</td><td align="right">%</th><td align="right">Committed Memory</td><td align="right">Committed Memory</td><td align="right">%</th></tr><tr><td>🟣 <b>clap</b>:check</td><td align="right">0.4827s</td><td align="right">0.4828s</td><td align="right"> 0.02%</td><td align="right">201.23 MiB</td><td align="right">201.31 MiB</td><td align="right"> 0.04%</td><td align="right">279.03 MiB</td><td align="right">279.46 MiB</td><td align="right"> 0.15%</td></tr><tr><td>🟣 <b>hyper</b>:check</td><td align="right">0.1443s</td><td align="right">0.1401s</td><td align="right">💚 -2.91%</td><td align="right">126.42 MiB</td><td align="right">126.70 MiB</td><td align="right"> 0.22%</td><td align="right">199.79 MiB</td><td align="right">199.99 MiB</td><td align="right"> 0.10%</td></tr><tr><td>🟣 <b>regex</b>:check</td><td align="right">0.3252s</td><td align="right">0.3065s</td><td align="right">💚 -5.78%</td><td align="right">161.87 MiB</td><td align="right">161.78 MiB</td><td align="right"> -0.05%</td><td align="right">229.59 MiB</td><td align="right">230.23 MiB</td><td align="right"> 0.28%</td></tr><tr><td>🟣 <b>syn</b>:check</td><td align="right">0.5845s</td><td align="right">0.5876s</td><td align="right"> 0.53%</td><td align="right">197.01 MiB</td><td align="right">196.89 MiB</td><td align="right"> -0.06%</td><td align="right">267.62 MiB</td><td align="right">267.47 MiB</td><td align="right"> -0.06%</td></tr><tr><td>Total</td><td align="right">1.5367s</td><td align="right">1.5169s</td><td align="right">💚 -1.29%</td><td align="right">686.53 MiB</td><td align="right">686.68 MiB</td><td align="right"> 0.02%</td><td align="right">976.04 MiB</td><td align="right">977.14 MiB</td><td align="right"> 0.11%</td></tr><tr><td>Summary</td><td align="right">1.0000s</td><td align="right">0.9796s</td><td align="right">💚 -2.04%</td><td align="right">1 byte</td><td align="right">1.00 bytes</td><td align="right"> 0.04%</td><td align="right">1 byte</td><td align="right">1.00 bytes</td><td align="right"> 0.12%</td></tr></table> <table><tr><td rowspan="2">Benchmark</td><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th></tr><tr><td align="right">Time</td><td align="right">Time</td><td align="right">%</th><td align="right">Physical Memory</td><td align="right">Physical Memory</td><td align="right">%</th><td align="right">Committed Memory</td><td align="right">Committed Memory</td><td align="right">%</th></tr><tr><td>🟠 <b>clap</b>:debug</td><td align="right">1.6371s</td><td align="right">1.6529s</td><td align="right"> 0.96%</td><td align="right">395.58 MiB</td><td align="right">396.21 MiB</td><td align="right"> 0.16%</td><td align="right">460.98 MiB</td><td align="right">461.52 MiB</td><td align="right"> 0.12%</td></tr><tr><td>🟠 <b>hyper</b>:debug</td><td align="right">0.3248s</td><td align="right">0.3210s</td><td align="right">💚 -1.16%</td><td align="right">155.16 MiB</td><td align="right">155.19 MiB</td><td align="right"> 0.02%</td><td align="right">219.21 MiB</td><td align="right">219.30 MiB</td><td align="right"> 0.04%</td></tr><tr><td>🟠 <b>regex</b>:debug</td><td align="right">1.0148s</td><td align="right">0.9929s</td><td align="right">💚 -2.16%</td><td align="right">297.96 MiB</td><td align="right">295.07 MiB</td><td align="right"> -0.97%</td><td align="right">354.53 MiB</td><td align="right">351.58 MiB</td><td align="right"> -0.83%</td></tr><tr><td>🟠 <b>syn</b>:debug</td><td align="right">1.3614s</td><td align="right">1.3717s</td><td align="right"> 0.76%</td><td align="right">319.10 MiB</td><td align="right">321.19 MiB</td><td align="right"> 0.65%</td><td align="right">378.90 MiB</td><td align="right">381.27 MiB</td><td align="right"> 0.62%</td></tr><tr><td>Total</td><td align="right">4.3381s</td><td align="right">4.3386s</td><td align="right"> 0.01%</td><td align="right">1.14 GiB</td><td align="right">1.14 GiB</td><td align="right"> -0.01%</td><td align="right">1.38 GiB</td><td align="right">1.38 GiB</td><td align="right"> 0.00%</td></tr><tr><td>Summary</td><td align="right">1.0000s</td><td align="right">0.9960s</td><td align="right"> -0.40%</td><td align="right">1 byte</td><td align="right">1.00 bytes</td><td align="right"> -0.03%</td><td align="right">1 byte</td><td align="right">1.00 bytes</td><td align="right"> -0.01%</td></tr></table>
Prepend temp files with per-invocation random string to avoid temp filename conflicts rust-lang#139407 uncovered a very subtle unsoundness with incremental codegen, failing compilation sessions (due to assembler errors), and the "prefer hard linking over copying files" strategy we use in the compiler for file management. Specifically, imagine we're building a single file 3 times, all with `-Csave-temps -Cincremental=...`. Let's call the object file we're building for the codegen unit for `main` "`XXX.o`" just for clarity since it's probably some gigantic hash name: ``` #[inline(never)] #[cfg(any(rpass1, rpass3))] fn a() -> i32 { 0 } #[cfg(any(cfail2))] fn a() -> i32 { 1 } fn main() { evil::evil(); assert_eq!(a(), 0); } mod evil { #[cfg(any(rpass1, rpass3))] pub fn evil() { unsafe { std::arch::asm!("/* */"); } } #[cfg(any(cfail2))] pub fn evil() { unsafe { std::arch::asm!("missing"); } } } ``` Session 1 (`rpass1`): * Type-check, borrow-check, etc. * Serialize the dep graph to the incremental working directory `.../s-...-working/`. * Codegen object file to a temp file `XXX.rcgu.o` which is spit out in the cwd. * Hard-link[^1] `XXX.rcgu.o` to the incremental working directory `.../s-...-working/XXX.o`. * Save-temps option means we don't delete `XXX.rgcu.o`. * Link the binary and stuff. * Finalize[^2] the working incremental session by renaming `.../s-...-working` to ` s-...-asjkdhsjakd` (some other finalized incr comp session dir name). Session 2 (`cfail2`): * Load artifacts from the previous *finalized* incremental session, namely the dep graph. * Type-check, borrow-check, etc. since the file has changed, so most dep graph nodes are red. * Serialize the dep graph to the incremental working directory `.../s-...-working/`. * Codegen object file to a temp file `XXX.rcgu.o`. **HERE IS THE PROBLEM**: The hard-link is still set up to point to the inode from `XXX.o` from the first session, so this also modifies the `XXX.o` in the previous finalized session directory. * Codegen emits an error b/c `missing` is not an instruction, so we abort before finalizing the incremental session. Specifically, this means that the *previous* session is the last finalized session. Session 3 (`rpass3`): * Load artifacts from the previous *finalized* incremental session, namely the dep graph. NOTE that this is from session 1. * All the dep graph nodes are green since we are basically replaying session 1. * codegen object file `XXX.o`, which is detected as *reused* from session 1 since dep nodes were green. That means we **reuse** `XXX.o` which had been dirtied from session 2. * Link the binary and stuff. This results in a binary which reuses some of the build artifacts from session 2, but thinks it's from session 1. At this point, I hope it's clear to see that the incremental results from session 1 were dirtied from session 2, but we reuse them as if session 1 was the previous (finalized) incremental session we ran. This is at best really buggy, and at worst **unsound**. This isn't limited to `-C save-temps`, since there are other combinations of flags that may keep around temporary files (hard linked) in the working directory (like `-C debuginfo=1 -C split-debuginfo=unpacked` on darwin, for example). --- This PR implements a fix which is to prepend temp filenames with a random string that is generated per invocation of rustc. This string is not *deterministic*, but temporary files are transient anyways, so I don't believe this is a problem. That means that temp files are now something like... `{crate-name}.{cgu}.{invocation_temp}.rcgu.o`, where `{invocation_temp}` is the new temporary string we generate per invocation of rustc. Fixes rust-lang#139407 [^1]: https://github.com/rust-lang/rust/blob/175dcc7773d65c1b1542c351392080f48c05799f/compiler/rustc_fs_util/src/lib.rs#L60 [^2]: https://github.com/rust-lang/rust/blob/175dcc7773d65c1b1542c351392080f48c05799f/compiler/rustc_incremental/src/persist/fs.rs#L1-L40
Add `f16`/`f128` support
|
The list of allowed third-party dependencies may have been modified! You must ensure that any new dependencies have compatible licenses before merging. |
@bors r+ p=1 subtree sync |
☀️ Test successful - checks-actions |
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing 283db70 (parent) -> 9f8929f (this PR) Test differencesNo test diffs found Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard 9f8929fbeca4b5c2302b326606ae800156915840 --output-dir test-dashboard And then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
Finished benchmarking commit (9f8929f): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results (primary 9.5%, secondary 2.2%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 775.708s -> 776.208s (0.06%) |
The main highlights this time are a Cranelift update and (thanks for beetrees) f16/f128 support.
r? @ghost
@rustbot label +A-codegen +A-cranelift +T-compiler