pow matrix: remove unnecessary conditions in Matrix::compute_rank #517
base: master
According to bench, I'm getting the following performance boost:
Hey @michaelsutton, sorry to ping you. Would you mind giving me a review on this?
I think the burden of proof is on you here. Why are the changes suggested in 2f2b8a6 equivalent to the current code? You do realize this is a consensus-critical path?
Of course, sorry for not providing more context; I thought the code change was obvious enough. Well, about the
-        // SAFETY: An uninitialized MaybeUninit is always safe.
-        let mut out: [[MaybeUninit<f64>; 64]; 64] = unsafe { MaybeUninit::uninit().assume_init() };
+        let mut out: [[f64; 64]; 64] = [[Default::default(); 64]; 64];

         out.iter_mut().zip(self.0.iter()).for_each(|(out_row, mat_row)| {
             out_row.iter_mut().zip(mat_row).for_each(|(out_element, &element)| {
-                out_element.write(f64::from(element));
+                *out_element = f64::from(element);
             })
         });
-        // SAFETY: The loop above wrote into all indexes.
-        unsafe { std::mem::transmute(out) }
+
+        out
This will be much slower. This one seems to optimize as well as the original:
std::array::from_fn(|i| std::array::from_fn(|j| f64::from(self.0[i][j])))
See the profiling (llvm-mca) and assembly here: https://godbolt.org/z/q5aj7chn6
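As a minimal sketch of the comparison being discussed, the snippet below puts the zero-initialized version from the diff next to the `std::array::from_fn` suggestion and checks that both produce the same array. The `Matrix` type here is a hypothetical stand-in (a 64x64 array of `u16`), not the project's actual definition, and the method names are made up for illustration. The sketch only demonstrates equivalence of results; it says nothing about which variant is faster, which still needs the assembly and llvm-mca comparison linked above.

```rust
// Hypothetical stand-in for the PR's matrix type: 64x64 of u16 (assumption).
pub struct Matrix(pub [[u16; 64]; 64]);

impl Matrix {
    /// Variant from the diff: zero-initialize, then overwrite every element.
    pub fn to_f64_zero_init(&self) -> [[f64; 64]; 64] {
        let mut out = [[0.0f64; 64]; 64];
        out.iter_mut().zip(self.0.iter()).for_each(|(out_row, mat_row)| {
            out_row.iter_mut().zip(mat_row).for_each(|(out_element, &element)| {
                *out_element = f64::from(element);
            })
        });
        out
    }

    /// Variant suggested in the review: build each element directly,
    /// so no separate initialization pass is written in the source.
    pub fn to_f64_from_fn(&self) -> [[f64; 64]; 64] {
        std::array::from_fn(|i| std::array::from_fn(|j| f64::from(self.0[i][j])))
    }
}

/// Builds a deterministic test matrix (the values are arbitrary).
pub fn sample_matrix() -> Matrix {
    Matrix(std::array::from_fn(|i| {
        std::array::from_fn(|j| ((i * 64 + j) % 16) as u16)
    }))
}
```

Comparing the two outputs with `assert_eq!` on a handful of inputs is a cheap sanity check, though for a consensus-critical path it would not replace reasoning about why the two forms are equivalent.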
-            if i >= 64 {
-                // Required for optimization, See https://github.com/rust-lang/rust/issues/90794
-                unreachable!()
-            }
It seems like this issue was resolved (rust-lang/rust#90794), but I just checked in godbolt and removing this check does reduce optimizations.
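For readers unfamiliar with the pattern in question, here is a minimal sketch of an explicit `unreachable!()` guard used as an optimizer hint. The function names and the column-sum loop are hypothetical and are not the body of `compute_rank`. The sketch only shows that the guarded and unguarded forms return the same result when the index stays below 64. Whether the guard changes the generated code depends on the compiler version and has to be checked in the assembly.

```rust
/// Sums one column, with an explicit guard stating that `i < 64`.
/// The guard can never fire here, because `i` comes from `0..64`.
pub fn column_sum_with_guard(rows: &[[f64; 64]; 64], col: usize) -> f64 {
    let mut sum = 0.0;
    for i in 0..64 {
        if i >= 64 {
            // Hint pattern from the diff; never reached in this loop.
            unreachable!()
        }
        // `col % 64` keeps the column index in range for any input.
        sum += rows[i][col % 64];
    }
    sum
}

/// Same loop without the guard; the observable behavior is identical.
pub fn column_sum_without_guard(rows: &[[f64; 64]; 64], col: usize) -> f64 {
    let mut sum = 0.0;
    for i in 0..64 {
        sum += rows[i][col % 64];
    }
    sum
}
```

Because the guard is unreachable for every valid index, removing it cannot change results; the open question in the review is only whether it changes the optimizations the compiler applies.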
In fn compute_rank(&self), remove the if condition previously used to remove the bounds check. According to the linked issue (rust-lang/rust#90794), it has been fixed since 1.72.
Another condition below can be removed.