-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: No (opsem) Magic Boxes #3712
base: master
Are you sure you want to change the base?
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
Clarify the constraint o the invariant in footnote Co-authored-by: Jacob Lifshay <[email protected]>
It feels odd that one of the clear options is left out: why not expose a Like, I definitely agree that I agree that whatever happens shouldn't be specific to the |
I, at least, fully expect us to eventually have some way of writing alignment-obeying raw pointers in Rust in some way. If nothing else, (Transmuting between EDIT: added a word to try to communicate that I wasn't expecting this RFC to include such a type. |
That's my hope for the future as well, but to avoid the RFC becoming too cluttered, I am refraining from defining such a type in this RFC. |
Is there a list of optimisations that depend on noalias being emitted for Box’es? |
The RFC seems pretty clear that noalias hasn't really provided many benefits compared to being an extra burden to uphold for implementers, but maybe it is worth seeing if there are any sources that can provide a bit more detail on that. |
text/3712-box-yesalias.md
Outdated
* A pointer with an address that is not well-aligned for `T` (or in the case of a DST, the `align_of_val_raw` of the value), or | ||
* A pointer with an address that offsetting that address (as though by `.wrapping_byte_offset`) by `size_of_val_raw` bytes would wrap arround the address space | ||
|
||
The [`alloc::boxed::Box<T>`] type shall be laid out as though a `repr(transparent)` struct containing a field of type `WellFormed<T>`. The behaviour of doing a typed copy as type [`alloc::boxed::Box<T>`] shall be the same as though a typed copy of the struct `#[repr(transparent)] struct Box<T>(WellFormed<T>);`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The urlo definition is probably good.
It's defined in the opsem, but I don't know if we have a very good written record of that other than spread arround zulip threads and github issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a fan of this. I think that people moving from Vec<T>
to Box<[T]>
having to deal with drastically-different soundness rules is a giant footgun, and getting rid of the special [ST]B behaviour here sounds good to me.
My general take: The two "endpoints" here are
From what I can tell, we current orient If I could go back in time, I think I would favor end-user abstractions and offer different types (e.g., |
The RFC would benefit from some attempt to quantify the impact on performance, though our lack of standardized runtime benchmarks makes that hard. |
@rust-lang/opsem: We were curious in our discussions, does this RFC represent an existing T-opsem consensus? |
It does not represent any FCP done by T-opsem, which is why I've included them here. The claims I make, including those about the impact on the operation semantics, are included in the request for comment and consensus.
I recall some perf PR's (using the default rustc-perf suite) being done to determine the impact, which showed negligible impact. I can probably pull them up at some point in the RFC's lifecycle.
|
|
||
(Note that we do not define this type in the public standard library interface, though an implementation of the standard library could define the type locally) | ||
|
||
The following are not valid values of type `WellFormed<T>`, and a typed copy that would produce such a value is undefined behaviour: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Reference has been adjusted a while ago to state validity invariants positively, i..e by listing what must be true, instead of listing what must not be false. IMO that's more understandable, and the RFC should be updated to also do that.
There are patterns of using a custom per-
It's "every LLVM optimization that looks at alias information". The question is how much that matters in practice, which is hard to evaluate.
As Connor said, not in any formal sense. Several opsem members have expressed repeatedly that they want to see My own position is that I love how this simplifies the model and Miri, I am just slightly concerned about this being an irreversible loss of optimization potential that we might regret later. Absence of evidence of optimization benefit is not evidence of absence. Our benchmarks likely just don't have a lot of functions that take Is there a way we can query the ecosystem for functions taking |
- While the easiest alternative is to do nothing and maintain the status quo, as mentioned this has suprisingly implications both for the operational semantics of Rust | ||
- Alternative 2: Introduce a new type `AlisableBox<T>` which has the same interface as `Box<T>` but lacks the opsem implications that `Box<T>` has. | ||
- This also does not help remove the impact on the opsem model that the current `Box<T>` has, though provides an ergonomically equivalent option for `unsafe` code. | ||
- Alternative 3: We maintain the behaviour only for the unparameterized `Box<T>` type using the `Global` allocator, and remove it for `Box<T,A>` (for any allocator other than `A`), allowing unsafe code to use `Box<T, CustomAllocator>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually the status quo, since rust-lang/rust#122018
Just to follow up on some of the discussion, it wasn't immediately clear to me that types similar to I would still love if there's more data showing the lack of returns on |
…C++ counterpointer `std::unique_ptr`, in the prior art section
FTR, speaking right now as one of the main developers of lccc, my opinion is that the best way to mitigate any loss of future optimization potential is to just be far more granular with
I mentioned |
And more than them just not having them, IIRC someone tried to implement |
would still love if there's more data showing the lack of returns on noalias optimisations, since it feels wrong that something with a lot of history and usage isn't helping that much
It helps for references. I suspect people added it for Box because "why not".
|
It's currently unclear whether the existence of a |
I don't understand either of these sentences, sorry.
|
I believe
I think people are and have been reading way too much into the decision to put The vibes were good at the time, but we know so much more now. |
# Motivation | ||
[motivation]: #motivation | ||
|
||
The current behaviour of [`alloc::boxed::Box<T>`] can be suprising, both to unsafe code, and to people working on the language or the compiler. In many respects, `Box<T>` is treated incredibly specially by `rustc` and by Rust, leading to ICEs or unsoundness arising from reasonable changes, such as the introduction of per-container `Allocator`s. | ||
|
||
In the past, the operational semantics team has considered many ad-hoc solutions to the problem, while maintaining special cases in the aliasing model (such as Weak Protectors) that only exist for `Box<T>`. | ||
For example, moving a `ManuallyDrop<Box<T>>` after calling `Drop` is immediate undefined behaviour (due to the `Box` no longer being dereferenceable) - <https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Moving.20.60ManuallyDrop.3CBox.3C_.3E.3E.60>, and the Active tag requirements for a `Box<T>` are unsound when combined with custom allocators <https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Is.20Box.20a.20language.20type.20or.20a.20library.20type.3F>. This wastes procedural time reviewing the proposals, and complicates the language by introducing special case rules that would otherwise be unnecessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This proposed RFC kind of handwaves over what the opsem complications are that are motivating this. And for such a consequential RFC, the motivation section is rather short overall.
It'd be helpful for this motivating background context to be inlined and explored more deeply here, and for it to go into what the status of the various solutions to these problems might be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E.g., it'd be good to include the substance of @RalfJung's comment here somewhere:
What happened is that when applying the Stacked Borrows rules for
Box
toVec
, a bunch of existing code becomes UB. But that's more a statement about Stacked Borrows than aboutnoalias
. With Tree Borrows, things are already better, but even Tree Borrows has more UB than LLVMnoalias
. I am not aware of a single real-world usage ofVec
that actually breaks thenoalias
rules.
text/3712-box-yesalias.md
Outdated
|
||
- Alternative 1: Status Quo | ||
- While the easiest alternative is to do nothing and maintain the status quo, as mentioned this has suprisingly implications both for the operational semantics of Rust | ||
- Alternative 2: Introduce a new type `AlisableBox<T>` which has the same interface as `Box<T>` but lacks the opsem implications that `Box<T>` has. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Alternative 2: Introduce a new type `AlisableBox<T>` which has the same interface as `Box<T>` but lacks the opsem implications that `Box<T>` has. | |
- Alternative 2: Introduce a new type `AliasableBox<T>` which has the same interface as `Box<T>` but lacks the opsem implications that `Box<T>` has. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably we should mention somewhere, in favor of adding a separate type like AliasableBox
, that if we did that, we could then align other operations on that type to match in a way we can't do with Box
. E.g., making a reference from an AliasableBox
would be unsafe
.
Co-authored-by: Travis Cross <[email protected]>
@RalfJung Your comment about vec surprised me a bit. I want to check my understanding. I think that the reason that |
Yeah, but if the pointers inside boxes aren't unique, then how can we guarantee that the mutable references are unique? I'm sure I'm just misunderstanding what |
If I understand correctly: Yes, having aliasing |
Indeed - it is not the intent of the RFC to require that arbitrary code remain valid in the presence of alised Box values, such a thing would be a breaking change - the library contract of Box isn't changed and still must be upheld. The proposal simply is to remove some UB from the language, when you do certain operations with Boxes in unsafe code. Most collections act like this already, with Box being the outlier. It is possible, for example, to hold duplicate |
Yes to my knowledge the arena usecase is compatible with LLVM |
To give a simple example: let mut v: Vec<i32> = Vec::with_capacity(128);
v.push_within_capacity(0).unwrap();
let elem0: &mut i32 = unsafe { &mut *v.as_mut_ptr().add(0) };
*elem0 = 12; // write to elem0 so that reference becomes "Active" in TB
let mut v = v; // retag v
v.push_within_capacity(1).unwrap();
// Carefully *not* using `v[1]` as that would create a slice conflicting with `elem0`.
let elem1: &mut i32 = unsafe { &mut *v.as_mut_ptr().add(1) };
mem::swap(elem0, elem1); This is UB under Stacked Borrows and Tree Borrows semantics (if However, LLVM |
So, elaborating on |
In the past, the operational semantics team has considered many ad-hoc solutions to the problem, while maintaining special cases in the aliasing model (such as Weak Protectors) that only exist for `Box<T>`. | ||
For example, moving a `ManuallyDrop<Box<T>>` after calling `Drop` is immediate undefined behaviour (due to the `Box` no longer being dereferenceable) - <https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Moving.20.60ManuallyDrop.3CBox.3C_.3E.3E.60>, and the Active tag requirements for a `Box<T>` are unsound when combined with custom allocators <https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Is.20Box.20a.20language.20type.20or.20a.20library.20type.3F>. This wastes procedural time reviewing the proposals, and complicates the language by introducing special case rules that would otherwise be unnecessary. | ||
|
||
In the case of `ManuallyDrop<T>`, because `Box<T>` asserts aliasing validity on a typed copy, and is invalidated on drop, it introduces unique behaviour - `ManuallyDrop<Box<T>>` *cannot* be moved after calling `ManuallyDrop::drop` *even* to trusted code known not to access or double-drop the `Box`. No other type in the language or library has the same behaviour[^2], as primitive references do not have any behaviour on drop (let alone behaviour that includes invalidating themselves), and only `Box<T>`, references, and aggregates containing references are retagged on move. There are proposed solutions on the `ManuallyDrop<T>` type, such as treating specially by supressing retags on move, but this is a novel idea (as `ManuallyDrop<T>` asserts non-aliasing validity invariants on move), and it would interfere with retags of references without justification. The proposed complexity is only required because of `Box<T>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I read, that doesn't affect ManuallyDrop. It also only really needs to for Box, so this rfc would erase that use case entirely. MaybeDangling could be useful inside aggregates in the situation my footnote describes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I read, that doesn't affect ManuallyDrop.
It does though. It normatively states:
The
ManuallyDrop<T>
type internally wrapsT
in aMaybeDangling
.
And we documented this in:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if this RFC erases that use case, it should either state normatively that it is updating RFC 3336 to not do that or explain why such wrapping is still needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just found an example where ManuallyDrop
still needs MaybeDangling
: https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/unsafe.20fields.20RFC/near/480108011
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly, I'd expect code like that should use MaybeDangling
explicitly, on the field itself.
Global allocators involve some magic anyway that adds extra UB, so this could potentially be justified via that. |
* A pointer with an address that is not well-aligned for `T` (or in the case of a DST, the `align_of_val_raw` of the value), or | ||
* A pointer with an address that offsetting that address (as though by `.wrapping_byte_offset`) by `size_of_val_raw` bytes would wrap arround the address space | ||
|
||
The [`alloc::boxed::Box<T>`] type shall be laid out as though a `repr(transparent)` struct containing a field of type `WellFormed<T>`. The behaviour of doing a typed copy as type [`alloc::boxed::Box<T>`] shall be the same as though a typed copy of the struct `#[repr(transparent)] struct Box<T>(WellFormed<T>);`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stdlib docs promise it has the same layout and abi as a pointer, and also promises niche optimization.
https://doc.rust-lang.org/alloc/boxed/index.html#memory-layout
So long as T: Sized, a
Box<T>
is guaranteed to be represented as a single pointer and is also ABI-compatible with C pointers (i.e. the C typeT*
).
https://doc.rust-lang.org/core/primitive.fn.html#abi-compatibility
*const T
,*mut T
,&T
,&mut T
,Box<T>
(specifically, onlyBox<T, Global>
), andNonNull<T>
are all ABI-compatible with each other for all T. They are also ABI-compatible with each other for different T if they have the same metadata type (<T as Pointee>::Metadata
).
(The last sentence implies the abi, and thus layout, holds for T: !Sized
pointers as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay :)
Co-authored-by: Ralf Jung <[email protected]>
Just catching up with things. I wanted to share that for automatic differentiation (and likely GPU Kernels written in pure Rust) noalias is very relevant. At the llvm-dev meeting I've been presenting 5 benchmark that showed Performance Penalties in 4/5 Benchmarks ranging from 30% - 10x by removing noalias from the Rust code. Removing noalias from the C++ version of the benchmark also had large performance impacts (up to 4x), so it's not a pure Rust problem, and noalias is just generally very valuable for AutoDiff and GPU support. The only benchmark not suffering didn't had enough pointers/references for noalias to be valuable. For now I have to admit that I have written all benchmarks using slices or arrays and not box, but both autodiff and GPU offload work on llvm-ir, and there are logical reasons about why noalias has such a big performance impact. So I don't see any reason why the penalty for noalias should be smaller for box. I will try to get at least some of these Benchmarks ported to use a Box instead of slices to prove my point, but for now I can share the existing slice/array versions that I used for my talk: EnzymeAD/Enzyme#1797 That being said, both autodiff and gpu-offloading are of course only experimental project goals, and noalias on box is of course less relevant than noalias on say slices. However, I think especially efficient GPU support is valuable beyond the scientific Computing/HPC/ML crowd and of interest for a lot of people. It might still be worth dropping it if it causes too many compiler issues, I haven't read up on the whole discussion yet. But we should at least be aware then that it likely will have relevant performance impact for these targets. If desired, I can also reach out to some of the LLVM people involved in GPU programming, they surely can provide more numbers/examples on noalias usage there. Edit: Iirc Nikita also pointed out that people often pass Box behind a reference, which would keep noalias, so that might be enough for our cases (I will need to check the IR). |
The idea is that box doesn’t get passed around that much: you usually just pass a reference, and mutable references have (This is distinct from passing a box behind a reference / a reference to a box, which would have |
Agree, but in our case the existence of 2+ Box types inside of a function (with at least one write to it) would often force us to cache the data behind all Boxes for autodiff (instead of just the one we modified), so it's not only about passing it to other functions. Vectors also don't have noalias iirc so I think also any combination of 2+ vec/box types existing in one function could then cause unnecessary copies. I want to be clear that the average IR from Rust is still miles ahead of the average IR we get from C++ (since people rarely use restrict), but the difference would be that people manually could fix C++ by adding restrict, whereas Rust users wouldn't be able to add noalias to Rust's builtin types. It might still be worth to sacrifice performance in some niche cases for easier correctness, I'll try to write up in more detail under which cases lacking noalias would have an impact for the HPC/etc. community so people can decide if it's worth it. I just wanted to share this as an FYI, while I'm working out more details on it. |
Within a function you already don't get noalias, due to limitations of llvm and what code rustc emits. |
Are you talking about sret preventing noalias? Yes, we're following the work on the llvm side too, e.g. https://discourse.llvm.org/t/restrict-noalias-for-struct-members/78568, so we hope to increase the number of locations where llvm can use noalias info. I however don't have precise insights on how many of these llvm noalias improvements will later be useable by Rust and where we e.g. already allow unsafe code to alias and thus won't be able to add noalias. |
No, he's talking about the fact that syntactically in LLVM IR
This can't be happening today since, as Connor said, Would be good to know which parts of your comments are based on concrete data and which are speculative. |
That's a good point that I had not considered. Would be good to get some data on the GPU side here, too. |
They could delegate to an inner function taking (Like how autovectorization works better on |
Just to clarify, LLVM does have something called |
Yeah |
Summary
Currently, the operational semantics of the type
alloc::boxed::Box<T>
is in dispute, but the compiler adds llvmnoalias
to it. To support it, the current operational semantics models have the type use a special form of theUnique
(Stacked Borrows) orActive
(Tree Borrows) tag, which has aliasing implications, validity implications, and also presents some unique complications in the model and in improvements to the type (e.g. Custom Allocators). We propose that, for the purposes of the runtime semantics of Rust,Box
is treated as no more special than a user-defined smart pointer you can write today1. In particular, it is given similar behaviour on a typed copy to a raw pointer.Rendered
Footnotes
We maintain some trivial validity invariants (such as alignment and address space limits) that a user cannot define, but these invariants only depend upon the value of the
Box
itself, rather than on memory. ↩