Tracking issue for vec_into_raw_parts
#65816
Comments
Should the returned pointer be a |
Add {String,Vec}::into_raw_parts Aspects to address: - [x] Create a tracking issue - rust-lang#65816
Should these functions be associated functions like |
Here's an example where this might have helped with avoiding UB: confio/go-rust-demo#1 (comment). There, the vector was destructed manually and |
Updated tracking issue number
Added safeguards for transmute_vec potentially being factored out elsewhere
Clarified comment about avoiding mem::forget
Removed unneeded unstable guard
Added back a stability annotation for CI
Minor documentation improvements
Thanks to @Centril's code review
Co-Authored-By: Mazdak Farrokhzad <[email protected]>
Improved layout checks, type annotations and removed inaccurate comment
Removed unnecessary check on array layout
Adapt the stability annotation to the new 1.41 milestone
Co-Authored-By: Mazdak Farrokhzad <[email protected]>
Simplify the implementation. Use `Vec::into_raw_parts` instead of a manual implementation of `Vec::transmute`. If `Vec::into_raw_parts` uses `NonNull` instead, then the code here will need to be adjusted to take it into account (issue rust-lang#65816)
Reduce the whitespace of safety comments
If we view this as being the opposite of
|
I think it’s not at all obvious that deliberately making this method "weird" is a good thing. |
The method is weird either way. "This uses the vec by |
I don’t see how that makes it weird. |
But into_boxed_slice doesn't leak the memory |
Well, the documentation does a great job of telling you how to avoid a leak, and leaking it is not unsafe anyway. |
I don't disagree with any of that, but also my opinion is unchanged. It's not a big deal to have this as associated function vs method, but But again it's a totally unimportant difference either way, and we should just stabilize either form and get on with the day. |
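For reference on the associated-function-versus-method question, here is a small sketch of how the two existing conventions look at call sites on stable Rust. `Box::into_raw` is an associated function so that it cannot collide with a method on the inner type, while `Vec::into_boxed_slice` is an ordinary method. The function name `call_styles` is only for illustration and is not part of any proposal in this thread.

```rust
/// Shows both calling conventions side by side and returns the values read back.
fn call_styles() -> (i32, usize) {
    // Associated-function style: written `Box::into_raw(b)`, not `b.into_raw()`.
    let b = Box::new(7);
    let raw: *mut i32 = Box::into_raw(b);
    // SAFETY: `raw` came from `Box::into_raw` just above and is reclaimed exactly once.
    let b = unsafe { Box::from_raw(raw) };

    // Method style: `Vec::into_boxed_slice` takes `self` and is called with a dot.
    let boxed: Box<[i32]> = vec![1, 2, 3].into_boxed_slice();

    (*b, boxed.len())
}
```

Either form consumes the container; the difference is purely in how the call reads and whether it can shadow a method reachable through `Deref`.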
In my opinion functions that start with |
@rust-lang/libs Any thoughts on the unresolved questions in the issue description? I’d be inclined to say:
|
@SimonSapin SGTM. The missing parallelism between definitions of |
The tuple order is the same as the arguments to rust/library/alloc/src/vec/mod.rs Lines 400 to 403 in f03ce30
rust/library/alloc/src/raw_vec.rs Lines 52 to 56 in f03ce30
|
specifically, they are given in the order that |
I agree, but that's a circular argument,
For example, while this method could have helped there, I also see cases where people will easily swap cap and len. I think named fields make it "more likely to do the right thing". I don't much like tuples in a public API.
That's a nice mnemonic. Fun fact: the docs of Vec say "Most fundamentally, Vec is and always will be a (pointer, capacity, length) triplet."... To advocate my point further, this struct could be used in all String::into_raw_parts, removing the need for Vec::into_raw_parts_with_alloc (we could add alloc later, no?):
struct VecRawParts<T, A = Global> {
pub ptr: *mut T,
pub len: usize,
pub cap: usize,
pub alloc: A,
} |
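A minimal sketch of what such a named-field type could look like on stable Rust today. Everything here is an assumption for illustration: the `from_vec` and `into_vec` names and the `named_parts_demo` helper are hypothetical, and the allocator field is omitted because `allocator_api` is unstable.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical named-field alternative to a `(ptr, len, cap)` tuple.
pub struct VecRawParts<T> {
    pub ptr: *mut T,
    pub len: usize,
    pub cap: usize,
}

impl<T> VecRawParts<T> {
    /// Decompose a `Vec` without running its destructor.
    pub fn from_vec(vec: Vec<T>) -> Self {
        let mut vec = ManuallyDrop::new(vec);
        // Read the lengths first, then take the pointer last.
        let len = vec.len();
        let cap = vec.capacity();
        Self { ptr: vec.as_mut_ptr(), len, cap }
    }

    /// Rebuild the `Vec` from the stored parts.
    ///
    /// # Safety
    /// The fields must still satisfy the requirements of `Vec::from_raw_parts`
    /// (same allocation, `len <= cap`, first `len` elements initialized).
    pub unsafe fn into_vec(self) -> Vec<T> {
        unsafe { Vec::from_raw_parts(self.ptr, self.len, self.cap) }
    }
}

/// Round-trips a vector through the struct and returns it.
fn named_parts_demo() -> Vec<u8> {
    let mut v = Vec::with_capacity(8);
    v.extend_from_slice(&[1, 2, 3]);
    let parts = VecRawParts::from_vec(v);
    // SAFETY: the fields are unchanged since `from_vec`.
    unsafe { parts.into_vec() }
}
```

With named fields, swapping `len` and `cap` at the reconstruction site requires writing the wrong field name rather than just the wrong tuple position.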
@Stargateur this is precisely what the |
I actually missed it, but that doesn't change my point: I don't want to include a dependency for something that trivial, and it requires knowing that this crate exists. I think we need it in std, not elsewhere. We can already have an "into_raw_parts" using |
I don't think that special types are required, but I do think that changing the first line of the docs would help immensely:
|
If all adding a special type does is prevent confusing len with capacity, I don't think it's worth it. It'd set a bad precedent for the stdlib to opt into these safety types that don't really offer all that much with regards to actual safety. Confusing len and capacity at a call-site will eventually happen but not nearly as often as developers simply getting the values wrong entirely. |
How is that a bad precedent? I think there are zero methods in std that return a tuple of 3 elements. Rust is about structural programming; using a struct is THE way to go in Rust. If you want, you can check https://doc.rust-lang.org/std/iter/index.html#structs, which has around 30 structs. If you want a struct made for both performance and a nice user API, there is https://doc.rust-lang.org/std/collections/hash_map/enum.Entry.html or https://doc.rust-lang.org/std/fs/struct.OpenOptions.html. If it's not acceptable to have it, I would be happier with function calls:

let len = vec.len();
let cap = vec.capacity();
let alloc = vec.allocator();
let ptr = vec.into_ptr();

At least that removes the problem of the tuple and removes the problem of a duplicate API for the allocator feature. |
Confusing |
Not quite; the slice API has a few of them:
To be fair, none of those methods have the same length vs. capacity confusion.
Most of these are opaque types designed only to carry trait impls such as |
@LegionMammal978 To be fair, it's really just 2; variations don't count,
My first point was that adding a struct is common. I don't understand the rest of your point; it looks off-topic.
The raw-parts crate clearly shows the advantage: having a "raw builder" API for Vec helps users avoid mistakes when using from_vec; better, it allows adding a nice into_vec that also removes the potential mistake of This means a user can get the struct, mutate only what is needed, and use it to reconstruct the vec. That's a BIG plus: when the feature is meant to be used with unsafe, you want every small bit of help you can get. |
Hi everyone, just wanted to ask what the status on this issue was, and if the agreed-upon suggestion of this thread was to use the raw-parts crate for stable rust toolchains. |
Seems to work. Don't know what the tradeoffs are. |
In stable Rust, you can safely decompose a

use std::mem::ManuallyDrop;

pub fn into_raw_parts<T>(vec: Vec<T>) -> (*mut T, usize, usize) {
    // Wrap the vector so its destructor never runs and the buffer stays allocated.
    let mut vec = ManuallyDrop::new(vec);
    let length = vec.len();
    let capacity = vec.capacity();
    // Take the pointer last, after the other accessors have been called.
    (vec.as_mut_ptr(), length, capacity)
}

The main downside (aside from verbosity) is that you have to be very careful: after calling Removing the need for this incantation is the purpose of the proposed |
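To make the stable workaround concrete, here is a sketch of a full round trip: decompose with `ManuallyDrop`, write through the raw pointer, then rebuild with `Vec::from_raw_parts`, which takes its arguments in the order pointer, length, capacity. The helper name `round_trip` is only for illustration.

```rust
use std::mem::ManuallyDrop;

/// Stable stand-in for the proposed method, as in the comment above.
fn into_raw_parts<T>(vec: Vec<T>) -> (*mut T, usize, usize) {
    let mut vec = ManuallyDrop::new(vec);
    let length = vec.len();
    let capacity = vec.capacity();
    (vec.as_mut_ptr(), length, capacity)
}

/// Decomposes a vector, mutates one element through the pointer, and rebuilds it.
fn round_trip() -> Vec<i32> {
    let v = vec![10, 20, 30];
    let (ptr, len, cap) = into_raw_parts(v);

    // SAFETY: index 1 is within `len`, and nothing else accesses the buffer.
    unsafe { *ptr.add(1) = 21 };

    // SAFETY: `ptr`, `len`, and `cap` come unchanged from the same `Vec<i32>`,
    // and ownership is reclaimed exactly once.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}
```

Forgetting the final `from_raw_parts` call leaks the buffer, which is safe but usually unintended.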
I'm sorry, what? UB? Can't touch? Discussing such things as UB or not-UB is out of scope for the issue. |
|
That
The signature of |
I don’t quite see how what you wrote invalidates what I wrote. So long as you don’t modify vector after taking the pointer, you can do pretty much whatever you want with the vector and the pointer remains valid. And to put it bluntly, if the following:

let mut vec = vec![1];
let ptr = vec.as_mut_ptr();
let len = vec.len();
core::mem::forget(vec);
println!("{}", unsafe { *ptr });

is undefined behaviour, then that’s a language defect that needs to be addressed regardless of |
Yes, I agree. The simple solution is for the stdlib to prevent the returned pointer from carrying However, this is outside the scope of the issue because Fwiw, Legion is likely alluding to https://llvm.org/docs/LangRef.html#noalias when applied to function return types. Yielding a noalias raw pointer from But again, this isn't what's currently happening today. And it also doesn't really matter for this particular API because the usage is largely the same. |
This is false regardless of

fn main() {
    let mut vec = vec![1];
    let ptr = vec.as_mut_ptr();
    let slice = vec.as_mut_slice();
    println!("{}", unsafe { *ptr });
    let _slice = slice; // UB
}

Forming and reborrowing references is always a potentially dangerous operation under Stacked Borrows, when they are mixed with raw pointers.
Then take it up with the UCG WG in rust-lang/unsafe-code-guidelines#326 or on Zulip.
Sure, it's just that someone was asking how to perform the equivalent of |
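For contrast with the `as_mut_slice` example above, a sketch of an ordering that avoids interleaving a live mutable reference with raw-pointer accesses. This is an illustration under the experimental Stacked Borrows model as described in this thread, not a statement of the final rules; the function name is hypothetical.

```rust
/// Uses the slice first, lets that borrow end, and only then works through a raw pointer.
fn pointer_after_reference() -> i32 {
    let mut vec = vec![1];

    {
        // The mutable slice borrow is created and last used inside this block.
        let slice = vec.as_mut_slice();
        slice[0] += 1;
    }

    // The raw pointer is derived after the slice is no longer used.
    let ptr = vec.as_mut_ptr();

    // SAFETY: `ptr` points at the initialized first element, and no reference
    // into the buffer is created or used between these two accesses.
    unsafe {
        *ptr += 1;
        *ptr
    }
}
```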
This is UB because it creates two mutable references to the same object: one through
Which doesn’t change the fact that after calling
Well, Miri doesn’t complain about the code so I don’t think I need to bring it up since the code appears to be sound. |
That's because the current implementation does not, in fact, implement In fact, PR #94421 experimented with making this exact change; unfortunately, it cannot be demonstrated directly, since Miri at the time of the PR did not yet support recursively retagging references in

fn main() {
    let mut boxed_slice = vec![1].into_boxed_slice();
    let ptr = boxed_slice.as_mut_ptr();
    std::mem::forget(boxed_slice);
    println!("{}", unsafe { *ptr }); // UB
}

error: Undefined Behavior: attempting a read access using <3158> at alloc1501[0x0], but that tag does not exist in the borrow stack for this location
--> src/main.rs:5:29
|
5 | println!("{}", unsafe { *ptr }); // UB
| ^^^^
| |
| attempting a read access using <3158> at alloc1501[0x0], but that tag does not exist in the borrow stack for this location
| this error occurs as part of an access at alloc1501[0x0..0x4]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <3158> was created by a SharedReadWrite retag at offsets [0x0..0x4]
--> src/main.rs:3:15
|
3 | let ptr = boxed_slice.as_mut_ptr();
| ^^^^^^^^^^^^^^^^^^^^^^^^
help: <3158> was later invalidated at offsets [0x0..0x4] by a Unique retag
--> src/main.rs:4:22
|
4 | std::mem::forget(boxed_slice);
| ^^^^^^^^^^^
= note: BACKTRACE (of the first span):
= note: inside `main` at src/main.rs:5:29: 5:33
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace |
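For the boxed-slice case, a sketch of the pattern that sidesteps the problem Miri reports above: obtain the pointer by consuming the box with `Box::into_raw` instead of calling `as_mut_ptr` and then moving the box into `mem::forget`. The function name is an assumption for illustration.

```rust
/// Reads the first element through a pointer obtained by consuming the box.
fn read_through_into_raw() -> i32 {
    let boxed_slice: Box<[i32]> = vec![1].into_boxed_slice();

    // Consuming the box yields the pointer directly; the box is never moved
    // or retagged after the pointer exists.
    let raw: *mut [i32] = Box::into_raw(boxed_slice);
    let first_ptr: *mut i32 = raw.cast::<i32>();

    // SAFETY: the slice has exactly one initialized element.
    let first = unsafe { *first_ptr };

    // SAFETY: `raw` came from `Box::into_raw` and is reclaimed exactly once.
    drop(unsafe { Box::from_raw(raw) });

    first
}
```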
I stand corrected (and baffled at the same time). Thanks. |
The clear solution is for Vec to never be noalias because otherwise it's terrible. But this is the tracking issue for The simple solution is this, if you genuinely have trouble remembering it's length then capacity, just write your own helper. There's no need to pollute the stdlib API itself because a caller can't remember off-hand that it's len-cap, not cap-len. Now, let's stabilize the feature already. |
The discussion above is an argument for stabilizing |
Whatever it takes to get it stabilized. Now, I need to help the unsafe WG realize they shouldn't ruin Vec like Box. |
No one has ever mentioned That being said, I think it is better to return a non-null pointer in this case. It's just that this should be a |
Why would covariance be an issue here? The returned pointer is unique and represents ownership, in which case covariance should be fine, maybe even desirable. (with the very limited understanding I have on this topic, so take this with a grain of salt) And I think this is one of the (two) usecases |
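To illustrate the `NonNull` option, a sketch of a stable helper that returns `NonNull<T>` instead of `*mut T`. The `Vec` documentation states that its pointer is never null (it is dangling but non-null when nothing is allocated), so the conversion is expected to succeed. The names `non_null_parts` and `non_null_demo` are hypothetical and are not the API under discussion.

```rust
use std::mem::ManuallyDrop;
use std::ptr::NonNull;

/// Hypothetical variant of the decomposition that returns a `NonNull<T>`.
fn non_null_parts<T>(vec: Vec<T>) -> (NonNull<T>, usize, usize) {
    let mut vec = ManuallyDrop::new(vec);
    let len = vec.len();
    let cap = vec.capacity();
    // `Vec` documents that its pointer is never null, even when empty.
    let ptr = NonNull::new(vec.as_mut_ptr()).expect("Vec pointer is never null");
    (ptr, len, cap)
}

/// Round-trips a vector through the `NonNull` form.
fn non_null_demo() -> Vec<u8> {
    let (ptr, len, cap) = non_null_parts(vec![1u8, 2]);
    // SAFETY: the parts come unchanged from the same `Vec<u8>`.
    unsafe { Vec::from_raw_parts(ptr.as_ptr(), len, cap) }
}
```

The cost of this form shows up at the reconstruction site, where `as_ptr()` is needed because `Vec::from_raw_parts` takes a `*mut T`.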
#65705 adds:
Things to evaluate before stabilization
NonNull<* mut T>
?