-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alloca for Rust #1808
Alloca for Rust #1808
Changes from all commits
168857d
ba65ef3
9ad827c
0d8462f
ac6b65f
64a5da3
b09ffce
c83ce18
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,126 @@ | ||
- Feature Name: alloca | ||
- Start Date: 2016-12-01 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Add variable-length arrays to the language. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
Some algorithms (e.g. sorting, regular expression search) need a one-time backing store for a number of elements only | ||
known at runtime. Reserving space on the heap always takes a performance hit, and the resulting deallocation can | ||
increase memory fragmentation, possibly slightly degrading allocation performance further down the road. | ||
|
||
If Rust included this zero-cost abstraction, more of these algorithms could run at full speed – and would be available | ||
on systems without an allocator, e.g. embedded, soft- or hard-real-time systems. The option of using a fixed slice up | ||
to a certain size and using a heap-allocated slice otherwise (as afforded by | ||
[SmallVec](https://crates.io/crates/smallvec)-like classes) has the drawback of decreasing memory locality if only a | ||
small part of the fixed-size allocation is used – and even those implementations could potentially benefit from the | ||
increased memory locality. | ||
|
||
As a (flawed) benchmark, consider the following C program: | ||
|
||
```C | ||
#include <stdlib.h> | ||
|
||
int main(int argc, char **argv) { | ||
int n = argc > 1 ? atoi(argv[0]) : 1; | ||
int x = 1; | ||
char foo[n]; | ||
foo[n - 1] = 1; | ||
} | ||
``` | ||
|
||
Running `time nice -n 20 ionice ./dynalloc 1` returns almost instantly (0.0001s), whereas using `time nice -n 20 ionice | ||
./dynalloc 200000` takes 0.033 seconds. As such, it appears that just by forcing the second write further away from the | ||
first slows down the program (this benchmark is actually completely unfair, because by reducing the process' priority, | ||
we invite the kernel to swap in a different process instead, which is very probably the major cause of the slowdown | ||
here). | ||
|
||
Still, even with the flaws in this benchmark, | ||
[The Myth of RAM](http://www.ilikebigbits.com/blog/2014/4/21/the-myth-of-ram-part-i) argues quite convincingly for the | ||
benefits of memory frugality. | ||
|
||
# Detailed design | ||
[design]: #detailed-design | ||
|
||
So far, the `[T]` type could not be constructed in valid Rust code. It will now represent compile-time unsized (also | ||
known as "variable-length") arrays. The syntax to construct them could simply be `[t; n]` where `t` is a valid value of | ||
the type (or `mem::uninitialized`) and `n` is an expression whose result is of type `usize`. Type ascription must be used | ||
to disambiguate cases where the type could either be `[T]` or `[T; n]` for some value of `n`. | ||
|
||
The AST for the unsized array will be simply `syntax::ast::ItemKind::Repeat(..)`, but removing the assumption that the | ||
second expression is a constant value. The same applies to `rustc::hir::Expr_::Repeat(..)`. | ||
|
||
Type inference must apply the sized type unless otherwise ascribed. We should implement traits like `IntoIterator` for | ||
unsized arrays, which may allow us to improve the ergonomics of arrays in general. | ||
|
||
Translating the MIR to LLVM bytecode will produce the corresponding `alloca` operation with the given type and number | ||
expression. It will also require alignment inherent to the type (which is done via a third argument). | ||
|
||
Because LLVM currently lacks the ability to insert stack probes, the safety of this feature cannot be guaranteed. It is | ||
thus advisable to keep this feature unstable until Rust has a working stack probe implementation. | ||
|
||
# How we teach this | ||
[teaching]: #how-we-teach-this | ||
|
||
We need to extend the book to cover the distinction between sized and unsized arrays and especially the cases where | ||
type ascription is required. Having good error messages in case of type error around the sizedness of arrays will also | ||
help people to learn the correct use of the feature. | ||
|
||
While stack probes remain unimplemented on some platforms, the documentation for this feature should warn of possible | ||
dire consequences of stack overflow. | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
- Even more stack usage means the dreaded stack limit will probably be reached even sooner. Overflowing the stack space | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the flip side, this might reduce stack usage for users of |
||
leads to segfaults at best and undefined behavior at worst (at least until the aforementioned stack probes are in | ||
place). On unices, the stack can usually be extended at runtime, whereas on Windows main thread stack size is set at | ||
link time (default to 1MB). The `thread::Builder` API has a method to set the stack size for spawned threads, however. | ||
|
||
- With this functionality, trying to statically reason about stack usage, even in an approximate way, gains a new | ||
degree of complexity, as maximum stack depth now depends not only on control flow alone, which can sometimes be | ||
predictable, but also on arbitrary computations. It certainly won't be allowed in MISRA Rust, if such a thing ever | ||
happens to come into existence. | ||
|
||
- Adding this will increase implementation complexity and require support from possible alternative implementations / | ||
backends (e.g. MIRI, Cretonne, WebASM). However, as all of them have C frontend support, they'll need to implement such | ||
a feature anyway. | ||
|
||
# Alternatives | ||
[alternatives]: #alternatives | ||
|
||
- Do nothing. Rust works well without it (there's the issue mentioned in the "Motivation" section though). `SmallVec`s | ||
work well enough and have the added benefit of limiting stack usage. Except, no, they turn into hideous assembly that | ||
makes you wonder if using a `Vec` wouldn't have been the better option. | ||
|
||
- make the result's lifetime function-scope bound (which is what C's `alloca()` does). This is mingling two concerns | ||
together that should be handled separately. A `'fn` lifetime will be however suggested in a sibling RFC. | ||
|
||
- use a special macro or function to initialize the arrays. Both seem like hacks compared to the suggested syntax. | ||
|
||
- mark the use of unsized arrays as `unsafe` regardless of values given due to the potential stack overflowing problem. | ||
The author of this RFC does not deem this necessary if the feature gate is documented with a stern warning. | ||
|
||
- allow for some type inference with regards to sizedness. This is likely to lead to surprises when some value ends up | ||
unsized when a sized one was expected. | ||
|
||
- Copy the design from C `alloca()`, possibly wrapping it later. This doesn't work in Rust because the returned | ||
slice could leave the scope, giving rise to unsoundness. | ||
|
||
- Use escape analysis to determine which allocations could be moved to the stack. This could potentially benefit even | ||
more programs, because they would benefit from increased allocation speed without the need for change. The deal-breaker | ||
here is that we would also lose control to avoid the listed drawback, making programs crash without recourse. Also the | ||
compiler would become somewhat more complex (though a simple incomplete escape analysis implementation is already in | ||
[clippy](https://github.com/Manishearth/rust-clippy). | ||
|
||
# Unresolved questions | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. C's There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My understanding is that that limitation exists primarily to allow naive single pass compilers to exist (along with some "interesting" ways of implementing |
||
[unresolved]: #unresolved-questions | ||
|
||
- does the MIR need to distinguish between arrays of statically-known size and unsized arrays (apart from the type | ||
information)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @eddyb's point was important that in a
[val; n]
literal,n
should only be allowed to be a runtime (non-constant) expression when the expected type of the literal is known to be[T]
(rather than known to be[T; n]
, or not known).That is:
In other words, you need to explicitly request it.
The alternative of giving the expression a different inferred type depending on whether rustc can see that the number-expression is a constant expression seems fickle and prone to surprises.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point; I'll change the wording accordingly.