Edition 2024: don't special-case diverging blocks as much #123590
I wonder about maybe special-casing keyword control flow here. Trying to push a "no, you can't write Given that we could change let Some(x) = foo else {
return;
}; everywhere, especially since we've been pushing people to that. |
This comment was marked as outdated.
🔔 This is now entering its final comment period, as per the review above. 🔔 |
I think I want to be the procedural wet blanket here and say that this is moving too fast and has too many unknowns to plausibly land in 2024. I don't remember seeing it on the concept tracking boards, and this is ultimately a very large style change to Rust. I'd prefer if this was targeting e2027 instead to have more time to get the lints correct and to let people experiment with the new style on nightly. |
Had someone reach out to me about a similar point to what @riking has brought up, and so I just want to clarify: The original FCP comment included this phrasing:
I want to be sure that this means that what is being approved here is general agreement that something along these lines is desirable, and experimentation in nightly. However, there still needs to be a follow-up FCP in the future for T-Lang to actually commit to all the details of how this will be landed in its final form for the new edition. I'm specifically not filing a concern here under the assumption that what I wrote above is correct. If it is not, then I do think we need to clarify the process of how the details will be worked out later as a blocker to landing this FCP. |
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. This will be merged soon. |
Here is an artificial use case that can no longer work (and has no fix) if the change in this PR goes through. I don't know if any real-world use-cases look like this, but I think that the existence of this potential use case is concerning.
    macro_rules! my_match {
        ($result:expr, $ok_fn:expr, $err_fn:expr) => {
            match $result {
                Ok(x) => {
                    $ok_fn(x);
                }
                Err(e) => {
                    $err_fn(e);
                }
            }
        };
    }
    fn diverge<T>(_x: T) -> ! {
        panic!()
    }
    fn id<T>(x: T) -> T {
        x
    }
    fn different_type_arms(res: Result<i32, String>) {
        my_match!(res, id, id);
    }
    fn diverging_arms(res: Result<i32, String>) -> ! {
        my_match!(res, diverge, diverge);
    }
The Are there real world use cases that have this problem? |
@theemathas I don't think this is a real problem, seems like a very niche use case to want to ignore all return values and diverge in different calls to the same macro. It's also trivial to change the macro to support specifically the diverging case... |
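For illustration, here is one way the macro from the previous comment might be adapted to cover the diverging case. This is only a sketch: the `diverging:` prefix rule is a hypothetical addition made up for this example, not something proposed in the thread. In that rule the callbacks are called in tail position, so the `match` is typed `!` without relying on the "block always diverges" special case.

```rust
// Hypothetical variant of `my_match!` with an extra rule for diverging callbacks.
// The `diverging:` prefix is an illustrative assumption, not part of the original macro.
macro_rules! my_match {
    // The calls are the arm expressions themselves (no block, no `;`),
    // so each arm has the callback's return type.
    (diverging: $result:expr, $ok_fn:expr, $err_fn:expr) => {
        match $result {
            Ok(x) => $ok_fn(x),
            Err(e) => $err_fn(e),
        }
    };
    // Original rule: results are discarded, so callbacks of different types are fine.
    ($result:expr, $ok_fn:expr, $err_fn:expr) => {
        match $result {
            Ok(x) => {
                $ok_fn(x);
            }
            Err(e) => {
                $err_fn(e);
            }
        }
    };
}

fn diverge<T>(_x: T) -> ! {
    panic!("diverged")
}

fn id<T>(x: T) -> T {
    x
}

fn different_type_arms(res: Result<i32, String>) {
    my_match!(res, id, id);
}

fn diverging_arms(res: Result<i32, String>) -> ! {
    // No trailing `;`: the macro call is the tail expression of the function body.
    my_match!(diverging: res, diverge, diverge)
}
```

The cost of this design is that callers have to pick the rule explicitly, which is the kind of churn the earlier comment was worried about.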
On the procedural question of what lang agreed to here, including the context from our meetings, the final FCP represents agreement to Option 2:
Since Rust 2024 is still in nightly, landing this PR represents landing this in nightly (and with a presumption that, barring surprises, this will become part of stable Rust 2024). Landing this in nightly Rust 2024 will allow testing this to be part of the testing and validation process for Rust 2024. Given that the FCP has completed, assuming this PR has been updated to be compliant with Option 2, it can be merged as far as lang is concerned. Switching hats, and putting on the one for edition management, to facilitate edition testing, we would prefer to see the machine-applicable migration lints land at approximately the same time as this PR does. |
@WaffleLapkin and I discussed the question of what set of macros in std/core should be updated to include |
@rustbot labels +I-lang-nominated We FCPed Option 2, which contained this language:
@workingjubilee has now identified that, while we can do this at the time of type checking, there's no way to change the expansion of
    macro_rules! my_panic {
        () => { return std::process::abort() };
    }
    const _: () = {
        my_panic!();
        //~^ ERROR return statement outside of function body
    };
(Great catch.) Let's renominate to discuss what we want to do. At a minimum, we would need to amend our Option 2 consensus to acknowledge that |
Dear Waffle,
Thanks for the work you put into this issue and this writeup in particular. I tend to agree with you that this change is more trouble than it's worth at this exact moment in time.
On Mon, Jun 3, 2024, at 2:06 AM, Waffle Maybe wrote:
I should not be writing this at this hour, but I need to get it out of my system, so I can sleep soundly.
I do not think this proposal, as currently understood and accepted by T-lang, is worth doing. As such, *I'm closing this PR, as I do not want to work on something I do not believe in.* I *would* however, want to work on other things (as I describe later) related to the core idea underlying this proposal (given enough rest, as those are not "edition-2024" time sensitive and I do need rest).
Below is my reasoning for why the "current T-lang proposal" is not worthwhile.
For full context, here are the current rules for typing blocks (as far as I know these are not documented anywhere; neither the rust reference <https://doc.rust-lang.org/reference/expressions/block-expr.html> nor the rust book <https://doc.rust-lang.org/stable/book/ch03-05-control-flow.html?highlight=block#control-flow> nor the ferrocene specification <https://spec.ferrocene.dev/expressions.html#block-expressions> mentions them in full) (this ignores `break 'block_label`, because it behaves mostly like the tail expression and is largely irrelevant to the discussion at hand):
>
> If the block has a tail expression, its type is the type of the tail expression.
> If the block always diverges (loosely defined as "all possible branches have an expression of type `!` somewhere in them"[1][2]), its type is `!`.
> Otherwise its type is `()`.
>
Now, let's outline what the "current T-lang proposal" is[3][4]:
> In Rust edition `{next}`, change the rule for typing blocks to the following:
>
> If the block has a tail expression, its type is the type of the tail expression.
> If the block's last statement is a semicolon-statement and its expression is `return`, `continue`, or `break`, its type is `!` (additionally, `panic`-like macros expand to a `return` expression, and can fulfill this rule).
> Otherwise its type is `()`.
>
Note that "a block" does not necessarily imply a literal block expression. It can be a part of another expression or statement kind, such as an `if` (`let`-`else` is the other example that comes to mind):
    // this currently compiles
    let a: u8 = if true {
        return;
    } else {
        return;
    };
As a side note: it is quite annoying to test what type a block has, because just printing the return type of a closure wrapping the block does not work, since the current never type fallback behavior makes `!` coerce to `()`, so the experiment does not show anything. Instead, the way to check is to assert that the type is something other than `()` (I tend to use `u8` as one of the shortest stable types other than `()`).
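The annotation trick described above can be applied to each of the three current rules. The following is a minimal sketch; the function names are made up for illustration, and each function only type-checks if the annotated block really has the stated type.

```rust
// Rule 1: a tail expression gives the block its type.
fn tail_expression() -> u8 {
    let value: u8 = {
        let base = 40;
        base + 2
    };
    value
}

// Rule 2 (current behaviour): neither branch has a tail expression, but every
// path diverges, so both blocks are typed `!`, which coerces to the annotated `u8`.
#[allow(unreachable_code, unused_variables)]
fn diverging_without_tail(flag: bool) -> &'static str {
    let probe: u8 = if flag {
        return "then-branch returned";
    } else {
        return "else-branch returned";
    };
    "unreachable"
}

// Rule 3: no tail expression and not diverging, so the block is `()`.
fn unit_block() {
    let unit: () = {
        let _ignored = 1;
    };
    unit
}
```

Under "my original proposal" the second function would stop compiling, because both branch blocks would be typed `()` rather than `!`.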
For comparison here is my original proposal:
> In Rust edition `{next}`, change the rule for typing blocks to the following:
>
> If the block has a tail expression, its type is the type of the tail expression.
> Otherwise its type is `()`.
>
Next, let's see *my interpretation* of why T-lang decided on the "current T-lang proposal"[5]. T-lang generally agreed that these "always diverges" rules do not feel like "Rust", in other words they are not consistent with the rest of the language (as rust generally does not use control-flow analysis for type checking[6]) and that the current rules can be confusing and surprising. *Especially* T-lang did not like `let`-`else` examples with function calls:
    // this currently compiles
    let Some(id) = last.checked_add(1) else {
        exhausted();
    };
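To make the snippet above self-contained, here is a sketch with a stand-in `exhausted` function (its body is an assumption; the comment does not show it). The first function compiles today only because the `else` block is known to diverge. The second spells the same thing with the call as the tail expression, which gives the block type `!` under every rule discussed here.

```rust
// Stand-in for any function returning `!`; the body is assumed for illustration.
fn exhausted() -> ! {
    panic!("id space exhausted")
}

// Relies on the "block always diverges" rule: the `else` block has no tail expression.
fn next_id_with_semicolon(last: u32) -> u32 {
    let Some(id) = last.checked_add(1) else {
        exhausted();
    };
    id
}

// The call is the tail expression, so the block's type is the call's type, `!`.
fn next_id_tail_expression(last: u32) -> u32 {
    let Some(id) = last.checked_add(1) else {
        exhausted()
    };
    id
}
```

The only difference between the two is one `;`, which is why the machine-applicable fix mentioned in the PR description is to remove it.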
*However*, there are two main concerns. First one is "too much churn", especially given that `rustfmt` currently adds a `;` after `return`, which would be a semantic change with "my original proposal". Second one is that this would break "inline function at the end"[7].
Based on these concerns (primarily the churn one) T-lang decided to go with a "middle ground" proposal (aka "current T-lang proposal").
As was identified by @workingjubilee and mentioned by @traviscross, the proposal is not actually implementable -- `panic!` can't expand to a `return` because it can be used in `const`s, where `return` can't be used[8].
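A small sketch of why this matters: `panic!` is usable in const contexts, including the initializer of a `const` item, which is not a function body and therefore has nothing to `return` from. The names `checked_half` and `HALF` are made up for this example.

```rust
// `panic!` is allowed inside a `const fn`.
const fn checked_half(n: u32) -> u32 {
    if n % 2 != 0 {
        panic!("n must be even");
    }
    n / 2
}

// Evaluated at compile time.
const HALF: u32 = checked_half(8);

// A `const` item initializer is not a function body, so a `return` here would be
// rejected, while `panic!` is fine. This is the usual shape of a compile-time assertion.
const _: () = {
    if HALF != 4 {
        panic!("unexpected value");
    }
};
```

If `panic!` expanded to something containing `return`, the last item would fail with the "return statement outside of function body" error shown in the `my_panic!` example earlier in the thread.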
But ignoring that, at first glance "current T-lang proposal" achieves some of the simplification, makes the way for future improvements, all while not doing too much breakage. *I want to argue that this is not the case.*
Firstly, I would argue that "current T-lang proposal" does not actually do that much simplification. As you might notice by the description above, it is only a little bit "simpler", in that the reasoning is very *local* (you only need to check the last statement, rather than all of them, recursively), however it is still a mile behind "my original proposal". *Moreover*, I would say that it's *less intuitive*, as "current T-lang proposal" depends on a special-case (`return`/`break`/`continue`) rather than a more fundamental property ("always diverges").
Secondly, I do not think it makes way for the future improvements. The problem with "my original proposal" is that most of the breakage[9] is `return;` and it's too much breakage. This problem does not go anywhere after accepting "current T-lang proposal". The difference in breakage is so big, that after fixing breakage from "current T-lang proposal", breakage from "my original proposal" stays mostly the same.
Thirdly, I want to remind the reader that any breakage has an inherent cost. I.e. the "cost" of a breaking change has both a component depending on the amount of breakage (say `f(x)`, where `x` is the amount[10]) *and* a constant component (say `C`), where `C` represents the cost of implementing/maintaining/documenting/learning/keeping in mind the change. So breaking the same thing twice can be worse than breaking it in a worse way once.
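The cost argument can be written out as a small comparison. This is only a sketch: the names `x_1` and `x_2` (the breakage of the first and second change) are introduced here for illustration, and the last line assumes `f` is linear, which the author explicitly leaves open.

```latex
\begin{align*}
\text{two changes:} \quad & \bigl(f(x_1) + C\bigr) + \bigl(f(x_2) + C\bigr) = f(x_1) + f(x_2) + 2C \\
\text{one change:} \quad & f(x_1 + x_2) + C \\
\text{difference, if } f(x_1 + x_2) = f(x_1) + f(x_2)\text{:} \quad & \bigl(f(x_1) + f(x_2) + 2C\bigr) - \bigl(f(x_1 + x_2) + C\bigr) = C > 0
\end{align*}
```

With the arbitrary numbers from the footnotes (3 things changed in 2024, then 20 in 2027, versus 23 in 2027), the two-step path pays the constant cost `C` twice for the same total amount of breakage.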
Lastly, I want to summarize this with a graph[11]:
Untitled-2024-05-31-0807.png (view on web) <https://github.com/rust-lang/rust/assets/38225716/617cad6f-4dcd-49a1-905c-bb78c9d69d93>
Yes, if we had a time machine, we would start the language in the green circle/"ideal state"/"my original proposal". Yes, blue square/"accepted proposal"/"current T-lang proposal" is technically better than the current state[12].
But it does not mean that going from where we are to "current T-lang proposal" is a good thing. If anything it burdens everyone with the constant cost of a breaking change without enough justification for it, causing us to forever have at least 2 different "bad" semantics for blocks. That is, I do not think that it on *its own* gives us a noticeably better language to justify the breakage. It only makes sense in the context of future bigger breakage, which, as established before[13], it doesn't help to do.
*In conclusion: I do not think "current T-lang proposal" is a good direction for the language.* It especially does not make sense for 2024 edition. If we want to get a more sane semantics for block typing, we need to find ways to get to the "ideal state" in one breaking change.
Now, if we don't do anything, nothing will ever change and in the `{next+1}` edition we'll have the same exact situation. This comment wouldn't be complete without things that we *can* do in my opinion. Here are some ideas of things that might help us change the status quo.
1. Changing rustfmt default to preserve `;` rather than add it on `return`s and similar expressions
• I think this is a good idea even if we don't end up doing anything else -- adding `;` *seems* like a semantic change, we shouldn't do it even if it's technically not a semantic change because of current edge cases
2. Documenting the current behavior
• At least in rust reference
• But ideally also in other places
3. An (allow-by-default) lint. This would allow two things
• An ability for crates to say "I don't want to depend on the weird block semantics"
• An ability for crates to port their code at their pace
4. Better machine-applicable-fix support
• What is currently implemented would cover most of the cases, but not some of the more edge-y ones (like having inline functions at the end or having dead code)
5. An opt-in to change the block behavior independently of an edition
• Not sure if it's actually a good thing, but...
• It would allow people to use the sane semantics without requiring us to break everyone who wants to switch to a new edition
6. Blog post(s) to notify people (at least the ones reading blog posts and places which will share information from them) that we are planning to do this in the first place
Additionally I want to highlight that *not doing anything is also fine*. Yes, the current state is weird and somewhat annoying, but it is *fine*. The semantics are not actively bad, they are merely a bit surprising (and even that more in "why this compiles" and *not* "how does the code behave").
Finally I want to apologize if this comment sounds too angry, harsh, verbose, or negative. I'm very exhausted from talking and arguing about this, from thinking about everything described here (but I'm not mad at anyone who's participated in this discussion, this is an outcome of my actions and the system/reality as a whole, rather than the fault of anyone in particular). I'm ashamed of wasting everyone's time on this proposal and then retracting it (even though I understand that such is life -- the only way to learn is by *doing*, and doing necessarily implies failing at least sometimes). So I've tried being as verbose and clear as possible, so I can remove this topic from my working memory and focus on other things. Additionally, as I'm writing this very line, it's 6:57 in my timezone and I haven't slept yet, but I couldn't sleep with all of this on my mind.
With that, thanks everyone for reading this "comment on 'GitHub' social network" and participating with these proposals. [Loud "Close with comment" button click].
Footnotes
1. "all possible branches" is an important distinction, there are cases where a block does not contain any semicolon-statements which wrap an expression of type `!`, but the block does have a type `!`: example <https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=df0c159dbde23223b900c46de1a41387>
2. The proper definition would define an "always diverges" property and then define rules of how it propagates for all expression and statement kinds
3. I'm calling it "current" and "T-lang" to distinguish from previous or future proposals and from current ideas of other entities. This will of course become slightly ambiguous if T-lang changes its mind, but I did not figure out a better name
4. I've removed the edition from this and other proposals, because it is not the substance of the proposal. `{next}` signifies some future edition, which has not been published yet
5. My interpretation is based on the two triage meetings where this was discussed: 2024-05-01 ***@***.***/rkF1tAkzC#Edition-2024-dont-special-case-diverging-blocks-rust123590> and, less importantly, 2024-05-08 ***@***.***/H1c47MYf0#Edition-2024-dont-special-case-diverging-blocks-rust123590>
6. Although it *is* used for borrow checking (which is to be expected, lifetimes are much more closely related to control flow)
7. There was a proposal to solve this by allowing items after the trailing expression, which I would describe as (pardon my language, not trying to insult anyone) *completely bonkers*. This is the *opposite* of trying to be less confusing, as this makes the "tail expression" not visually the tail of the block. I really struggle with trying to comprehend why this is a good change
8. And I'm not counting either "make `panic` expand to `return` or not depending on if it's in a `const`" or "make `panic_2024` which does not work in `const`s" as worthwhile workarounds to this. Additionally expanding to `return` feels *icky*, it *calls* for unexpected consequences
9. I really want to say >95%, but I don't actually have the numbers, so this would be misleading
10. I *want* to say that it should be linear, but is it?
11. Small notes: the graph is normalized at the current state, i.e. it is "0", this does not mean that it's the worst possible thing. It's also very eyeballed, obviously neither breakage nor goodness is an integer value, this just describes my feelings in a visual form
12. Although this is not a definite thing, see my point above about "always diverges" being more fundamental. To me these states feel about the same tbh. One is more surprising, while the other is more arbitrary, what a great choice
13. As a reminder "my original proposal" requires so much more breakage, that the "current T-lang proposal" basically does not help at all. Changing 3 things in 2024 and then 20 things in 2027 is worse than changing 23 things in 2027 (numbers arbitrary)
|
Normally, when a block has no tail expression, its type is `()`. However, this is not the case when the block is known to diverge, in which case its type becomes `!`[1].
I think that this is a useless special case that unnecessarily complicates the language. I propose that we remove it in the next edition.
Note that you can always fix the code that used this special case by removing a `;` (and removing dead code, if there is any). We already have a machine-applicable fix (as can be seen in the test added in this PR).
Also note that while this is related to the never type, this change is independent of the work on its stabilization. I personally think that if we are going to stabilize `!`, we should make it less weird and this is one of the ways to do that; but this is not required for `!` stabilization and is a completely separate cleanup.
The only hard part of this change is that `rustfmt` currently adds `;` to `return`s which are at the end of blocks. We'll have to change rustfmt style in this regard in order to support the next edition. Our options are:
• keep an existing `;`, but not add `;` (this won't break any existing users)
• add `;` in the edition<=2021 and remove `;` in the edition>=2024 (this will automatically fix code when porting to the new edition) (does rustfmt even support editions?...)
• always remove `;` (this will break all existing formatting, I don't think that's desirable)
Tracking:
r? compiler-errors
Footnotes
1. It then immediately decays to `()` due to the current never type fallback, but it does not matter -- you can still specify any type and the block will be coerced to that