-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Add #[loop_match]
for improved DFA codegen
#138780
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add #[loop_match]
for improved DFA codegen
#138780
Conversation
Some changes occurred in match checking cc @Nadrieril Some changes occurred in compiler/rustc_passes/src/check_attr.rs Some changes occurred in cc @BoxyUwU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @folkertdev for putting up this PR. The big picture looks right, in terms of the behavior of the tests and how to approach the experiment in terms of starting with the attributes for thiis.
This is a first partial pass on the details.
@rustbot author
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed review!
I've fixed a bunch of the low-hanging fruit (e.g. in the tests). For the actual pattern matching logic, I have a branch with what I believe is a better solution that re-uses more existing pattern matching infra. We'll come back to that here once björn has had a chance to look at it.
Some changes occurred in exhaustiveness checking cc @Nadrieril Some changes occurred in match lowering cc @Nadrieril |
☔ The latest upstream changes (presumably #138974) made this pull request unmergeable. Please resolve the merge conflicts. |
368f722
to
a89dcbe
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
f294773
to
6fe6909
Compare
This comment has been minimized.
This comment has been minimized.
b3a87ed
to
7d88da4
Compare
We've done a bunch of work here, and I believe all of the earlier review comments have now been dealt with. @rustbot ready |
@rustbot author As a lang matter, this is looking reasonable to me in terms of a lang experiment. As an impl matter, this is starting to look not unreasonable to me, but I'd like for @Nadrieril to also have a look if he's able. r? @Nadrieril @Nadrieril: I still need to raise this in a lang meeting to confirm that everyone is happy to see the experiment here in light of earlier objections, so please don't merge this just yet. You can leave it back in my hands after you're happy with the impl. Also CC @oli-obk as this work is carrying over some |
Reminder, once the PR becomes ready for a review, use |
I continue to be skeptical of the loop-match phrasing. I don't think "no new syntax" is a good argument with "well except it adds 2 new attributes and only supports limited patterns and needs custom pattern lowering", because at that point it's essentially new syntax anyway. I'm still inclined to say that this should be working on block labels that can be jumped between, making those labelled blocks directly the states in the state machine. But none of that is experiment-blocking. I'm fine with an experiment moving forward so we can see more real examples of what you want to do with this, and less "well that's just |
@rustbot labels -I-lang-nominated +I-lang-radar We discussed this in our lang call today. We're approving this experiment. I'll continue to be the lang "champion". To emphasize what @scottmcm said above, which he also said in discussion, he'd like to see more motivating examples for why keeping the jump labels in the value space is important. As we go forward here and work to write design meeting documents and to revise the RFC, this is something that we'll want to keep in mind and be sure to address. |
Kinda, yeah. The crucial difference is that it doesn't need any changes to the parser and formatter, or any tooling like rust-analyzer. So the impact is much smaller.
Sure. In short, the main motivating example here is the zlib-rs decompression state machine. It is a streaming algorithm, meaning that it must be able to pause and save its state when it runs out of input, so it can resume again when more input arrives. Beyond the rust value language being a lot more expressive than the one for labels, the ability to easily store the state is the crucial reason to us for using enums, and not labels. The code is at https://github.com/bjorn3/zlib-rs/blob/loop_match_attr/zlib-rs/src/inflate.rs#L871, just search for
Right. With this PR more or less complete, maybe we should strategize a bit what good concrete next steps are? |
Does the feature not intend to go further than that? Again, I'm just saying "btw, here are my ideas for implementing the more complete version of this". Current state is ok for the experiment.
Will do, I had plans along these lines already :) No promises on the delay tho. |
We'll cover all types eventually, so we'll keep that in mind.
Sounds good! I'd be happy to do actual implementation work too, but in terms of design you'll have a much better idea of how the pieces do/should fit together. |
This comment has been minimized.
This comment has been minimized.
71bf446
to
516c380
Compare
☔ The latest upstream changes (presumably #123948) made this pull request unmergeable. Please resolve the merge conflicts. |
516c380
to
2ea5bc8
Compare
It had skipped my mind that I was the one assigned to this. I have reviewed and approve of the r? compiler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops thought I submitted my reviews a couple weeks ago
[first @ last] | [first, .., last] => dcx | ||
.emit_fatal(LoopMatchBadStatements { span: first.span.to(last.span) }), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this not error on:
#[loop_match]
loop {
struct A;
struct B;
state = 'a: match {
...
}
}
Even though attempting to support it down below. I think it'd make sense to just not support defining items here tbh. The full feature will have actual loop match
syntax so why go out of our way to support this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. We do support it below because it is useful to be able to define macros that capture the label of the labeled block. Here we've not needed it, so we're stricter than necessary but for now that should be OK.
hir::ExprKind::Break(dest, ref value) => { | ||
let is_const_continue = self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're writing const continue
as break
so that you dont have to go and change all of HIR to have special nodes for const continue and loop match with their own typing rules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, we minimize code duplication in this way, and also const continue
as syntax has not yet been agreed on.
/// A `#[loop_match] loop { state = 'blk: { match state { ... } } }` expression. | ||
LoopMatch { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you have const_continue
but no const_loop_match
? is const loop match unnecessary for getting codegen improvements from the direct jumps with const continues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only const_continue
is needed (really because in rust you can't add more patterns to a match
at runtime): the "const" part of the continue just makes sure that we know statically which arm of the match to jump to. Nothing special is needed from the loop.
☔ The latest upstream changes (presumably #141883) made this pull request unmergeable. Please resolve the merge conflicts. |
244a754
to
5d84fca
Compare
This comment has been minimized.
This comment has been minimized.
@rustbot ready Or at least, I hope I've answered the questions? Otherwise I don't think there are any actions for me? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I had asked for in earlier reviews seems addressed, so this looks good to me.
☔ The latest upstream changes (presumably #142443) made this pull request unmergeable. Please resolve the merge conflicts. |
c4b97f4
to
b2c9fe2
Compare
Co-authored-by: Folkert de Vries <[email protected]>
b2c9fe2
to
a774736
Compare
tracking issue: #132306
project goal: rust-lang/rust-project-goals#258
This PR adds the
#[loop_match]
attribute, which aims to improve code generation for state machines. For some (very exciting) benchmarks, see rust-lang/rust-project-goals#258 (comment)Currently, a very restricted syntax pattern is accepted. We'd like to get feedback and merge this now before we go too far in a direction that others have concerns with.
current state
We accept code that looks like this
#[loop_match]
: normalcontinue
andbreak
continue to work#[const_continue]
is only allowed in loops annotated with#[loop_match]
future work
break
valuemaybe future work
continue 'label value
syntax, which#[const_continue]
could then use.State::Initial
)break
/continue
expressions that are not marked with#[const_continue]
r? @traviscross