Draft: Add unstable formatter #10971
It might help to split some of these commits into their own PRs. For instance, the parenthesized node stuff |
Logically squashing commits would probably help too |
This pull request introduces 2 alerts when merging 6a0d258 into 0ddca4d - view on LGTM.com new alerts:
|
Force-pushed: be044f5 to 1a6b24d
What is missing to help put comments in the right places? |
It's complicated; I can't pinpoint it to exactly one thing. A bit about how the comment re-adding heuristics work: Basically everything are
Then, for every line, I check whether there is a comment block above
and whether a comment is inline and after a line
And after the codeblock I check if there is a trailing code block.
This works as long as you do normal stuff and don't put comments in weird locations like this one: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/meson.build#L540 But at least it is the only comment removed in this big meson file |
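A minimal sketch of the line-based heuristic described above, not the PR's actual code: every code line gets the comment block directly above it and its inline comment, and whatever comments remain at the end are treated as trailing. The names `LineComments` and `collect_comments` are hypothetical, and the naive `#` split assumes no `#` inside string literals.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LineComments:
    """A code line plus the comments the heuristic attaches to it (hypothetical type)."""
    code: str
    above: List[str] = field(default_factory=list)
    inline: Optional[str] = None


def collect_comments(source: str) -> Tuple[List[LineComments], List[str]]:
    """Return (entries, trailing): per-line comment attachments and leftover comments.

    Assumption: '#' never occurs inside a string literal, so a plain split is enough.
    """
    entries: List[LineComments] = []
    pending: List[str] = []  # comment block seen since the last code line
    for raw in source.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue  # blank lines carry no comments
        if stripped.startswith("#"):
            pending.append(stripped)
            continue
        code, sep, comment = stripped.partition("#")
        inline = "#" + comment if sep else None
        entries.append(LineComments(code.rstrip(), pending, inline))
        pending = []
    return entries, pending  # anything still pending trails the last code line
```

A comment placed in the middle of a multi-line expression has no code line of its own in this scheme, which is why unusual placements like the one linked above are the ones that get lost.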
So basically just more checks for comments is all that is missing? |
A bit simplified, but yes, that's correct. |
Does it make sense to add a CI pass that does the following:
As the formatter is platform-independent, it just would have to run on e.g. linux and e.g. only if the file formatter.py was changed |
@annacrombie what are your thoughts here? |
I just checked and muon also deletes comments after an opening paren, e.g.
    a = ( # comment
    1
    )
This is impossible to handle with the strategy both muon and this PR use, because you only have so many nodes to attach comments to, but you can comment anywhere inside a parenthesized expression. For example, this expression:
corresponds roughly to the following AST:
Which gives us 7 nodes to attach comments to. But it is entirely possible to write:
    ( # comment 1
    # comment 2
    # comment 3
    a # comment 4
    = # comment 5
    # comment 6
    ( # comment 7
    1 # comment 8
    + # comment 9
    2 # comment 10
    ) # comment 11
    # comment 12
    ) # comment 13
Which already greatly exceeds the number of slots we have, and this example could technically have an unlimited number of comments inside the () anyway. muon partially solves this problem for lists, by adding another special formatting node
Unrelated, but @JCWasmx86, how would you feel about renaming some of the formatting options, e.g. |
This pull request introduces 2 alerts when merging a3be218 into 8dfa550 - view on LGTM.com new alerts:
Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. Please enable GitHub code scanning, which uses the same CodeQL engine ⚙️ that powers LGTM.com. For more information, please check out our post on the GitHub blog. |
Thanks a lot for your thoughts.
As you can see from these huge meson code bases (assuming they use a lot of the syntax), we only have quite a low error rate. For gstreamer, 36 of those are from one bug; the other 6 are more difficult to solve. (*) Those were 42, but I fixed a bug, so only 6 comments are lost now.
I have a lot of special cases too, in order to improve the formatting output, but I rely more on heuristics and a bit of praying. But interesting approach, I will look into it.
I would say we shouldn't optimize for the 0.001% of insane code. Sure, you could consider it a bug, but in the end you would probably invest a few hours or days to get similar formatting for this one insane case, when the same time could have benefited the vast majority of users. I think those really-really-edge cases aren't important right now.
Yes, just go ahead, I will follow you :) |
In addition to that, I did a few experiments regarding idempotency. After at most 3 iterations, every project I tested converged to one formatting, so it would probably be wise to do something like:
Sure, it would cost a bit of performance, but I think it is worth it |
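The snippet from that comment did not survive, so the following is only a sketch of the fixed-point idea it describes: re-run the formatter until the output stops changing, with a cap on the number of passes. `format_until_stable`, `format_once`, and `max_passes` are hypothetical names, not part of the PR.

```python
from typing import Callable


def format_until_stable(
    code: str,
    format_once: Callable[[str], str],
    max_passes: int = 3,
) -> str:
    """Apply format_once repeatedly until the text stops changing.

    max_passes caps the work; the default of 3 mirrors the observation above
    and is an assumption, not a guarantee of convergence.
    """
    for _ in range(max_passes):
        formatted = format_once(code)
        if formatted == code:
            break  # fixed point reached: another pass would change nothing
        code = formatted
    return code
```

The cost is at most `max_passes` formatter runs per file, and an already formatted file exits after the first comparison.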
Codecov Report
@@ Coverage Diff @@
## master #10971 +/- ##
=======================================
Coverage 68.58% 68.58%
=======================================
Files 412 412
Lines 87861 87861
Branches 20728 20728
=======================================
Hits 60261 60261
Misses 23093 23093
Partials 4507 4507
|
Add a script for automatically formatting a project, counting the number of lost comments and how long it takes to converge to one formatting, example results:
This shows that probably after five passes the vast majority of files won't change |
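The script itself is not shown here, so this is only a rough sketch of the two statistics it is said to report: how many comments a formatter drops and how many passes change the file before it settles. `count_comments` and `measure` are hypothetical names; the comment counter only understands single-line `'...'` strings and ignores multi-line strings.

```python
from typing import Callable, Tuple


def count_comments(source: str) -> int:
    """Count lines containing a '#' comment, skipping '#' inside '...' strings (naive)."""
    count = 0
    for line in source.splitlines():
        in_string = False
        i = 0
        while i < len(line):
            ch = line[i]
            if in_string and ch == "\\":
                i += 2  # skip the escaped character inside a string
                continue
            if ch == "'":
                in_string = not in_string
            elif ch == "#" and not in_string:
                count += 1
                break  # the rest of the line belongs to the comment
            i += 1
    return count


def measure(
    source: str,
    format_once: Callable[[str], str],
    max_passes: int = 10,
) -> Tuple[int, int]:
    """Return (lost_comments, passes_that_changed_the_text)."""
    before = count_comments(source)
    current = source
    passes = 0
    while passes < max_passes:
        formatted = format_once(current)
        if formatted == current:
            break  # converged
        current = formatted
        passes += 1
    return before - count_comments(current), passes
```

Running `measure` over every `meson.build` file in a project and summing the results would give per-project numbers of the kind reported above.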
Currently only prints the number of comments, but will print other stats in the future (speed, comment re-adding rate, etc.)
This allows readding all comments from systemd
for a selected number of projects
I think it would help to get this reviewed if the commit history was cleaned up a bit. |
Hey, I still need to fully implement the approach described here: #10971 (comment) But yes, I can clean up the history (although I'm toying with the idea of squashing it into just one or two commits ^^) |
@JCWasmx86 I had a similar idea. However, my approach has subtle differences with yours that I think could solve or simplify some of your problems: Therefore, if you have something like this:
When the printer encounters a comment, it prints it and breaks the line. Since comments are added to the nodes by the parser, it should not be possible to have a comment in the wrong place. And since every line is represented by a node, there should be no places where comments are not possible, except between args (but we could allow a CommentNode as a positional arg) and between kwargs (not sure how to handle this...). My proof of concept is here: 72c3773 |
Interesting idea, although I don't know how well it would handle e.g. this file: https://git.sr.ht/~lattis/muon/tree/master/item/tests/fmt/crazy_comments.meson But after some more thought, I doubt the point of a second formatter. There is already muon, working perfectly, so why have a second formatter that, while approximating the formatting of muon, will probably never reach equivalent output for each possible input? |
I see the formatter itself as only one possible application of having a complete representation of the meson.build file in the AST. Another important application, imho, is to allow the rewriter to keep comments in modified nodes and, more generally, to allow writing scripts for generating and modifying meson.build files. |
I still think there is value in Meson having a formatter, but obviously don't waste your time on it if you don't see value. Until the entire world moves to muon, having to have 2 developer tools kind of sucks. |
Superseded by #12318 |
Opening this for eventual discussion.
This is a draft.
Things that are at least done partially:
format(format(code)) seems to give the same output as format(code) in around 90%
Things that aren't good at the moment:
Important things before merging:
Important non-code things before merging:
And maybe setup a CI-workflow that downloads some big meson files (Like from mesa, GNOME-Builder) to check for these things: