Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

R: Break styler dependency for code formatting #2251

Open
jmcphers opened this issue Feb 13, 2024 · 7 comments
Open

R: Break styler dependency for code formatting #2251

jmcphers opened this issue Feb 13, 2024 · 7 comments
Labels
area: formatting Issues related to formatting and indentation area: kernels Issues related to Jupyter kernels and LSP servers lang: r support

Comments

@jmcphers
Copy link
Collaborator

jmcphers commented Feb 13, 2024

Currently, we rely on styler to format R code. This has been a big time-saver when building out the initial implementation since code formatting is complicated and styler does a great job! However, this is not ideal in the long run for a number of reasons:

  • Users who can't install R packages (a surprising number) will also not be able to format code. This is a core IDE feature that should not require the installation of a 3rd party package.
  • The styler package has its own opinionated rules for formatting code that don't necessarily match the ones built into the R language pack. This means that as-you-type formatting may fight/disagree with format-on-save formatting.
  • Requiring the R package makes code formatting an awkward process the first time, since we have to prompt you to install the package, wait for it to be installed, etc.
  • We can't use styler while R is busy doing something else, since it needs R itself to run.
  • The styler package has to fully parse the code to format it, so it cannot format any selections that aren't correct R code.
@jmcphers jmcphers added this to the Release Candidate milestone Feb 20, 2024
@wesm wesm added the lang: r label Feb 29, 2024
@lionel- lionel- added the area: kernels Issues related to Jupyter kernels and LSP servers label May 16, 2024
@lionel-
Copy link
Contributor

lionel- commented Jun 7, 2024

With posit-dev/ark#329, we now have a mechanism for indentation rules. They currently only deal with some edge cases that the frontend doesn't handle well. If we implemented a complete set of rules, which should be 1 or 2 weeks of work, we could use this as a rudimentary formatter. The formatter wouldn't do any other whitespace formatting such as line splitting but it would be fast, work on invalid code, be consistent with the as-you-type indenter, and could be use for format-on-save and format-on-paste.

Until we have a more complete formatter the line-splitting aspects can be achieved manually with codegrip (it would be nice to reimplement it in Ark).

@jmcphers
Copy link
Collaborator Author

Feedback from a beta user who is frustrated with styler because we can't use it while R is busy doing something else (another drawback). https://github.com/posit-dev/positron-beta/discussions/226

@lorenzwalthert
Copy link

lorenzwalthert commented Dec 19, 2024

Hi there, author of {styler} here. I am glad you found styler useful in building out the initial implementation @jmcphers. While we are just celebrating integration of {styler} into RStudio, I was wondering if {styler} has a place as a formatter in Positron too. Thanks for outlining the reasoning as to why a native code formatter might feature better trade-offs in the initial description of the issue. I don't know if relevant, but can give you my opinion on the outlined points.

Users who can't install R packages (a surprising number) will also not be able to format code. This is a core IDE feature that should not require the installation of a 3rd party package.

I don't have enough context why people can't seem to install R packages (autonomously), but I have a hard time to believe they rely on base packages only in their daily work. Maybe admins had to allow {styler} but I see that this adds complication. Another option would be to bundle {styler} with Positron. I am not an expert, but {styler} could be installed in a lib path upon Positron installation and then when formatting is requested, the .libPath() could be set before calling {styler} via Rscript or similar.

The styler package has its own opinionated rules for formatting code that don't necessarily match the ones built into the R language pack. This means that as-you-type formatting may fight/disagree with format-on-save formatting.

While we do have opinionated rules for formatting (the ones from tidyverse style guide), every formatter has, and we have made it very easy to control invasiveness or to build your own style guide with styler. I would assume a Positron R user would want to stick to the tidyverse style guide?

Requiring the R package makes code formatting an awkward process the first time, since we have to prompt you to install the package, wait for it to be installed, etc.

Again, bundling {styler} with Positron comes to mind, no idea if/how that complicates things.

We can't use styler while R is busy doing something else, since it needs R itself to run.

Can't it just run in a background session? Formatting with {styler} in RStudio is non-blocking.

The styler package has to fully parse the code to format it, so it cannot format any selections that aren't correct R code.

True. We briefly discussed if we should support that or not. Probably at least better error messages could be useful. If you build your own formatter and you want to get rid of this problem, maybe you can rely on some language-agnostic heuristic to identify parsable sub-sections of the code since if you want to solve that problem well, it may be complex.

Anyways, building {styler} was a lot of work and is not really complete, but you also have a lot of (FTE) resources and probably smarter ways of implementing a formatter, so maybe it's the better decision for Positron to start from scratch. I guess if you take the time to build a formatter, exposing a CLI or R API would be also nice, so the formatter can be used independently of the IDE. {styler} is also a pure R implementation, I guess you can leverage language agnostic tooling (to reduce complexity and get runtime speedups) for formatting if you don't require it to be pure R.

Having said that, I am excited to see where the Positron journey leads, no matter how code is styled here. I already tried it out and it seems to be a nice blend between RStudio and VS Code. Looking forward to version 1.0.

@juliasilge
Copy link
Contributor

Thank you so much for sharing your thoughts @lorenzwalthert! The styler package has had a big impact in the R community, and obviously on our work on Positron as well. I was the one who built the initial formatting support in #1686 and I had a great experience building tooling here on top of styler.

We do have plans underway for a different approach for formatting R code that does not depend on R itself, as we have decided that is a better choice in the medium/long term. Specifically, we do need an approach that is more performant than using R itself to format. If you are interested in where we are today, take a look at https://github.com/posit-dev/air, although be aware that it is early stage and is rough around the edges currently. There is in fact a CLI that can be used as an external formatter in RStudio, through the same new mechanism that makes RStudio's styler support possible. There is also a not-yet-released VS Code extension for use in Positron and VS Code.

I do want to thank you again for all the ways you have made formatting in R possible and have brought more modern practices to the R community. Please do let us know if you have further questions, or of course feedback if you use Positron!

@lorenzwalthert
Copy link

Thanks @juliasilge for the context. I am aware that every tool has their life cycle so I am happy to see an alternative to {styler} coming up that stands on a different foundation and can outgrow {styler} one day. We made great improvements to performance over the last 2 years, but obviously, R itself is only so fast (and our implementation probably also not the fastest). I am glad you found ways to build a fast formatter based on rust and offload some complexity to biome.

I don't see any reference to the tidyverse style guide. It seems you are not striving to implement it. May I ask why?

@DavisVaughan
Copy link
Contributor

Hi @lorenzwalthert! @lionel- and I are the ones working on the Rust based formatter. We planned on reaching out to you after the holiday break, but definitely before making any kind of public release announcement about the formatter. We'd still like to have a conversation after break if that is okay with you! We'd be happy to answer any questions and talk about our overarching motivations for this work (which also extend beyond a formatter). I just wanted to go ahead and say something now since we will all be OOO for the next few weeks.


Specifically regarding the tidyverse style guide, we've been mostly following it, with a few deviations here and there which may backpropagate into it. We mostly haven't referenced it anywhere because

  1. we don't have a ton of documentation yet
  2. it would be nice if the end product wasn't branded specifically "for the tidyverse". I've been spending some time thinking about data.table compatibility, for example

@lorenzwalthert
Copy link

@DavisVaughan thanks for the heads-up and your openness. Happy to talk to you after the break, just reach out to me over email (see styler Author field on CRAN) or ping me on GitHub. Looking forward to learning more next year.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: formatting Issues related to formatting and indentation area: kernels Issues related to Jupyter kernels and LSP servers lang: r support
Projects
None yet
Development

No branches or pull requests

7 participants