Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rascal LSP uses textmate token scopes where it should use SemanticTokenTypes for the syntax highlighting. #366

Closed
DavyLandman opened this issue Mar 19, 2024 · 14 comments · Fixed by #367
Assignees
Labels
bug Something isn't working enhancement New feature or request java Pull requests that update Java code

Comments

@DavyLandman
Copy link
Member

DavyLandman commented Mar 19, 2024

Describe the bug

Rascal LSP is reporting the wrong kind of semantic token types in the legend. This makes it a hit and miss which ones are properly highlighted.

To Reproduce

Annotate a production with @Category{constant.numeric} and it doesn't get the colors of numeric contants of the theme, while string as category does work.

Analysis

I think the documentation on this feature has been improved, it used to be less clear what LSP expected, I've read some old GH issues discussing the closed aspect of these TokenTypes. The set of valid token type is limited to the enum mentioned in the LSP spec, which is a subset of the API in VS Code

Now why did it even work? We were passing sublime scopes like entity.name and variable.function. But VS Code is ignoring after the . (as the syntax is actually <tokentype>.<modifier> in VS Code, no LSP). So if the first name happened to be in the list (or we used something like string) it accidentally worked. But constant.numeric (or constant) doesn't exist as a TokenType, so it gets no highlighting applied. A user cannot just do @Category{number} instead, as rascal-lsp has a hardcoded list of token types, and number is not in there.

Solution

  1. replace the list with the enum form the LSP spec.
  2. consider adding a mapping of the old text-mate styles to the token types, so that existing languages keep working a reasonable manner.
@DavyLandman DavyLandman added the bug Something isn't working label Mar 19, 2024
@DavyLandman DavyLandman changed the title Rascal VS Code uses textmate token scopes where it should use SemanticTokenTypes for the parse tree highlighting. Rascal LSP uses textmate token scopes where it should use SemanticTokenTypes for the parse tree highlighting. Mar 19, 2024
@DavyLandman DavyLandman changed the title Rascal LSP uses textmate token scopes where it should use SemanticTokenTypes for the parse tree highlighting. Rascal LSP uses textmate token scopes where it should use SemanticTokenTypes for the syntax highlighting. Mar 19, 2024
@DavyLandman DavyLandman added enhancement New feature or request java Pull requests that update Java code labels Mar 19, 2024
@jurgenvinju
Copy link
Member

jurgenvinju commented Mar 20, 2024

Ouch! Good analysis. How about this:

  • Just like with the Symbols for the outliners, we model the LSP API directly and completely:
    • We add a data TokenType = enum(TokenModifier modifiers...) | class((TokenModifier modifiers...) | .... ; (note the variable argument lists), but we also add a default token type if it's not there already.
    • Using this other "enum" for the modifiers data TokenModifier = static() | ...
  • We implement @tokenType=class(static()) on syntax rules similar to the way we used to implemented @category
  • We support @category="string" by mapping selected textmate categories (and the old Rascal-eclipse) categories to the above, but label the uses "deprecated".
  • We also implement data Tree(TokenType tokenType = default(), that follows the semantics of the category keyword field.

This way we fix the bug and also keep in line with the rest of the LanguageServer design, and we stay backward compatible for a while.

@DavyLandman
Copy link
Member Author

That is cool, but it does mean you have to import the util::LanguageServer module into your Syntax file, which is a dependency that we normally don't want (since we like to also be able to run our stuff without LSP dependencies).

I was currently working on this:

  • use VS Code's syntax of <token type>{"." <token modifier>}* but keep them strings (and point towards the LSP documentation in the documentation of the parser function)
  • map as much of the old rascal & textmate tokens towards the LSP token types

Pros:

  1. keep in the same style as before, without introducing extra dependencies
  2. Add support for modifiers
  3. Be a relative small patch to the existing code.

Cons:

  1. the list of strings is external, and even if we would make a copy, it would still introduce less than desired dependency edges in or grammar modules
  2. it's a string. So users can still mess up the token & modifiers, and we have no type-checking time support for these errors.

@jurgenvinju
Copy link
Member

Watch out... next to the standard types and modifiers, there is more:

Along with the standard types and modifiers, VS Code defines a mapping of types and modifiers to similar TextMate scopes. That's covered in the section Semantic Token Scope Map.

@DavyLandman
Copy link
Member Author

Watch out... next to the standard types and modifiers, there is more:

Along with the standard types and modifiers, VS Code defines a mapping of types and modifiers to similar TextMate scopes. That's covered in the section Semantic Token Scope Map.

Yes, but that is not LSP.

That's VS Code that is helping out map semantic token types to textmate scopes for the themes that do not have extensive semantic token support.

@jurgenvinju
Copy link
Member

Also Custom TextMate scope mappings can be added.

We could use these to support the old Rascal-eclipse tokenType names, such as "ambiguity", etc. They can be added to the json configuration file and do not have to be directly associated with any language.

@DavyLandman
Copy link
Member Author

DavyLandman commented Mar 20, 2024

Also Custom TextMate scope mappings can be added.

We could use these to support the old Rascal-eclipse tokenType names, such as "ambiguity", etc. They can be added to the json configuration file and do not have to be directly associated with any language.

Indeed, but again, this is a VS Code specific extension. LSP does not want us to do this. VS Code API of semantic tokens is not the same as LSP api of semantic tokens. By design they've limited the categories.

But I was also thinking, that yes, for the ambiguity nodes, we could try and sneak one in, that just won't work for other LSP clients.

DavyLandman added a commit that referenced this issue Mar 20, 2024
The original implementation was half working, due to an incorrect interpretation of the standard.

This commit fixes that, but also adds modifiers

Fixes #366
DavyLandman added a commit that referenced this issue Mar 20, 2024
The original implementation was half working, due to an incorrect interpretation of the standard.

This commit fixes that, but also adds modifiers

Fixes #366
@jurgenvinju
Copy link
Member

I see your points, but I want to work towards a more ideal situation.

Syntax highlighting is a core feature of the Rascal game, in Eclipse, in VScode but also elsewhere in other LSP instances or even on the commandline. I don't like the strings since they lead to the kind of bugs that we are now fixing here. Things happily seem to work, but then they don't. Our users have to find this out.

So I guess I am proposing an extension to ParseTree.rsc where we add highlighting to the core of the parse tree representation:

  • We stick to names that fit the Rascal world, like SyntaxCategory and SyntaxCategoryModifier but use the LSP set as the standard set of constructors.
  • We write the implementation in Java in such a way that user extensions of Category will work automatically, if they also added the same name to their own JSON files. "Ambiguity" will serve as an example extension.
  • Everybody who writes a syntax rule, automatically has ParseTree imported by the type-checker, so that problem will be solved.
  • We implement @category=str for backward compatibility and @category=SyntaxCategory for the new style.
  • We change/fix other uses of @category and .category (such as in HTML generators and ANSI pretty printers)
  • We roll this out in rascal, rascal-lsp and rascal-eclipse (later)

@jurgenvinju
Copy link
Member

jurgenvinju commented Mar 20, 2024

Note that tags are not checked by the type checker (yet). Right @PaulKlint ?

@PaulKlint
Copy link
Member

If you mean typechecker (instead of prettyprinter): I actually fixed that yesterday!

@jurgenvinju
Copy link
Member

type checker! duh.. where did that come from? Oh that's cool. That will help us find a few bugs!

@jurgenvinju
Copy link
Member

@DavyLandman but that means that @category needs a new name because we don't have union types in Rascal. So something like @scope for the new ones? tag SyntaxScope = syntax@scope and data Tree(SyntaxScope scope=default())

@DavyLandman
Copy link
Member Author

Ok, so I like your idea @jurgenvinju. But since we're running into this issue at our customers and rascal is in the middle of a lot of shakeup, how about we split the fix in 2 phase.

  1. Add support for the LSP native ones to the list of supported token types, so at least a user can say @Category=number and get the proper syntax highlighting (this is a much smaller change than what I proposed in Fix token types to use LSP token types #367 )
  2. In parallel properly implement your extension to Parsetree.

So apart from this scheduling proposal, I have a small comment on the design proposed: that the LSP categories are quite limited, they designed it to force people into a small set of options that can reliably be mapped to a theme, but things we would consider normal is missing. Like for example there are only classes, interface & structures. And a lot of things will have to map to keyword.

Also, I'm not sure if scope is the best name? It alludes to both scope in language semantics, which it's not, and to text-mate scopes, which they also are not (i.e no nesting etc).

@DavyLandman
Copy link
Member Author

DavyLandman commented Mar 20, 2024

Or alternative point 1: do fix the name mapping part, but drop the modifier support. So in essence: fix the bug of how category was interpreted, without adding new functionality.

I think that would be my preference. I've updated #367 with these changes.

@DavyLandman
Copy link
Member Author

DavyLandman commented Mar 25, 2024

After some offline discussion with Jurgen, we came to the conclusion that #367 fixes the current bug, while usethesource/rascal#1928 will give the proper support for this. Therefore we'll keep #367 "small"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request java Pull requests that update Java code
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants