Rascal LSP uses textmate token scopes where it should use SemanticTokenTypes for the syntax highlighting. #366

DavyLandman · 2024-03-19T13:39:19Z

Describe the bug

Rascal LSP is reporting the wrong kind of semantic token types in the legend. This makes it hit-and-miss which categories are highlighted properly.

To Reproduce

Annotate a production with @Category{constant.numeric}: it doesn't get the theme's colors for numeric constants, while string as a category does work.

Analysis

I think the documentation on this feature has been improved; it used to be less clear what LSP expected. I've read some old GH issues discussing the closed nature of these TokenTypes. The set of valid token types is limited to the enum mentioned in the LSP spec, which is a subset of the API in VS Code.

Now why did it even work? We were passing Sublime scopes like entity.name and variable.function. But VS Code ignores everything after the first . (the syntax is actually <tokentype>.<modifier> in VS Code, not in LSP). So if the first name happened to be in the list (or we used something like string), it accidentally worked. But constant.numeric (or constant) doesn't exist as a TokenType, so it gets no highlighting applied. A user cannot just write @Category{number} instead, as rascal-lsp has a hardcoded list of token types, and number is not in there.
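The accidental behaviour described above can be sketched as follows. This is a simplified model of the client side, not VS Code's or rascal-lsp's actual code: the class and method names are hypothetical, and the set holds the standard names from the LSP 3.17 `SemanticTokenTypes` enum rather than the hardcoded list rascal-lsp used at the time.

```java
import java.util.Optional;
import java.util.Set;

final class LegendLookup {
    // Standard token types from the LSP 3.17 SemanticTokenTypes enum.
    static final Set<String> LSP_TOKEN_TYPES = Set.of(
        "namespace", "type", "class", "enum", "interface", "struct",
        "typeParameter", "parameter", "variable", "property", "enumMember",
        "event", "function", "method", "macro", "keyword", "modifier",
        "comment", "string", "number", "regexp", "operator", "decorator");

    // Assumption: the client treats only the part before the first '.'
    // as the token type and reads the rest as modifiers.
    static Optional<String> effectiveTokenType(String category) {
        int dot = category.indexOf('.');
        String head = dot < 0 ? category : category.substring(0, dot);
        return LSP_TOKEN_TYPES.contains(head) ? Optional.of(head) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] categories = {"string", "variable.function", "constant.numeric"};
        for (String c : categories) {
            System.out.println(c + " -> " + effectiveTokenType(c).orElse("(no highlighting)"));
        }
    }
}
```

Under this model, variable.function is highlighted as a plain variable because only its first segment is recognised, while constant.numeric matches nothing at all.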

Solution

Replace the list with the enum from the LSP spec.
Consider adding a mapping of the old TextMate styles to the token types, so that existing languages keep working in a reasonable manner.
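A minimal sketch of what such a compatibility mapping could look like. The entries in `LEGACY_TO_LSP` are illustrative guesses, not the table that ended up in rascal-lsp, and `LegacyCategoryMapper` is a hypothetical name. The lookup tries the full category first and then drops trailing segments, so more specific legacy scopes win over general ones.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class LegacyCategoryMapper {
    // Standard token types from the LSP 3.17 SemanticTokenTypes enum.
    static final Set<String> LSP_TOKEN_TYPES = Set.of(
        "namespace", "type", "class", "enum", "interface", "struct",
        "typeParameter", "parameter", "variable", "property", "enumMember",
        "event", "function", "method", "macro", "keyword", "modifier",
        "comment", "string", "number", "regexp", "operator", "decorator");

    // Hypothetical legacy-to-LSP table; entries are examples only.
    static final Map<String, String> LEGACY_TO_LSP = Map.of(
        "constant.numeric", "number",
        "variable.function", "function",
        "keyword.control", "keyword",
        "comment.line", "comment",
        "string.quoted", "string");

    // Longest-prefix match: try the whole category, then strip the last
    // '.'-separated segment and retry until nothing is left.
    static Optional<String> toTokenType(String category) {
        String candidate = category;
        while (true) {
            if (LSP_TOKEN_TYPES.contains(candidate)) {
                return Optional.of(candidate);
            }
            String mapped = LEGACY_TO_LSP.get(candidate);
            if (mapped != null) {
                return Optional.of(mapped);
            }
            int dot = candidate.lastIndexOf('.');
            if (dot < 0) {
                return Optional.empty();
            }
            candidate = candidate.substring(0, dot);
        }
    }
}
```

With this table, `toTokenType("constant.numeric")` yields number, while an unmapped name such as ambiguity yields an empty result that the caller would have to handle separately.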


jurgenvinju · 2024-03-20T09:02:56Z

Ouch! Good analysis. How about this:

Just like with the Symbols for the outliners, we model the LSP API directly and completely:
- We add a data TokenType = enum(TokenModifier modifiers...) | class(TokenModifier modifiers...) | ... ; (note the variable argument lists), but we also add a default token type if it's not there already.
- Using this other "enum" for the modifiers: data TokenModifier = static() | ...
We implement @tokenType=class(static()) on syntax rules, similar to the way we used to implement @category.
We support @category="string" by mapping selected TextMate categories (and the old Rascal-eclipse categories) to the above, but label the uses "deprecated".
We also implement data Tree(TokenType tokenType = default()), which follows the semantics of the category keyword field.

This way we fix the bug and also keep in line with the rest of the LanguageServer design, and we stay backward compatible for a while.

DavyLandman · 2024-03-20T09:09:57Z

That is cool, but it does mean you have to import the util::LanguageServer module into your Syntax file, which is a dependency that we normally don't want (since we like to also be able to run our stuff without LSP dependencies).

I was currently working on this:

use VS Code's syntax of <token type>{"." <token modifier>}* but keep them strings (and point towards the LSP documentation in the documentation of the parser function)
map as much of the old rascal & textmate tokens towards the LSP token types

Pros:

keep in the same style as before, without introducing extra dependencies
Add support for modifiers
Be a relatively small patch to the existing code.

Cons:

the list of strings is external, and even if we made a copy, it would still introduce less-than-desired dependency edges in our grammar modules
it's a string, so users can still mess up the token types and modifiers, and we get no type-checker support for catching these errors.
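The string syntax proposed in this comment, a token type followed by zero or more dot-separated modifiers, could be parsed along these lines. This is a sketch under assumptions: `CategoryParser` and `Parsed` are hypothetical names, and the modifier set is the standard LSP 3.17 `SemanticTokenModifiers` enum. Unknown modifiers are collected rather than rejected, which makes the "users can still mess up" drawback visible at run time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class CategoryParser {
    // Standard modifiers from the LSP 3.17 SemanticTokenModifiers enum.
    static final Set<String> LSP_TOKEN_MODIFIERS = Set.of(
        "declaration", "definition", "readonly", "static", "deprecated",
        "abstract", "async", "modification", "documentation", "defaultLibrary");

    // Result of splitting "<token type>{.<token modifier>}*".
    static final class Parsed {
        final String tokenType;
        final List<String> modifiers;        // recognised LSP modifiers
        final List<String> unknownModifiers; // everything else, for diagnostics

        Parsed(String tokenType, List<String> modifiers, List<String> unknownModifiers) {
            this.tokenType = tokenType;
            this.modifiers = modifiers;
            this.unknownModifiers = unknownModifiers;
        }
    }

    static Parsed parse(String category) {
        String[] parts = category.split("\\.");
        List<String> known = new ArrayList<>();
        List<String> unknown = new ArrayList<>();
        // parts[0] is the token type; the remaining segments are modifiers.
        for (int i = 1; i < parts.length; i++) {
            if (LSP_TOKEN_MODIFIERS.contains(parts[i])) {
                known.add(parts[i]);
            } else {
                unknown.add(parts[i]);
            }
        }
        return new Parsed(parts[0], known, unknown);
    }
}
```

Parsing class.static.readonly gives the type class with two recognised modifiers, whereas constant.numeric gives the type constant with numeric reported as an unknown modifier, which is the misreading that caused this issue.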

jurgenvinju · 2024-03-20T09:10:08Z

Watch out... next to the standard types and modifiers, there is more:

> Along with the standard types and modifiers, VS Code defines a mapping of types and modifiers to similar TextMate scopes. That's covered in the section Semantic Token Scope Map.

DavyLandman · 2024-03-20T09:11:23Z

> Watch out... next to the standard types and modifiers, there is more:
>
> Along with the standard types and modifiers, VS Code defines a mapping of types and modifiers to similar TextMate scopes. That's covered in the section Semantic Token Scope Map.

Yes, but that is not LSP.

That's VS Code that is helping out map semantic token types to textmate scopes for the themes that do not have extensive semantic token support.

jurgenvinju · 2024-03-20T09:13:01Z

Also Custom TextMate scope mappings can be added.

We could use these to support the old Rascal-eclipse tokenType names, such as "ambiguity", etc. They can be added to the json configuration file and do not have to be directly associated with any language.

DavyLandman · 2024-03-20T09:14:21Z

> Also Custom TextMate scope mappings can be added.
>
> We could use these to support the old Rascal-eclipse tokenType names, such as "ambiguity", etc. They can be added to the json configuration file and do not have to be directly associated with any language.

Indeed, but again, this is a VS Code specific extension. LSP does not want us to do this. The VS Code semantic tokens API is not the same as the LSP semantic tokens API; LSP has limited the categories by design.

But I was also thinking that, yes, for the ambiguity nodes we could try to sneak one in, one that just won't work for other LSP clients.


jurgenvinju · 2024-03-20T11:35:26Z

I see your points, but I want to work towards a more ideal situation.

Syntax highlighting is a core feature of the Rascal game: in Eclipse, in VS Code, but also elsewhere in other LSP instances or even on the command line. I don't like the strings, since they lead to the kind of bugs that we are now fixing here. Things happily seem to work, but then they don't, and our users have to find this out.

So I guess I am proposing an extension to ParseTree.rsc where we add highlighting to the core of the parse tree representation:

We stick to names that fit the Rascal world, like SyntaxCategory and SyntaxCategoryModifier but use the LSP set as the standard set of constructors.
We write the implementation in Java in such a way that user extensions of Category will work automatically, if they also added the same name to their own JSON files. "Ambiguity" will serve as an example extension.
Everybody who writes a syntax rule, automatically has ParseTree imported by the type-checker, so that problem will be solved.
We implement @category=str for backward compatibility and @category=SyntaxCategory for the new style.
We change/fix other uses of @category and .category (such as in HTML generators and ANSI pretty printers)
We roll this out in rascal, rascal-lsp and rascal-eclipse (later)

jurgenvinju · 2024-03-20T11:35:58Z

Note that tags are not checked by the type checker (yet). Right @PaulKlint ?

PaulKlint · 2024-03-20T11:39:18Z

If you mean typechecker (instead of prettyprinter): I actually fixed that yesterday!

jurgenvinju · 2024-03-20T11:40:55Z

type checker! duh.. where did that come from? Oh that's cool. That will help us find a few bugs!

jurgenvinju · 2024-03-20T11:43:20Z

@DavyLandman but that means that @category needs a new name because we don't have union types in Rascal. So something like @scope for the new ones? tag SyntaxScope = syntax@scope and data Tree(SyntaxScope scope=default())

DavyLandman · 2024-03-20T12:11:30Z

Ok, so I like your idea @jurgenvinju. But since we're running into this issue at our customers and Rascal is in the middle of a lot of shakeup, how about we split the fix into two phases:

Add support for the LSP native ones to the list of supported token types, so at least a user can say @Category=number and get the proper syntax highlighting (this is a much smaller change than what I proposed in Fix token types to use LSP token types #367).
In parallel, properly implement your extension to ParseTree.

Apart from this scheduling proposal, I have a small comment on the proposed design: the LSP categories are quite limited. They were designed to force people into a small set of options that can reliably be mapped to a theme, but things we would consider normal are missing. For example, there are only classes, interfaces and structs, and a lot of things will have to map to keyword.

Also, I'm not sure if scope is the best name. It alludes both to scope in language semantics, which it's not, and to TextMate scopes, which these also are not (i.e. no nesting etc.).

DavyLandman · 2024-03-20T12:16:14Z

Or alternative point 1: do fix the name mapping part, but drop the modifier support. So in essence: fix the bug of how category was interpreted, without adding new functionality.

I think that would be my preference. I've updated #367 with these changes.

DavyLandman · 2024-03-25T19:00:37Z

After some offline discussion with Jurgen, we came to the conclusion that #367 fixes the current bug, while usethesource/rascal#1928 will give the proper support for this. Therefore we'll keep #367 "small"

DavyLandman added the bug Something isn't working label Mar 19, 2024

DavyLandman changed the title ~~Rascal VS Code uses textmate token scopes where it should use SemanticTokenTypes for the parse tree highlighting.~~ Rascal LSP uses textmate token scopes where it should use SemanticTokenTypes for the parse tree highlighting. Mar 19, 2024

DavyLandman changed the title ~~Rascal LSP uses textmate token scopes where it should use SemanticTokenTypes for the parse tree highlighting.~~ Rascal LSP uses textmate token scopes where it should use SemanticTokenTypes for the syntax highlighting. Mar 19, 2024

DavyLandman assigned DavyLandman and jurgenvinju Mar 19, 2024

DavyLandman added enhancement New feature or request java Pull requests that update Java code labels Mar 19, 2024

DavyLandman added a commit that referenced this issue Mar 20, 2024

Support LSP token types and modifiers

a31f5ae

The original implementation was half working, due to an incorrect interpretation of the standard. This commit fixes that, but also adds modifiers. Fixes #366

DavyLandman added a commit that referenced this issue Mar 20, 2024

Support LSP token types and modifiers

4b2a094

The original implementation was half working, due to an incorrect interpretation of the standard. This commit fixes that, but also adds modifiers. Fixes #366

DavyLandman mentioned this issue Mar 20, 2024

Fix token types to use LSP token types #367

Merged

2 tasks

DavyLandman mentioned this issue Mar 22, 2024

Introduce a fixed set of syntax highlighting categories to replace the old string based categories in parse trees usethesource/rascal#1928

Open

DavyLandman closed this as completed in #367 Mar 26, 2024
