Support mlb option "allowExtendedTextConsts true" #49
Comments
Supporting this won't be too bad, but it will require changes in a few places. For the lexer, we'll need to skip over UTF-8 characters in the function We'll need to update the implementation of
By the way, what is the accepted standard practice these days for visually handling "characters" that are made up of more than one Unicode code point? E.g., the flag emoji "🇺🇸" is actually two code points (the regional indicator symbols "🇺" followed by "🇸"), each of which takes four bytes in UTF-8. But of course, it is intended to be visually represented as a single character. My initial thought is that this is important for
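To make the flag example concrete, here is a small Python sketch (not smlfmt code; the helper name `describe` is made up for illustration) that counts the same text in code points and in UTF-8 bytes. Counting what a user sees as one character needs grapheme cluster segmentation (Unicode UAX #29), which Python's standard library does not provide, so the sketch leaves that out.

```python
import unicodedata

# The US flag emoji, written as its two regional indicator code points.
flag = "\U0001F1FA\U0001F1F8"


def describe(text: str) -> dict:
    """Report how long `text` is under two different notions of 'position'."""
    return {
        # Python strings are sequences of Unicode code points.
        "code_points": len(text),
        # Number of bytes a byte-oriented lexer would step over.
        "utf8_bytes": len(text.encode("utf-8")),
        # Official Unicode names of the individual code points.
        "names": [unicodedata.name(ch) for ch in text],
    }


info = describe(flag)
print(info["code_points"], info["utf8_bytes"])
```

A column counter based on bytes, code points, or visual width would therefore report three different values for a line containing this one flag.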
Thanks for the info! I am not very familiar with UTF-8/Unicode, but I would suggest that we at least fix the lexer so that it does not produce an error when it encounters a UTF-8 character. I don't know the difference between semantic position and visual position well, so I vote for whatever is easier to implement, which is probably the UTF-8 semantic position.
It occurred to me that a simpler way to support this is to allow UTF-8 bytes without checking the validity of the UTF-8 byte sequence. #74 implements this. It is disabled by default. It can be enabled with Your example above should now work. Let me know if you have any trouble!
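As a rough illustration of the "allow the bytes, skip the validity check" idea, here is a minimal Python sketch. It is not the code from #74: the names `is_allowed_string_byte`, `first_rejected_byte`, and `allow_extended` are hypothetical, and escape sequences and closing quotes are ignored to keep it short.

```python
from typing import Optional


def is_allowed_string_byte(b: int, allow_extended: bool) -> bool:
    """Decide whether one byte may appear inside a string literal.

    Printable ASCII is always accepted. Bytes >= 128 (the lead and
    continuation bytes of multi-byte UTF-8 sequences) are accepted only
    when the hypothetical `allow_extended` switch is on.
    """
    if 32 <= b <= 126:
        return True
    return allow_extended and b >= 128


def first_rejected_byte(body: bytes, allow_extended: bool) -> Optional[int]:
    """Return the index of the first byte the lexer would reject, or None."""
    for i, b in enumerate(body):
        if not is_allowed_string_byte(b, allow_extended):
            return i
    return None


body = "héllo".encode("utf-8")  # 'é' becomes the two bytes 0xC3 0xA9
print(first_rejected_byte(body, allow_extended=False))
print(first_rejected_byte(body, allow_extended=True))
```

Because each byte is checked on its own, a malformed sequence such as a lone `0xFF` is also let through when the switch is on. That is the trade-off of not validating: the lexer stays simple, but it accepts input that is not well-formed UTF-8.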
Currently, smlfmt reports an error on non-ASCII input.
Example file:
Error message:
Expected behavior
Strings need to handle non-ASCII UTF-8 characters.