Replies: 1 comment
Feel free to close this discussion as you can use |
Consider the repository: https://github.com/preetpalS/UTF-16-Code-Text
It has two identical C# source files, one encoded in UTF-8 and the other in UTF-16LE (Little-Endian).
The UTF-16LE file is misclassified as Smalltalk. This can be worked around with a .gitattributes override.
The other issue is that code in UTF-16 encoded files is weighted differently, presumably because the encoding needs roughly twice as many bytes to represent the same text. The UTF-16LE file is weighted at 66.8% and the UTF-8 file at 33.2%, even though the two files contain identical code.
If UTF-16 files were handled correctly, the weighting would reflect the actual amount of code in these files, and they would not be misclassified.
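To illustrate why the split comes out near two-thirds to one-third, here is a minimal Python sketch. It assumes the weighting is proportional to file size in bytes, which is my guess from the numbers above. The `byte_shares` helper and the sample snippet are made up for illustration and are not taken from the repository.

```python
def byte_shares(text: str) -> dict[str, float]:
    """Return each encoding's share (in percent) of the combined byte count."""
    # Encode the same text both ways; the "utf-16-le" codec does not add a BOM.
    sizes = {
        "utf-8": len(text.encode("utf-8")),
        "utf-16-le": len(text.encode("utf-16-le")),
    }
    total = sum(sizes.values())
    return {encoding: 100 * size / total for encoding, size in sizes.items()}


# Hypothetical ASCII-only C# snippet standing in for the real source file.
sample = "class Program { static void Main() { } }\n"
shares = byte_shares(sample)

for encoding, share in shares.items():
    print(f"{encoding}: {share:.1f}%")
```

For ASCII-only text the UTF-16LE copy is exactly twice the size of the UTF-8 copy, so a purely size-based weighting gives it two-thirds of the total. The small difference from the 66.8% / 33.2% reported above could come from a byte order mark or line endings, but I have not checked that.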