Some Chinese sentences are detected as Japanese #84
Comments
Thanks. I don’t read, write, or speak Japanese or Chinese, so I can’t really help. PRs like GH-77 are welcome!
Hi @wooorm, @the-worldly-monkey. See https://www.unicode.org/faq/han_cjk.html#4 (How can I recognize from the 32 bit value of a Unicode character if this is a Chinese, Korean or Japanese character?).
Based on that FAQ entry, I will add some extra rules to
@kewang A PR for this would be great!
sentence 1
特別推薦的必訪店家「ヤマシロヤ」,雖然不在阿美橫町上,但就位於JR上野站廣小路口對面
sentence 2
特別推薦的必訪店家,雖然不在阿美橫町上,但就位於JR上野站廣小路口對面
Sentence 1 consists almost entirely of Chinese characters but contains 5 katakana characters. Its detected language is jpn, which is incorrect.
Sentence 2 consists entirely of Chinese characters, and its detected language is cmn, which is correct.
This may be related to #77.
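To make the difference between the two sentences concrete, here is a minimal Python sketch that counts characters per script and computes the share of kana among the CJK letters. It is only an illustration of the observation above, not how the library itself detects languages. The helper names `script_of`, `count_scripts`, and `kana_share` are hypothetical, and classifying by Unicode character name prefix is an assumption made for brevity.

```python
import unicodedata
from collections import Counter

SENTENCE_1 = "特別推薦的必訪店家「ヤマシロヤ」,雖然不在阿美橫町上,但就位於JR上野站廣小路口對面"
SENTENCE_2 = "特別推薦的必訪店家,雖然不在阿美橫町上,但就位於JR上野站廣小路口對面"


def script_of(ch: str) -> str:
    """Rough script label derived from the Unicode character name (hypothetical helper)."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "han"
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("HANGUL"):
        return "hangul"
    return "other"  # punctuation, Latin letters such as "JR", and so on


def count_scripts(text: str) -> Counter:
    """Count how many characters of each script label appear in the text."""
    return Counter(script_of(ch) for ch in text)


def kana_share(text: str) -> float:
    """Fraction of kana among Han + kana + hangul characters (0.0 if there are none)."""
    counts = count_scripts(text)
    kana = counts["katakana"] + counts["hiragana"]
    total = kana + counts["han"] + counts["hangul"]
    return kana / total if total else 0.0


if __name__ == "__main__":
    for label, sentence in (("sentence 1", SENTENCE_1), ("sentence 2", SENTENCE_2)):
        print(label, dict(count_scripts(sentence)), round(kana_share(sentence), 3))
```

Sentence 1 has 5 katakana characters and sentence 2 has none, while the number of Han characters is the same in both. A rule that weighs the kana share against the Han count, rather than treating any kana as decisive, is one possible direction, but any threshold would need to be tuned against real data.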