Scripts (like Cyrillic) support #15
Comments
I'm not sure if I understand your question. If you're asking whether languages that use Cyrillic script are supported, both Russian and Ukrainian should already be supported, and we can add support for additional languages that use Cyrillic script as requested. You can set these languages using the language codes
Thank you so much for taking the time to help with my question! I apologize for the confusion in my initial message. I realize now that I may not have explained the issue clearly and might have led you in the wrong direction. I really appreciate your support and all the work you’ve put into this project.
Let me clarify. After some research, I discovered that it’s possible to load custom
    Tesseract.recognize('path/to/image.png', 'Cyrillic', {
      langPath: 'path/to/script/folder/with/Cyrillic.traineddata.gz'
    }).then(({ data: { text } }) => {
      console.log(text);
    });
However, it seems that
Is there any plan to add support for a custom langPath argument in scribe.js? This would allow for greater flexibility when working with specific scripts or custom language files.
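For readers unfamiliar with how langPath and the second argument fit together, here is a minimal sketch. It assumes that the data file is looked up as langPath, then the language or script name, then ".traineddata" with an optional ".gz" suffix. The helper name trainedDataUrl is hypothetical and is not part of Tesseract.js or scribe.js.

```javascript
// Hypothetical helper for illustration only; not an API of Tesseract.js or scribe.js.
// Assumption: the file is resolved as <langPath>/<lang>.traineddata, plus ".gz" when gzipped.
function trainedDataUrl(langPath, lang, gzip = true) {
  const base = langPath.replace(/\/+$/, ''); // drop any trailing slashes
  return `${base}/${lang}.traineddata${gzip ? '.gz' : ''}`;
}

// A script name such as "Cyrillic" is handled exactly like a language code here.
console.log(trainedDataUrl('path/to/script/folder', 'Cyrillic'));
// prints "path/to/script/folder/Cyrillic.traineddata.gz"
```

Under that assumption, langPath would name the folder that contains Cyrillic.traineddata.gz rather than the file itself, and the script name would take the place of the language code.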
Can you explain specifically what you are looking to accomplish with this? Specifically, are you trying to use custom For context, if I simply exposed the However, exposing
The purpose of customizing
Thanks for explaining. I was not familiar with the "script"
Thank you for your project! I have a question: should we expect support for scripts? As far as I understand, we can currently pass any language code to Tesseract, but not everything Tesseract covers is represented as a language. For example, many languages written in Cyrillic can be recognized by Tesseract, but only if you pass a script, not a language, as the argument. Thank you!