[REQUEST] Support Google Cloud Vision API for OCR #116

Open
SnowySailor opened this issue Dec 19, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@SnowySailor

The manga_ocr model works decently well, but in my experience it still makes mistakes fairly often, even on high-resolution manga panels. Specifically, it sometimes fails to recognize obscure or dense kanji, it often omits dakuten on kana, and it has trouble distinguishing small from full-size kana in words like しょっちゅう. This could be an artifact of the comic text extractor, but it's unclear to me what exactly is causing it.

I use the Google Cloud Vision API for a personal project of mine, and I can count on one hand the number of times it has gotten something wrong over the last few months of heavy use.

Someone who wants to use the Vision API would just provide their API credentials as an argument, and it would be used instead of manga_ocr. The Vision API should return text blocks with their dimensions and locations, so it should be possible to convert that output into the mokuro JSON format.

@kha-white kha-white added the enhancement New feature or request label Jan 1, 2025