The manga_ocr model works decently well, but in my experience it still makes mistakes fairly often, even with high-resolution manga panels. Specifically, it sometimes fails to recognize obscure or dense kanji, often omits dakuten on kana, and has trouble distinguishing small from large characters in words like しょっちゅう. This could be an artifact of the comic text extractor, but it's unclear to me what's actually causing it.
I use the google cloud vision API for a personal project of mine and I can count on one hand the number of times it has messed something up in the last few months of me using it heavily.
Someone wanting to use the Vision API would just provide their API credentials as an argument, and it would be used in place of manga_ocr. The Vision API returns text blocks with their dimensions and locations, so it should be possible to map that output into the mokuro JSON format.
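As a rough illustration of what that mapping could look like, here is a minimal sketch that converts a Vision-style page (as a plain dict mirroring the REST response: `blocks[].boundingBox.vertices` and `paragraphs[].words[].symbols[].text`) into a mokuro-like page dict. The output field names (`box`, `lines`, `img_width`, etc.) are assumptions about mokuro's schema and would need checking against the real format:

```python
def vertices_to_box(vertices):
    """Collapse a Vision bounding polygon into [xmin, ymin, xmax, ymax]."""
    xs = [v["x"] for v in vertices]
    ys = [v["y"] for v in vertices]
    return [min(xs), min(ys), max(xs), max(ys)]


def vision_page_to_mokuro(page, img_path):
    """Map one Vision API page dict to a mokuro-style page dict (assumed schema)."""
    blocks = []
    for block in page.get("blocks", []):
        lines = []
        for para in block.get("paragraphs", []):
            # Rejoin the per-character symbols into one line of text.
            text = "".join(
                sym["text"]
                for word in para.get("words", [])
                for sym in word.get("symbols", [])
            )
            lines.append(text)
        blocks.append({
            "box": vertices_to_box(block["boundingBox"]["vertices"]),
            "lines": lines,
        })
    return {
        "img_path": img_path,
        "img_width": page.get("width"),
        "img_height": page.get("height"),
        "blocks": blocks,
    }
```

In a real integration this would run on `response.full_text_annotation` from the `google-cloud-vision` client rather than raw dicts, but the shape of the conversion would be the same.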