How to pass --psm to detect text as a single column of text? #58

Open
dheimoz opened this issue Jul 15, 2022 · 10 comments
Labels
enhancement New feature or request

Comments

@dheimoz

dheimoz commented Jul 15, 2022

Hey @robertknight ,

Great work you have been doing here. It performs excellently in Vue 3 with Vite.
I would like to pass the parameter --psm 4 to the Tesseract engine, so that it treats the text as a single column.
Sometimes the engine interprets the text as 2 or 3 columns, and the recognized text does not make sense.
More info:
https://stackoverflow.com/questions/44619077/pytesseract-ocr-multiple-config-options

I looked through the source code, but I could not find how to pass that option.

Thanks.

@robertknight
Owner

Hello - There isn't currently an option to configure the page segmentation mode (psm). It would make sense to expose this configuration though. The API could look something like:

ocrClient.loadImage(image, {
  segmentationMode: mode,
});

Do you have any examples of images where the text columns are incorrectly recognized?

robertknight added the enhancement label Jul 15, 2022

@wydengyre
Contributor

wydengyre commented Jul 20, 2022

@dheimoz this should currently work if you are using the engine API:

engine.setVariable("tessedit_pageseg_mode", "4");
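
For anyone unfamiliar with the numbering: the value "4" is Tesseract's single-column mode, the same number that --psm 4 takes on the command line. The sketch below wraps the call above in a small helper so that modes can be referred to by name. PAGE_SEG_MODES and setPageSegMode are hypothetical names made up for this illustration, not part of the library, and stubEngine is a stand-in for a real engine instance. The only thing assumed about the engine is the setVariable(name, value) method used above.

```javascript
// Numeric values of Tesseract's page segmentation modes
// (the same numbers accepted by the `--psm` command-line flag).
const PAGE_SEG_MODES = Object.freeze({
  OSD_ONLY: 0,
  AUTO_OSD: 1,
  AUTO_ONLY: 2,
  AUTO: 3,
  SINGLE_COLUMN: 4,
  SINGLE_BLOCK_VERT_TEXT: 5,
  SINGLE_BLOCK: 6,
  SINGLE_LINE: 7,
  SINGLE_WORD: 8,
  CIRCLE_WORD: 9,
  SINGLE_CHAR: 10,
  SPARSE_TEXT: 11,
  SPARSE_TEXT_OSD: 12,
  RAW_LINE: 13,
});

// Hypothetical helper: looks up a mode by name and forwards it to the
// engine as a string, which is how setVariable is called above.
function setPageSegMode(engine, modeName) {
  if (!Object.prototype.hasOwnProperty.call(PAGE_SEG_MODES, modeName)) {
    throw new RangeError(`Unknown page segmentation mode: ${modeName}`);
  }
  engine.setVariable("tessedit_pageseg_mode", String(PAGE_SEG_MODES[modeName]));
}

// Stand-in for a real engine so the sketch runs on its own.
// It only records the calls it receives.
const calls = [];
const stubEngine = {
  setVariable(name, value) {
    calls.push([name, value]);
  },
};

setPageSegMode(stubEngine, "SINGLE_COLUMN");
```

With a real engine, the helper would presumably be called before recognition is run, so that the variable is in place when the page layout is analyzed.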

@dheimoz
Author

dheimoz commented Jul 20, 2022

Thanks, I will give it a try.

@fmonpelat

fmonpelat commented Dec 17, 2022

Hi, I'm not using the engine API because I want the option to use wasm and wasm-fallback from the web worker. If I change ocrClient so that it sends options to the engine and open a PR, would you be interested in this change?

@robertknight
Owner

robertknight commented Dec 17, 2022

Yes, I'd be willing to accept that.

@fmonpelat

fmonpelat commented Dec 18, 2022

@robertknight I'm seeing that embind doesn't support overloaded functions. I changed lib.cpp to use this function:

  OCRResult LoadImage(const Image& image, const tesseract::PageSegMode pageSegMode);

and this one to support LoadImage with only one argument:

  OCRResult LoadImage(const Image& image);

This page says it doesn't support overloaded functions: https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#overloaded-functions
I will need to invoke LoadImage from JS under another name, am I correct? What would be the best option?

@robertknight
Owner

I can see a few options:

  1. Make the segmentation mode (or an options struct containing the mode) a required argument, and modify the JS code to always provide it. This seems easiest.
  2. Add a separate method that is called after LoadImage to set the segmentation mode, and call this from JS after calling loadImage.

The API that lib.cpp exposes to JS does not expose Tesseract internal types/enums directly, but rather abstracts them into something that is more convenient to use from the JS side and allows Tesseract version changes to be handled entirely in lib.cpp. See for example the TextUnit enum and various small structs that are exported to JS.
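
As a rough sketch of option 1 seen from the JS side: the wrapper always passes a segmentation mode to a single, non-overloaded binding, and falls back to a default when the caller omits it. Everything named here (SegmentationMode, DEFAULT_SEGMENTATION_MODE, the mode strings, the shape of the second argument, and stubBinding) is a placeholder for illustration, not the library's actual API. lib.cpp would remain responsible for translating these values into Tesseract's own enum.

```javascript
// Hypothetical JS-facing mode names. They deliberately do not mirror
// Tesseract's enum; the C++ binding layer would map them internally.
const SegmentationMode = Object.freeze({
  Auto: "auto",
  SingleColumn: "single-column",
  SingleBlock: "single-block",
  SingleLine: "single-line",
});

// Placeholder default chosen for this sketch only.
const DEFAULT_SEGMENTATION_MODE = SegmentationMode.Auto;

// Wrapper that always supplies the mode, so the underlying binding
// needs only one signature and no overloads.
function loadImage(binding, image, options = {}) {
  const segmentationMode =
    options.segmentationMode ?? DEFAULT_SEGMENTATION_MODE;
  if (!Object.values(SegmentationMode).includes(segmentationMode)) {
    throw new RangeError(`Unknown segmentation mode: ${segmentationMode}`);
  }
  return binding.loadImage(image, { segmentationMode });
}

// Stand-in for the compiled binding; it records the mode it was given.
const received = [];
const stubBinding = {
  loadImage(image, options) {
    received.push(options.segmentationMode);
  },
};

// Callers may omit the option or set it explicitly.
loadImage(stubBinding, "fake-image");
loadImage(stubBinding, "fake-image", {
  segmentationMode: SegmentationMode.SingleColumn,
});
```

Keeping the default in JS means existing callers that pass only an image keep working, while the binding itself never sees a missing argument.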

@fmonpelat

@robertknight I see that you are using the function iterator_level_from_unit to convert from TextUnit to the Tesseract type PageIteratorLevel. Have you tried casting instead? There are a lot of enum options for PSM...
Sorry about the delay; I'm making progress whenever I have time.

@fmonpelat

@robertknight, here's the PR: #67

@fmonpelat

@robertknight I see that you are using the function iterator_level_from_unit to convert from TextUnit to the Tesseract type PageIteratorLevel. Have you tried casting instead? There are a lot of enum options for PSM... Sorry about the delay; I'm making progress whenever I have time.

Never mind, I saw that you used this function to convert between types.


4 participants