How to process a txtUnstructured image/document with paragraphAsOneLine? #94

felipedaraujo · 2020-10-20T13:47:57Z

I successfully ran the JavaScript example in this repo and am now trying to use the parameter txtUnstructured:paragraphAsOneLine, but so far I haven't had any luck.

After line 87, I tried each of the options below, and none of them worked for me. Could you guide me on the correct way to use this parameter?

settings.language = "English"; // Can be comma-separated list, e.g. "German,French".
settings.exportFormat = "txtUnstructured";

// Alternative 1 - Didn't work
// settings["txtUnstructured:paragraphAsOneLine"] = "true";

// Alternative 2 - Didn't work
// settings["txtUnstructured:paragraphAsOneLine"] = true;

// Alternative 3 - Didn't work
// settings.txtUnstructured = { paragraphAsOneLine: true };

// Alternative 4 - Didn't work
// settings.txtUnstructured = { paragraphAsOneLine: "true" };

// Alternative 5 - Didn't work
// settings.paragraphAsOneLine = "true";

// Alternative 6 - Didn't work
// settings.paragraphAsOneLine = true;

https://cloud-westus.ocrsdk.com is the service target I am using.
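
To make the intent concrete, here is a minimal sketch of the request I would expect these settings to produce, assuming the options are sent as URL query parameters on processImage. `buildProcessImageUrl` is a hypothetical helper written for illustration, not part of the sample library, and the `/processImage` path is an assumption as well.

```javascript
// Hypothetical helper (not from the sample): builds the request URL by hand
// so that every option, including the colon-named one, lands in the query string.
function buildProcessImageUrl(serverUrl, options) {
  // URLSearchParams percent-encodes keys and values, so the ":" in the
  // parameter name is sent as "%3A" and decoded again on the receiving side.
  const params = new URLSearchParams(options);
  return `${serverUrl}/processImage?${params.toString()}`;
}

const url = buildProcessImageUrl("https://cloud-westus.ocrsdk.com", {
  language: "English",
  exportFormat: "txtUnstructured",
  "txtUnstructured:paragraphAsOneLine": "true",
});

console.log(url);
```

If the settings object in the sample only serializes a fixed set of properties, none of the six alternatives above would ever reach the query string, which would explain why they all behave the same.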

My ultimate goal is to parse a PDF to txt the same way finereaderonline.com does, converting multiple columns to a single column and ignoring footers/page numbers.

Thanks in advance.
