Engine creation error #61

Open
FredrikAnderssonRV opened this issue Sep 10, 2024 · 1 comment
@FredrikAnderssonRV

Hi!

I am developing a web application (.NET Framework 4.8) and I need to extract text from a PDF page that consists of an image.

I have a strange problem with the program.
The code works once or twice, and then I get an error at Engine creation: "The path has not a valid format.". After this I have to completely uninstall TesseractOCR from my project and reinstall it to get it working once or twice again, and I may have to repeat the uninstall and reinstall several times. I have traced the error: it is thrown at line 133 in Engine.cs, "DefaultPageSegMode = PageSegMode.Auto;". I cannot see how that line can throw a path-not-valid error, though.

The code that raises the error:
engine = new TesseractOCR.Engine(Server.MapPath(@"~/tessdata"), TesseractOCR.Enums.Language.Swedish);

var img1 = TesseractOCR.Pix.Image.LoadFromFile(@"C:\TEMP\eurotext.png");
TesseractOCR.Page xpage = engine.Process(img1);

I have the language files in the tessdata folder in my project. I have also tried passing just Server.MapPath("~") as the path, but then I cannot get it to work at all.
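Editor's note: before looking elsewhere, it can help to rule out the data path itself. The following is a minimal sketch of such a check, not part of TesseractOCR. The class name TessdataCheck and the method Describe are hypothetical, and the file name "swe.traineddata" used in the usage note is an assumption about how the Swedish language file is named.

```csharp
using System;
using System.IO;

public static class TessdataCheck
{
    // Hypothetical helper, not part of TesseractOCR.
    // Returns a description of the problem, or null when the folder looks usable.
    public static string Describe(string tessdataPath, string trainedDataFile)
    {
        if (string.IsNullOrWhiteSpace(tessdataPath))
            return "The tessdata path is empty.";

        if (!Directory.Exists(tessdataPath))
            return "Folder not found: " + tessdataPath;

        // The language file is expected directly inside the tessdata folder.
        string file = Path.Combine(tessdataPath, trainedDataFile);
        if (!File.Exists(file))
            return "Language file not found: " + file;

        return null;
    }
}
```

In the page code this could be called as TessdataCheck.Describe(Server.MapPath("~/tessdata"), "swe.traineddata") just before the Engine is created. A null result suggests the folder and language file are in place, so the cause is more likely somewhere else.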

Here are all my includes:
using PdfSharp;
using PdfSharp.Pdf;
using PdfSharp.Pdf.Advanced;
using PdfSharpTextExtractor;
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Security.Principal;
using System.ServiceModel;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using TesseractOCR;
using TesseractOCR.Enums;
using static System.Windows.Forms.VisualStyles.VisualStyleElement;

I hope you can help me.
Best regards
Fredrik Andersson

@FredrikAnderssonRV

Solved it! I'm using Costura.Fody, and TesseractOCR does not work when TesseractOCR.dll gets embedded. My solution was to add TesseractOCR.dll to my project manually and set it to "Copy always" under Properties.
Best regards
Fredrik Andersson
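
Editor's note: the thread does not say why embedding breaks Engine creation. One possible explanation, offered as an assumption and not confirmed by the maintainer, is that something resolves files relative to the assembly's location on disk. An assembly loaded from memory, which is how embedded references are typically loaded, reports an empty Assembly.Location. The sketch below shows how to check for that. The class name AssemblyOrigin and the method IsLoadedFromFile are hypothetical.

```csharp
using System;
using System.Reflection;

public static class AssemblyOrigin
{
    // Hypothetical helper: true when the assembly was loaded from a file on disk.
    // Assemblies loaded from a byte array report an empty Location.
    public static bool IsLoadedFromFile(Assembly assembly)
    {
        if (assembly == null)
            throw new ArgumentNullException(nameof(assembly));

        return !string.IsNullOrEmpty(assembly.Location);
    }
}
```

In a project that references TesseractOCR, calling AssemblyOrigin.IsLoadedFromFile(typeof(TesseractOCR.Engine).Assembly) would show whether the library was loaded from a DLL on disk or from an embedded resource. A false result would be consistent with the fix described above, where the DLL is copied to the output folder.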

@Sicos1977 Sicos1977 added the question Further information is requested label Sep 18, 2024
@Sicos1977 Sicos1977 self-assigned this Sep 18, 2024