ImageEn for Delphi and C++ Builder

 

ImageEn Forum
 IEVision OCR text position

aleatprog

142 Posts

Posted - Aug 13 2025 : 08:52:10
Hi,

When creating a searchable PDF with ievision64.dll 8.1.6.0, the text is extracted correctly, but the underlying text layer ends one character before the end of the line. Is there a workaround to position the extracted text accurately?

1. Result: [screenshot not available]

2. Code:
ImageEnView := TImageEnView.Create(nil);
try
  // Load the source PDF through the PDF viewer
  ImageEnView.PdfViewer.Enabled := True;
  ImageEnView.PdfViewer.LoadFromFile(SourcePath);
  // Create the searchable PDF generator for the given OCR data path and language
  pdfGen := IEVisionLib.createSearchablePDFGenerator(PAnsiChar(AnsiString(OcrPath)), PAnsiChar(AnsiString(LanguageCode)));
  pdfGen.beginDocument(PAnsiChar(AnsiString(DestinationPath)), PAnsiChar(AnsiString(PdfTitle)));
  for i := 0 to ImageEnView.PdfViewer.PageCount - 1 do
    begin
      IEBitmap := TIEBitmap.Create;
      try
        // Draw the current page to a bitmap and add it as a page of the output PDF
        ImageEnView.PdfViewer.PageIndex := i;
        ImageEnView.PdfViewer.DrawTo(IEBitmap);
        pdfGen.addPage(IEBitmap.GetIEVisionImage());
      finally
        IEBitmap.Free;
      end;
    end;
  pdfGen.endDocument();
finally
  ImageEnView.Free;
end;


Any hint is appreciated. :)
Ale

xequte

39110 Posts

Posted - Aug 13 2025 : 16:49:52
Hi Ale

The way Tesseract creates these PDFs is to output the original [image] content and overlay it with hidden text (captured by the OCR). It works quite well, but is not perfect due to the vagaries of fonts and scaling. The viewer makes a big difference. Ironically, I find that Adobe Viewer often gives this "one-character off" effect, whereas Chrome works perfectly.



Nigel
Xequte Software
www.imageen.com
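
To illustrate why a hidden text layer can end short in one viewer and fit in another, here is a minimal arithmetic sketch. It is not IEVision or Tesseract code. It assumes, purely for illustration, that each hidden text run uses a fixed advance per glyph and is stretched horizontally to fill the OCR box, and that a viewer may apply slightly different glyph metrics. The names RunWidth and StretchToFit and all the numbers are hypothetical.

```pascal
program HiddenTextFit;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Width in points of a hidden text run when every glyph advances by
// AdvanceEm * FontSize and the run is stretched horizontally by StretchPct.
function RunWidth(CharCount: Integer; FontSize, AdvanceEm, StretchPct: Double): Double;
begin
  Result := CharCount * FontSize * AdvanceEm * StretchPct / 100.0;
end;

// Stretch (in percent) that makes the run exactly as wide as the OCR box.
function StretchToFit(BoxWidth: Double; CharCount: Integer; FontSize, AdvanceEm: Double): Double;
begin
  Result := 100.0 * BoxWidth / (CharCount * FontSize * AdvanceEm);
end;

var
  Stretch: Double;
begin
  // Hypothetical line: 40 characters in a 400 pt wide box, 12 pt text,
  // generator assumes an advance of 0.5 em per glyph.
  Stretch := StretchToFit(400.0, 40, 12.0, 0.5);
  WriteLn(Format('stretch: %.2f', [Stretch]));

  // A viewer using the same advance reproduces the full box width.
  WriteLn(Format('same metrics: %.1f pt', [RunWidth(40, 12.0, 0.5, Stretch)]));

  // A viewer assuming a slightly narrower advance (assumed 0.4875 em)
  // ends the run before the right edge of the box.
  WriteLn(Format('narrower metrics: %.1f pt', [RunWidth(40, 12.0, 0.4875, Stretch)]));
end.
```

With these made-up numbers, a 2.5% difference in assumed glyph advance shortens a 40-character run by about the width of one character, which is the kind of offset described above. Whether this is the actual cause in a given viewer is not confirmed here.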