ImageEn for Delphi and C++ Builder
ImageEn and IEvolution Support Forum
 PdfViewer Find & GetTextRects


TOPIC REVIEW
aleatprog Posted - Feb 04 2023 : 11:22:46
Hi

I would like to identify a specific text position inside a PDF table in order to extract a value located at a position relative to the search term.

ImageEnView1.PdfViewer.Find(SearchTerm, False, True, False, False, True) correctly highlights the search term, so I thought I would use GetTextRects to get the text rect of the search term and then add the relative XY distance to the text I want to extract. Is that the correct way?

If yes, how can I get the CharIndex from the calculated XY position on the PDF page (not on the screen, as with ScrToCharIndex)?

Al
3 LATEST REPLIES (Newest First)
xequte Posted - Feb 08 2023 : 14:56:50
Hi Al

Sorry, I missed the most important part of your first message.

I will expose the following methods in the coming update (ready next week probably):

PdfViewer.ScrToPage();
PdfViewer.PageToScr();

PdfViewer.CurrentPage.Width gives a raw PDFium value, so it may not be the same as PdfViewer.PageWidth, e.g. when the DPI has been changed. Generally you should only use PdfViewer.PageWidth.


Nigel
Xequte Software
www.imageen.com
aleatprog Posted - Feb 08 2023 : 04:52:05
Hi Nigel,

Thank you for the reply. I had tried the above code without success before opening this thread. In the meantime, I solved the ScrToCharIndex coordinate problem by processing the file in the background with a temporary TImageEnView object, using the following code:

memo1.Clear;
// Temporary, non-visible viewer used only for background processing
ImageEnView1 := TImageEnView.Create(Self);
try
  ImageEnView1.PdfViewer.Enabled := True;
  ImageEnView1.PdfViewer.LoadFromFile(path);
  s := InputBox('Find', 'search term', '');
  ImageEnView1.PdfViewer.Find(s, False, False, False, False, False);
  // Collect the rects of every occurrence of the search term
  while ImageEnView1.PdfViewer.FindNext( wordIdx, wordLen, False, False ) do
    rects := rects + ImageEnView1.PdfViewer.GetTextRects( wordIdx, wordLen );
  // For each hit, look up the text at X position dstX on the same row
  for i := Low(rects) to High(rects) do
    begin
      CharIndex := ImageEnView1.PdfViewer.ScrToCharIndex(dstX, rects[i].Top, 2.0, 1.0);
      ImageEnView1.PdfViewer.CharIndexToWord(CharIndex, TextIndex, TextLength);
      memo1.Lines.Add(ImageEnView1.PdfViewer.GetText(CharIndex, TextLength));
    end;
finally
  ImageEnView1.Free;
end;


Comparing the coordinates with their relative page position revealed a discrepancy:
ImageEnView1.PdfViewer.PageWidth <> ImageEnView1.PdfViewer.CurrentPage.Width

ImageEnView1.PdfViewer.PageWidth returns the correct value.

Al
xequte Posted - Feb 04 2023 : 21:05:26
Hi Al

If you set the DoSelect parameter of TIEPdfViewerInteraction.Find() to False, the search becomes non-visual.

You can then use FindNext() to get the CharIndex and Count:

http://www.imageen.com/help/TIEPdfViewerInteraction.FindNext.html

And GetTextRects() to convert those values into rects:

http://www.imageen.com/help/TIEPdfViewerInteraction.GetTextRects.html

// Get the location of "Adobe" on the page
var
  wordIdx, wordLen, i: Integer;
  rects : TIERectArray;
begin
  memo1.Lines.Clear();

  // Get the location of "Adobe" by index and count
  ImageEnView1.PdfViewer.Find( 'Adobe', False, False, False, False );

  // Convert index and count to rects (only if a match was found,
  // otherwise wordIdx and wordLen are not valid)
  if ImageEnView1.PdfViewer.FindNext( wordIdx, wordLen, False, False ) then
    rects := ImageEnView1.PdfViewer.GetTextRects( wordIdx, wordLen )
  else
    rects := nil;

  if rects = nil then
    memo1.Lines.Add( 'NOT FOUND!' )
  else
  for i := Low( rects ) to High( rects ) do
    memo1.Lines.Add( Format( '(%d, %d, %d, %d)', [ rects[i].Left, rects[i].Top, rects[i].Right, rects[i].Bottom ]));
end;


Nigel
Xequte Software
www.imageen.com
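
The "rect plus relative offset" step discussed in this thread is plain arithmetic. It can be sketched on its own, without any ImageEn calls. This is only an illustration: OffsetFromRect is a hypothetical helper, not part of ImageEn, and the rect and offset values are made up. In practice the rect would come from GetTextRects, and the result would be in whatever coordinate space that rect uses.

```pascal
program RelativeOffsetSketch;

{$APPTYPE CONSOLE}

uses
  System.Types, System.SysUtils;

// Hypothetical helper (not part of ImageEn): returns the point located at a
// fixed offset from the top-left corner of the rect found for the search term.
function OffsetFromRect(const Found: TRect; DeltaX, DeltaY: Integer): TPoint;
begin
  Result := Point(Found.Left + DeltaX, Found.Top + DeltaY);
end;

var
  found: TRect;
  target: TPoint;
begin
  // Example values only; a real rect would be one entry of the
  // array returned by GetTextRects
  found := Rect(100, 200, 160, 215);

  // Value to extract is assumed to sit 250 units to the right, same row
  target := OffsetFromRect(found, 250, 0);

  WriteLn(Format('(%d, %d)', [target.X, target.Y]));  // prints (350, 200)
end.
```

The resulting point could then be passed to a lookup such as ScrToCharIndex, as in Al's code above, provided both use the same coordinate space.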