ImageEn, unit ievision

TIEVisionOCR.getStructure


Declaration

function getStructure(format: TIEVisionOCRStructureFormat; pageNumber: Integer = 0): TIEVisionWString; safecall;


Description

Returns the recognition result as a structured string with markup, generated from the internal data structures.
Three formats are supported:
  hOCR: An open standard for representing formatted text obtained from OCR
  ALTO: The "Analyzed Layout and Text Object" format is an XML Schema that details technical metadata describing the layout and content of physical text resources, such as pages of a book or a newspaper
  Tab-Separated Values: A plain-text format that lists all words and their pixel positions in the image

getStructure must be called after recognize.

Parameter     Description
format        Output format.
pageNumber    Optional page number to embed into the output text.


Example

// getStructure requires a prior call to recognize
m_OCR.recognize(ImageEnView1.IEBitmap.GetIEVisionImage(), IEVisionRect(0, 0, 0, 0));
// Show the result as hOCR markup
memo1.Lines.Text := m_OCR.getStructure(ievOCRStructHOCR).c_str();
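
The Tab-Separated Values output can be post-processed to pick out individual words and their positions. The following is a minimal sketch only. It assumes the TSV text uses the twelve-column layout produced by Tesseract (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text), which this page does not confirm. SplitTabs and SampleTsv are hypothetical names, and the sample rows are made-up data standing in for the string returned by getStructure.

```pascal
program ParseOcrTsv;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: splits S on tab characters, keeping empty fields.
procedure SplitTabs(const S: string; Fields: TStrings);
var
  Start, I: Integer;
begin
  Fields.Clear;
  Start := 1;
  for I := 1 to Length(S) do
    if S[I] = #9 then
    begin
      Fields.Add(Copy(S, Start, I - Start));
      Start := I + 1;
    end;
  Fields.Add(Copy(S, Start, Length(S) - Start + 1));
end;

const
  // Made-up sample rows (assumed Tesseract-style column order),
  // not real OCR output. In an application this text would come
  // from getStructure with the TSV format selected.
  SampleTsv =
    '4'#9'1'#9'1'#9'1'#9'1'#9'0'#9'10'#9'12'#9'180'#9'24'#9'-1'#9#13#10 +
    '5'#9'1'#9'1'#9'1'#9'1'#9'1'#9'10'#9'12'#9'80'#9'24'#9'96'#9'Hello'#13#10 +
    '5'#9'1'#9'1'#9'1'#9'1'#9'2'#9'100'#9'12'#9'90'#9'24'#9'93'#9'world'#13#10;

var
  Lines, Fields: TStringList;
  I, Level: Integer;
begin
  Lines := TStringList.Create;
  Fields := TStringList.Create;
  try
    Lines.Text := SampleTsv;
    for I := 0 to Lines.Count - 1 do
    begin
      SplitTabs(Lines[I], Fields);
      // Skip a header row, short rows, and anything that is not a
      // word-level record (level 5 under the assumed layout).
      if (Fields.Count < 12) or not TryStrToInt(Fields[0], Level) or (Level <> 5) then
        Continue;
      // Columns 6..9 are left, top, width, height; 10 is confidence; 11 is the word.
      WriteLn(Format('%s at (%s, %s), size %s x %s, confidence %s',
        [Fields[11], Fields[6], Fields[7], Fields[8], Fields[9], Fields[10]]));
    end;
  finally
    Fields.Free;
    Lines.Free;
  end;
end.
```

The split is written by hand rather than with TStringList.DelimitedText so that empty fields and words containing quote characters are passed through unchanged.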