 WPCubed MakeImage plug in
jrpcguru

USA
254 Posts

Posted - Aug 29 2017 :  16:08:32
I can use the following sample code to save metadata to a PDF file. This is the metadata I would normally save as IPTC data for a JPG or TIF:

ImageEnMView1.MIO.ParamS[0].PDF_Title :=IPTCTitle.Text; //title
ImageEnMView1.MIO.ParamS[0].PDF_Creator :=IPTCauthor.Text; //location
ImageEnMView1.MIO.ParamS[0].PDF_Author :=IPTCSource.Text; //source
ImageEnMView1.MIO.ParamS[0].PDF_Subject :=IPTCauthorTitle.Text; //date
ImageEnMView1.MIO.ParamS[0].PDF_Keywords :=IPTCCaption.Text; //description

I have not found a way of reading the metadata back when reloading a PDF file using the MakeImage plugin. For example:

ImageEnMView1.mio.Params[idx].PDF_Title is blank, even if Adobe Reader can display the matching metadata.

J.R.

xequte

38182 Posts

Posted - Aug 29 2017 :  17:18:38
Hi JR

Yes, the WPDF Plug-In does not fill those fields. You might want to make a request to Julian.



Nigel
Xequte Software
www.xequte.com
nigel@xequte.com

jrpcguru

USA
254 Posts

Posted - Sep 20 2017 :  16:15:23
I still haven't been given permission to join the WPCubed forum. Thanks to your assistance, I've made good progress on saving and reloading PDFs. I used a hex editor to work out how PDF metadata is stored and found quite a bit of variation between files. I offer the following code for reading PDF metadata, in the hope that others will benefit from it and perhaps contribute improvements. It does not read every type of metadata, but it handled a fairly good sampling of files. It could be cut down to read only the metadata format written by ImageEn, which appears to have been consistent for at least several years.

function ExtractPDFMetadata(const aPDFFileName: TFileName; out outPDF_Title : string;
  out outPDF_Author:string; out outPDF_Subject:string; out outPDF_Keywords: string;
  out outPDF_Creator : string; out outPDF_Producer:string): UTF8String;
//modified from Stack Overflow:  https://stackoverflow.com/questions/7130287/easiest-way-to-read-pdf-a-metadata-from-a-delphi-app?rq=1
//requires: uses SysUtils, Classes, StrUtils;
//Does not read compressed or encrypted PDF metadata
//Does not parse metadata stored as XMP (very little of this was found); if XMP is found, only the Producer value is returned
var tmp: RawByteString;
    i, j, k: integer;
    sMsg : string;
    iStartingPoint : integer;
//==============================================
function subHexToString(H: String): String;
//only a few samples needed this.
var I: Integer;
begin
  Result:= '';
  for I := 1 to length (H) div 2 do
    Result:= Result+Char(StrToInt('$'+Copy(H,(I-1)*2+1,2)));
end;

function subRemoveNull ( str : string) : string;
//only a few samples needed this
var
  n : integer;
begin
n := 1;
while n <= Length(str) do
  begin
    if str[n] = #0 then
      begin
        Delete(str, n, 1);
        Continue;
      end;
    inc(n);
  end;
  result := str;
end;
//===============================================
function SubReadValue(inOpenTag : string; inCloseTag : string; inStartingPoint : integer) : string;
var
    i, j: integer;
    iTagStart : integer;
    sRaw : RawByteString;
    sResult : string;
begin
  result := '';
  if inStartingPoint > 1  then   //PosEx returns 0 for an offset below 1, so never step back past position 1
    inStartingPoint := inStartingPoint -1;
  iTagStart := PosEx(inOpenTag,tmp, inStartingPoint);


  if (iTagStart <> 0) then   //find end tag
    begin
      j := iTagStart + Length(inOpenTag)-2 ;  //starting point to read tag
      i := PosEx(inCloseTag,tmp,j+1);          //all these different tag formats have been seen in PDF files via HexEditor

      if (i > 0) and (j > 0)  then
        begin
        sRaw := AnsiMidStr(tmp,j+2,i-j -2);
        sResult := AnsiLeftStr(sRaw,4);
        if sResult = 'FEFF' then
          begin
            sRaw := AnsiMidStr(sRaw,5, length(sRaw));
            sRaw := subHexToString(sRaw);
          end
        else
          begin
            sRaw := AnsiReplaceText(sRaw,#255,'');
            sRaw := AnsiReplaceText(sRaw,#254,'');

          end;
        sResult := sRaw;
        Result := subRemoveNull((sResult));
        result := AnsiReplaceText(result,'\(','(');  //restore escaped ( and ) so they are not mistaken for delimiters
        result := AnsiReplaceText(result,'\)',')');
        result := AnsiReplaceText(result,'\<','<');  //restore escaped < and > so they are not mistaken for delimiters
        result := AnsiReplaceText(result,'\>','>');
        end;
    end;

  end;
//===============================================
function SubFindProducer: string;
//search for XMP metadata
var
    i, j: integer;
    iTagStart : integer;
begin
  result := '';
  iTagStart := Pos('<pdf:Producer>',tmp);
  if iTagStart > 0 then
    begin
      i :=  iTagStart;
      while i <> 0 do
        begin
          i := PosEx('<pdf:Producer>',tmp, iTagStart + 1);
          if i > 0 then
            iTagStart := i

        end;
    end;

  if (iTagStart <> 0) then
    begin
      i := PosEx('</pdf:Producer>',tmp, iTagStart + 1);
      j := iTagStart + Length('<pdf:Producer>');

      if (i > 0) and (j > 0)  then
        result := AnsiMidStr(tmp,j,i-j);
    end;

end;
//===============================================
//===============================================
function SubFindStartingPoint(inOpenTag : string) : integer;
var
    i: integer;
    iTagStart : integer;
begin
  result := 0;
  iTagStart := Pos(inOpenTag,tmp);

  if iTagStart > 0 then
    begin
      i :=  iTagStart;
      while i <> 0 do     //now search for additional metadata blocks
        begin
          i := PosEx(inOpenTag ,tmp, iTagStart + 1);
          if i > 0 then
            iTagStart := i ;

        end;
    end;
    result := iTagStart;
  end;
//===============================================
function subChooseTagFormat(inTagName : string; inStartingPoint : integer) : string;
var
    iTagStart : integer;
    iEndTag, j : integer;
    sResult : RawByteString;
begin
  Result := '';  //defined return value when none of the delimiter combinations matches
  sResult := SubReadValue(inTagName + '(' + #254+#255 + ' ', ')/', inStartingPoint);     //Test each logical combination of delimiters
  if sResult = '' then
    sResult := SubReadValue(inTagName + '(' + #254+#255 + ' ', ')>', inStartingPoint);     //Test each logical combination of delimiters
  if sResult = '' then
    sResult := SubReadValue(inTagName + '(', ')/', inStartingPoint);     //Test each logical combination of delimiters
  if sResult = '' then
    sResult := SubReadValue(inTagName + '(', ') /', inStartingPoint);
  if sResult = '' then
    sResult := SubReadValue(inTagName + '(', ')' + #10 + '/', inStartingPoint);
  if sResult = '' then
    sResult := SubReadValue(inTagName + '(', ')' + #13 + '/', inStartingPoint);
  if sResult = '' then
    sResult := SubReadValue(inTagName + '(', ')', inStartingPoint);
  if sResult = '' then
    sResult := SubReadValue(inTagName + '(', ')>', inStartingPoint);

  if sResult = '' then
    sResult := SubReadValue(inTagName + ' (', ')' + #10 + '/', inStartingPoint);
  if sResult = '' then
    sResult := SubReadValue(inTagName + ' (', ')' + #13 + '/', inStartingPoint);
  if sResult = '' then
    sResult := SubReadValue(inTagName + ' (', ') /', inStartingPoint);
  if sResult = '' then
    sResult := SubReadValue(inTagName + ' (', ')', inStartingPoint);

  if sResult = '' then
    sResult := SubReadValue(inTagName + ' <', '>' + #13, inStartingPoint);
  if sResult = '' then
    sResult := SubReadValue(inTagName + ' <', '>', inStartingPoint);

if sResult <> '' then //some end tags do not match. Find the earliest end tag in Result and use it to extract correct value
  begin
      iEndTag := PosEx(')/',sResult,1);
      if iEndTag > 0 then
        sResult := AnsiMidStr(sResult,1,iEndTag - 1);

      iEndTag := PosEx(') /',sResult,1);
      if iEndTag > 0 then
        sResult := AnsiMidStr(sResult,1,iEndTag - 1);

      iEndTag := PosEx(')' + #10 + '/',sResult,1);
      if iEndTag > 0 then
        sResult := AnsiMidStr(sResult,1,iEndTag - 1);

      iEndTag := PosEx(')' + #13 + '/',sResult,1);
      if iEndTag > 0 then
        sResult := AnsiMidStr(sResult,1,iEndTag - 1);

      iEndTag := PosEx(') /',sResult,1);
      if iEndTag > 0 then
        sResult := AnsiMidStr(sResult,1,iEndTag - 1);

      iEndTag := PosEx(')' + #13 + '>',sResult,1);
      if iEndTag > 0 then
        sResult := AnsiMidStr(sResult,1,iEndTag - 1);

      iEndTag := PosEx('>' + #13,sResult,1);
      if iEndTag > 0 then
        sResult := AnsiMidStr(sResult,1,iEndTag - 1);

      iEndTag := PosEx('>' ,sResult,1);
      if iEndTag > 0 then
        sResult := AnsiMidStr(sResult,1,iEndTag - 1);

       Result := sResult;
  end;

end;
//===============================================
  begin   //ExtractPDFMetadata()
  with TFileStream.Create(aPDFFileName,fmOpenRead) do
  try
    SetLength(tmp,Size);
    Read(tmp[1],Size);
  finally
    Free;
  end;
  result := '';
//  tmp := AnsiRightStr(tmp,10000);  //some metadata turns up at beginning of file so can't shorten the string
  iStartingPoint := SubFindStartingPoint('<x:xmpmeta');
  if iStartingPoint > 0 then
    begin
      outPDF_Producer :=  SubFindProducer;   //only report Producer value from XMP.
      exit;
    end;

  iStartingPoint := SubFindStartingPoint('/Title');
  if iStartingPoint = 0 then iStartingPoint := Length(tmp);  //try to find things before the EOF

  i := SubFindStartingPoint('/Author');
  if (i > 0) and (i < iStartingPoint) then iStartingPoint := i;

  i := SubFindStartingPoint('/Subject');
  if (i > 0) and (i < iStartingPoint) then iStartingPoint := i;

  i := SubFindStartingPoint('/Keywords');
  if (i > 0) and (i < iStartingPoint) then iStartingPoint := i;

  i := SubFindStartingPoint('/Creator');
  if (i > 0) and (i < iStartingPoint) then iStartingPoint := i;

  i := SubFindStartingPoint('/Producer');
  if (i > 0) and (i < iStartingPoint) then iStartingPoint := i;

    if iStartingPoint = Length(tmp) then
      exit;
  iStartingPoint := iStartingPoint -1;  //start looking just before first tag

  outPDF_Title := subChooseTagFormat('/Title', iStartingPoint);     //subChooseTagFormat tests each delimiter combination in turn
  outPDF_Author := subChooseTagFormat('/Author', iStartingPoint);
  outPDF_Subject := subChooseTagFormat('/Subject', iStartingPoint);
  outPDF_Keywords := subChooseTagFormat('/Keywords', iStartingPoint);
  outPDF_Creator := subChooseTagFormat('/Creator', iStartingPoint);
  outPDF_Producer := subChooseTagFormat('/Producer', iStartingPoint);



  result := 'Title = ' + outPDF_Title + sLineBreak + 'Author = ' + outPDF_Author + sLineBreak +
    'Subject = ' + outPDF_Subject + sLineBreak + 'Keywords = ' + outPDF_Keywords + slineBreak +
    'Creator = ' + outPDF_Creator + sLineBreak + 'Producer = ' + outPDF_Producer;


end;
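
A note on the FEFF branch in SubReadValue: subHexToString turns each pair of hex digits into one character and relies on subRemoveNull to drop the zero bytes. That works for Latin-1 text but would mangle characters above U+00FF. Below is a minimal sketch of an alternative that reads four hex digits per 16-bit code unit. It assumes a Unicode Delphi (2009 or later, where Char is 16 bits) and assumes the hex string is UTF-16BE, as the FEFF marker suggests. The name DecodeUtf16BEHex is made up for this illustration, and the sketch has not been run against real PDF files.

```pascal
program HexStringDemo;
{$IFDEF FPC}{$MODE DELPHIUNICODE}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper (not part of ImageEn or of the code above).
// Decodes a PDF hex string such as 'FEFF00480069' as UTF-16BE,
// reading four hex digits per 16-bit code unit.
function DecodeUtf16BEHex(const H: string): string;
var
  s: string;
  i, code: Integer;
begin
  Result := '';
  s := H;
  if SameText(Copy(s, 1, 4), 'FEFF') then
    Delete(s, 1, 4);                        // drop the byte order mark
  i := 1;
  while i + 3 <= Length(s) do
  begin
    code := StrToIntDef('$' + Copy(s, i, 4), -1);
    if code < 0 then
      Break;                                // invalid hex: keep what was decoded so far
    Result := Result + Char(code);
    Inc(i, 4);
  end;
end;

begin
  Writeln(DecodeUtf16BEHex('FEFF00480069'));
end.
```

With a decoder like this the subRemoveNull step would no longer be needed for hex strings, since the zero high bytes are consumed as part of each code unit. White space inside the hex string, which PDF allows, is not handled here.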



Hopefully this is helpful to someone.

J.R.

jrpcguru

USA
254 Posts

Posted - Jan 04 2018 :  14:10:33
I have now managed to improve the PDF metadata code that I posted back in September. It now avoids Out of Memory errors on large files by reading the file in blocks, starting from the end, instead of loading it whole. It also translates several more metadata formats, including what is probably a crude translation of XML (XMP) metadata; I don't have many samples of XML metadata to test with. Judging from the lack of response to my original posts, it would seem that reading metadata is not a high priority for your users.
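
The block-wise reading idea can be shown in isolation. The following is a simplified sketch, separate from the function posted below: it searches a file backwards in fixed-size blocks for the last occurrence of a marker, and lets each read run a few bytes into the block searched before it so that a marker straddling two blocks is not missed. FindLastMarker, LastPosBytes, WriteSampleFile and the sample file name are all hypothetical names chosen for this illustration. The byte-wise search is written out by hand to avoid any string conversion of the raw file data.

```pascal
program TailScanDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Position (1-based) of the last occurrence of aSub in aText, or 0.
function LastPosBytes(const aSub, aText: RawByteString): Integer;
var
  i, j: Integer;
begin
  Result := 0;
  if aSub = '' then
    Exit;
  for i := Length(aText) - Length(aSub) + 1 downto 1 do
  begin
    j := 1;
    while (j <= Length(aSub)) and (aText[i + j - 1] = aSub[j]) do
      Inc(j);
    if j > Length(aSub) then
    begin
      Result := i;
      Exit;
    end;
  end;
end;

// Hypothetical helper: 0-based file offset of the last occurrence of
// aMarker, or -1. Reads at most aChunkSize + Length(aMarker) - 1 bytes
// at a time, working from the end of the file towards the start.
function FindLastMarker(const aFileName: string; const aMarker: RawByteString;
  aChunkSize: Integer): Int64;
var
  fs: TFileStream;
  buf: RawByteString;
  blockStart, blockEnd, readEnd: Int64;
  hit: Integer;
begin
  Result := -1;
  if (aMarker = '') or (aChunkSize < 1) then
    Exit;
  fs := TFileStream.Create(aFileName, fmOpenRead or fmShareDenyNone);
  try
    blockEnd := fs.Size;
    while blockEnd > 0 do
    begin
      blockStart := blockEnd - aChunkSize;
      if blockStart < 0 then
        blockStart := 0;
      readEnd := blockEnd + Length(aMarker) - 1;   // overlap into the previous block
      if readEnd > fs.Size then
        readEnd := fs.Size;
      SetLength(buf, Integer(readEnd - blockStart));
      fs.Position := blockStart;
      fs.ReadBuffer(buf[1], Length(buf));
      hit := LastPosBytes(aMarker, buf);
      if hit > 0 then
      begin
        Result := blockStart + hit - 1;
        Exit;
      end;
      blockEnd := blockStart;                      // move one block towards the start
    end;
  finally
    fs.Free;
  end;
end;

// Writes aData to aFileName; used only to build a sample file for the demo.
procedure WriteSampleFile(const aFileName: string; const aData: RawByteString);
var
  fs: TFileStream;
begin
  fs := TFileStream.Create(aFileName, fmCreate);
  try
    if aData <> '' then
      fs.WriteBuffer(aData[1], Length(aData));
  finally
    fs.Free;
  end;
end;

begin
  WriteSampleFile('tailscan_demo.bin', 'aa/Title(x)bbbb/Title(y)cc');
  Writeln(FindLastMarker('tailscan_demo.bin', '/Title', 8));
end.
```

Memory use stays bounded by the block size regardless of file size. The function below takes a similar approach with a 200 KB buffer and 190 KB steps, but parses the metadata in each block rather than returning an offset.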


function ExtractPDFMetadata(const inPDFFileName: TFileName; out outPDF_Title : string;
  out outPDF_Author:string; out outPDF_Subject:string; out outPDF_Keywords: string;
  out outPDF_Creator : string; out outPDF_Producer:string): UTF8String;
//modified from Stack Overflow:  https://stackoverflow.com/questions/7130287/easiest-way-to-read-pdf-a-metadata-from-a-delphi-app?rq=1
//requires: uses SysUtils, Classes, StrUtils;
//Does not read compressed or encrypted PDF metadata
//Parses some metadata stored as XMP; very little of this was found to test with.
var
    tmp: RawByteString;
    iReadStart : Int64;
    i, j: integer;
    iStartingPoint : integer;
    sXMLProducer : string;
    iRetryCount : integer;

    iBufferSize : integer;
  sCaption : string;

//==============================================
function subHexToString(H: String): String;
//only a few samples needed this.
var
  I: Integer;
  j: integer;
  sTemp : string;
begin
  Result:= '';
  try
  for I := 1 to length (H) div 2 do
    begin
    sTemp := Copy(H,(I-1)*2+1,2);
    j := StrToInt('$'+ sTemp);
    result := Result + Char(j);
    end;
  except
//    result := '';    //retain current data up until error
  end;
end;

function subRemoveNull ( str : string) : string;
//only a few samples needed this
var
  n : integer;
begin
n := 1;
while n <= Length(str) do
  begin
    if str[n] = #0 then
      begin
        Delete(str, n, 1);
        Continue;
      end;
    inc(n);
  end;
  result := str;
end;

function subTranslatePrimo(str : string) : string;
//sample from PrimoPDF printer driver
begin
  result := str;
  if PosEx('\376\377\000',str, 1) > 0 then
    begin
      if AnsiLeftStr(str,12) = '\376\377\000' then
        begin
          str := AnsiMidStr(str,13, length(str));
          result := AnsiReplaceText(str,'\000','');    //remove \000

        end;
    end;
end;

function SubGetEndTag(inStart : integer) : integer;
var
  iEndTag : integer;

begin
  result := Length(tmp);
      iEndTag := PosEx('>' + #10,tmp,inStart);
      if (iEndTag > 0) and (iEndTag < result) then
        Result := iEndTag;

      iEndTag := PosEx(')/',tmp,inStart);
      if (iEndTag > 0) and (iEndTag < result) then
        Result := iEndTag;

      iEndTag := PosEx(') /',tmp,inStart);
      if (iEndTag > 0) and (iEndTag < result) then
        Result := iEndTag;

      iEndTag := PosEx(')' + #10 + '/',tmp,inStart);
      if (iEndTag > 0) and (iEndTag < result) then
        Result := iEndTag;

      iEndTag := PosEx(')' + #13 + '/',tmp,inStart);
      if (iEndTag > 0) and (iEndTag < result) then
        Result := iEndTag;

      iEndTag := PosEx(') /',tmp,inStart);
      if (iEndTag > 0) and (iEndTag < result) then
        Result := iEndTag;

      iEndTag := PosEx(')>',tmp,inStart);
      if (iEndTag > 0) and (iEndTag < result) then
        Result := iEndTag;

      iEndTag := PosEx(')' + #13 + '>',tmp,inStart);
      if (iEndTag > 0) and (iEndTag < result) then
        Result := iEndTag;

      iEndTag := PosEx('>' + #13,tmp,inStart);
      if (iEndTag > 0) and (iEndTag < result) then
        Result := iEndTag;

      iEndTag := PosEx(')' + #10,tmp,inStart);
      if (iEndTag > 0) and (iEndTag < result) then
        Result := iEndTag;


end;
//===============================================
function SubReadValue(inOpenTag : string; inStartingPoint : integer) : string;
var
    i, j: integer;
    iTagStart : integer;
    sRaw : RawByteString;
    sResult : string;
begin
  result := '';
  if inStartingPoint > 1  then   //PosEx returns 0 for an offset below 1, so never step back past position 1
    inStartingPoint := inStartingPoint -1;
  iTagStart := PosEx(inOpenTag,tmp, inStartingPoint);


  if (iTagStart <> 0) then   //find end tag
    begin
      j := iTagStart + Length(inOpenTag)-2 ;  //starting point to read tag
      i := SubGetEndTag(j+1);

      if (i > 0) and (j > 0)  then
        begin
        sRaw := AnsiMidStr(tmp,j+2,i-j -2);
        sResult := AnsiLeftStr(sRaw,4);
        if sResult = 'FEFF' then
          begin
            sRaw := AnsiMidStr(sRaw,5, length(sRaw));
            sRaw := subHexToString(sRaw);
          end
        else
          begin
            sRaw := AnsiReplaceText(sRaw,#255,'');
            sRaw := AnsiReplaceText(sRaw,#254,'');

          end;
        sResult := sRaw;
        Result := subRemoveNull((sResult));
        result := subTranslatePrimo(result);
        end;
    end;

  end;
//===============================================

//===============================================
//===============================================
function SubFindXMLdata(inStartTag : string; inStopTag : string; inDataToSearch : string) : string;
//search for XML metadata within inDataToSearch data from FileStream
//start by searching inDataToSearch = tmp, if XML found then use it again with just the XMP data as the input
var
    i, j: integer;
    iTagStart : integer;
begin
  result := '';
  i := PosEx(inStartTag,inDataToSearch, 1);  //find first occurrence of starting tag
  iTagStart := i;  //0 when the starting tag is absent
  if iTagStart > 0 then   //now see if there are any more of this starting tag
    begin
      i :=  iTagStart;
      while i <> 0 do
        begin
          i := PosEx(inStartTag,inDataToSearch, iTagStart + 1);
          if i > 0 then
            iTagStart := i

        end;
    end;

  if (iTagStart <> 0) then
    begin
      i := PosEx(inStopTag,inDataToSearch, iTagStart + 1);
      j := iTagStart + Length(inStartTag);

      if (i > 0) and (j > 0)  then
        result := AnsiMidStr(inDataToSearch,j,i-j);
    end;
  if (result <> '') then // remove the RDF container tags; these are the only tags I have seen in files so far
    begin
      result := StringReplace(result,'<rdf:Alt>','',[rfReplaceAll,rfIgnoreCase]);
      result := StringReplace(result,'</rdf:Alt>','',[rfReplaceAll,rfIgnoreCase]);

      result := StringReplace(result,'<rdf:li>','',[rfReplaceAll,rfIgnoreCase]);
      result := StringReplace(result,'<rdf:li/>','',[rfReplaceAll,rfIgnoreCase]);

      result := StringReplace(result,'<rdf:li xml:lang="x-default">','',[rfReplaceAll,rfIgnoreCase]);
      result := StringReplace(result,'<rdf:li xml:lang="x-default"/>','',[rfReplaceAll,rfIgnoreCase]);

      result := StringReplace(result,'</rdf:li>','',[rfReplaceAll,rfIgnoreCase]);

      result := StringReplace(result,'<rdf:Bag>','',[rfReplaceAll,rfIgnoreCase]);
      result := StringReplace(result,'</rdf:Bag>','',[rfReplaceAll,rfIgnoreCase]);

      result := StringReplace(result,'<rdf:Seq>','',[rfReplaceAll,rfIgnoreCase]);
      result := StringReplace(result,'</rdf:Seq>','',[rfReplaceAll,rfIgnoreCase]);
      result := trim(result);
    end;

end;
//===============================================
//===============================================
function SubFindStartingPoint(inOpenTag : string) : integer;
var
    i: integer;
    iTagStart : integer;
begin
  result := 0;
  iTagStart := Pos(inOpenTag,tmp);

  if iTagStart > 0 then
    begin
      i :=  iTagStart;
      while i <> 0 do     //now search for additional metadata blocks, selecting the last one
        begin
          i := PosEx(inOpenTag ,tmp, iTagStart + 1);
          if i > 0 then
            iTagStart := i ;

        end;
    end;
    result := iTagStart;
  end;
//===============================================
function subChooseTagFormat(inTagName : string; inStartingPoint : integer) : string;
var
    iTagStart : integer;
    iEndTag, j : integer;
    sResult : RawByteString;
begin
//  sResult := SubReadValue(inTagName + '(' + #254+#255 + ' ', ')/', inStartingPoint);     //Test each logical combination of delimiters
  sResult := SubReadValue(inTagName + '(' + #254+#255 + #0, inStartingPoint);     //Test each logical combination of delimiters
  if sResult = '' then
    sResult := SubReadValue(inTagName + ' <' + #254+#255 , inStartingPoint);     //Test each logical combination of delimiters
  if sResult = '' then
    sResult := SubReadValue(inTagName + '(' + #254+#255 + ' ', inStartingPoint);     //Test each logical combination of delimiters
  if sResult = '' then
    sResult := SubReadValue(inTagName + ' <',  inStartingPoint);
  if sResult = '' then
    sResult := SubReadValue(inTagName + '<',  inStartingPoint);     //Test each logical combination of delimiters
  if sResult = '' then
    sResult := SubReadValue(inTagName + ' (', inStartingPoint);
  if sResult = '' then
    sResult := SubReadValue(inTagName + '(',  inStartingPoint);     //Test each logical combination of delimiters

    Result := sResult;
    //ImageEn converts ( to \( and ) to \) so we need to convert back
    result := AnsiReplaceText(result,'\(','(');  //restore literal ( and ) that were escaped so they would not be read as delimiters
    result := AnsiReplaceText(result,'\)',')');
    result := AnsiReplaceText(result,'\\','\');  //restore literal \

end;

function subReadSnagItStyleTags(inStartingPoint : integer; inTag : string) : string;
var
  sDataNumber : string;
  sTemp : string;
  I,J : integer;
begin
  result := '';
  I := PosEx(inTag ,tmp, inStartingPoint);
  if I > 0 then
    begin
      sDataNumber := AnsiMidStr(tmp,I + length(inTag) + 1,3);
      sDataNumber := trim(sDataNumber); //may be 2 or 3 character code
      sDataNumber := sDataNumber + ' 0 obj' + #13 +  '(';
      sTemp := AnsiMidStr(tmp,I,length(Tmp));
    end;
  I := PosEx(sDataNumber,sTemp, length(inTag) + 6);
  if I > 0 then   //only continue when the referenced object was actually found
    begin
      I := I + length(sDataNumber);
      J := PosEx(')' + #13 + 'endobj',sTemp, I);
      if J > I then
        result := AnsiMidStr(sTemp, I , J - I);
    end;
end;

Procedure subLoopThruMetaData;
var
  sXMLData : string;
begin
          i := SubFindStartingPoint('/Title');    //search in Tmp
          if (i > 0) then
            outPDF_Title := subChooseTagFormat('/Title', I);

          i := SubFindStartingPoint('/Author');
          if (i > 0) then
            outPDF_Author := subChooseTagFormat('/Author', I);

          i := SubFindStartingPoint('/Subject');
          if (i > 0) then
            outPDF_Subject := subChooseTagFormat('/Subject', I);

          i := SubFindStartingPoint('/Keywords');
          if (i > 0) then
            outPDF_Keywords := subChooseTagFormat('/Keywords', I);

          i := SubFindStartingPoint('/Creator');
          if (i > 0) then
            outPDF_Creator := subChooseTagFormat('/Creator', I);

          i := SubFindStartingPoint('/Producer');
          if (i > 0) then
            outPDF_Producer := subChooseTagFormat('/Producer', I)
          else
            begin
              sXMLData := SubFindXMLdata('<x:xmpmeta', '</x:xmpmeta' ,Tmp);

              if sXMLData <> '' then
                begin
                  sXMLProducer := SubFindXMLdata('<pdf:Producer>', '</pdf:Producer>', sXMLData);
                  outPDF_Producer := sXMLProducer;
                  if outPDF_Title = '' then
                    outPDF_Title := SubFindXMLdata('<dc:title>', '</dc:title>', sXMLData);
                  if outPDF_Author = '' then
                    outPDF_Author := SubFindXMLdata('<dc:creator>', '</dc:creator>', sXMLData);
                  if outPDF_Subject = '' then
                    outPDF_Subject := SubFindXMLdata('<dc:description>', '</dc:description>', sXMLData);

                  if outPDF_Creator = '' then
                    outPDF_Creator := SubFindXMLdata('<xap:CreatorTool>', '</xap:CreatorTool>', sXMLData);
                  if outPDF_Creator = '' then
                    outPDF_Creator := SubFindXMLdata('<xmp:CreatorTool>', '</xmp:CreatorTool>', sXMLData);

                  if outPDF_Keywords = '' then
                    outPDF_Keywords := SubFindXMLdata('<pdf:Keywords>', '</pdf:Keywords>', sXMLData);

//                  if (trim(outPDF_Title) + trim(outPDF_Author) + trim(outPDF_Subject) + trim(outPDF_Keywords) +
//                    trim(outPDF_Creator)) <> '' then
//                       outPDF_Producer := outPDF_Producer +  ' - XML Metadata found';
                end;
              end;


          if (trim(outPDF_Title) + trim(outPDF_Author) + trim(outPDF_Subject) + trim(outPDF_Keywords) +
              trim(outPDF_Creator)  + trim(outPDF_Producer) = '' ) then  //see if it follows SnagIt format
            begin
              i := SubFindStartingPoint('<</Author');
              outPDF_Producer := subReadSnagItStyleTags(I,'/Producer');
              outPDF_Title := subReadSnagItStyleTags(I,'/Title');
              outPDF_Author := subReadSnagItStyleTags(I,'/Author');
              outPDF_Subject  := subReadSnagItStyleTags(I,'/Subject');
              outPDF_Keywords  := subReadSnagItStyleTags(I,'/Keywords');
              outPDF_Creator  := subReadSnagItStyleTags(I,'/Creator');//didn't find this in my samples
            end;

          if (trim(outPDF_Title) + trim(outPDF_Author) + trim(outPDF_Subject) + trim(outPDF_Keywords) +
             trim(outPDF_Creator) = '') then            //+ trim(outPDF_Producer) = '' )
            begin
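              if iReadStart <= 1 then
                iReadStart := 0    //this pass already reached the start of the file; 0 ends the read loop
              else
                begin
                  iReadStart := iReadStart - (1024*190);  //allow overlap with prior read
                  if iReadStart < 1 then
                    begin
                      iBufferSize := iBufferSize + iReadStart - 1; //only the remaining file to search
                      iReadStart := 1;
                    end;
                end;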
            end;

end;
//===============================================
  begin   //ExtractPDFMetadata()
  iBufferSize := 1024*200;
  try
    with TFileStream.Create(inPDFFileName,fmOpenRead or fmShareDenyNone) do
    try
      iReadStart := Size - iBufferSize;
      if iReadStart <=0 then
        begin
          iReadStart := 1;  //buffer is bigger than the file
          iBufferSize := size;
        end;
      while iReadStart >= 1 do

        begin
          Seek(iReadStart, soFromBeginning);
          SetLength(tmp,iBufferSize);
          Read(tmp[1],iBufferSize);
          subLoopThruMetaData;
          if (trim(outPDF_Title) + trim(outPDF_Author) + trim(outPDF_Subject) + trim(outPDF_Keywords) +
          trim(outPDF_Creator)  + trim(outPDF_Producer) <> '' )  then
              break;
          if (iReadStart = 1) then
            begin
              Seek(iReadStart, soFromBeginning);
              SetLength(tmp,iBufferSize);
              Read(tmp[1],iBufferSize);
              subLoopThruMetaData;
              break;
            end;
        end;
    finally
      Free;
    end;
  except
    begin
      outPDF_Title := 'Unable to read Title';
      outPDF_Author := 'Unable to read Source';
      outPDF_Subject := 'Unable to read Date';
      outPDF_Keywords := 'Unable to read Description';
      outPDF_Creator := 'Unable to read Location';
      outPDF_Producer := 'Unable to read Software Name';
      exit;
    end;

  end;
  result := '';


  result := 'Title = ' + outPDF_Title + sLineBreak + 'Author = ' + outPDF_Author + sLineBreak +
    'Subject = ' + outPDF_Subject + sLineBreak + 'Keywords = ' + outPDF_Keywords + slineBreak +
    'Creator = ' + outPDF_Creator + sLineBreak + 'Producer = ' + outPDF_Producer;

end;
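
The chained replacements at the end of subChooseTagFormat and in subTranslatePrimo cover the escapes seen in the sample files. The PDF specification defines a slightly larger set for literal strings: \n, \r, \t, \b, \f, \(, \), \\, an octal form \ddd of one to three digits, and a backslash at the end of a line as a continuation. Below is a minimal single-pass sketch that handles all of these, which also sidesteps ordering problems between successive replacements. UnescapePdfLiteral is a hypothetical name, the function works on raw bytes, and it has not been tested against real PDF files.

```pascal
program PdfUnescapeDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper (not part of ImageEn or of the code above).
// Expands the backslash escapes of a PDF literal string in one pass.
function UnescapePdfLiteral(const S: RawByteString): RawByteString;
var
  i, n, outLen, code, digits: Integer;
  c: AnsiChar;
  emit: Boolean;
begin
  n := Length(S);
  SetLength(Result, n);              // output is never longer than input
  outLen := 0;
  i := 1;
  while i <= n do
  begin
    c := S[i];
    Inc(i);
    emit := True;
    if c = '\' then
    begin
      if i > n then
        Break;                       // lone trailing backslash: nothing to escape
      c := S[i];
      Inc(i);
      case c of
        'n': c := #10;
        'r': c := #13;
        't': c := #9;
        'b': c := #8;
        'f': c := #12;
        '0'..'7':
          begin
            code := Ord(c) - Ord('0');
            digits := 1;
            while (digits < 3) and (i <= n) and (S[i] in ['0'..'7']) do
            begin
              code := code * 8 + Ord(S[i]) - Ord('0');
              Inc(i);
              Inc(digits);
            end;
            c := AnsiChar(code and $FF);
          end;
        #13:
          begin
            emit := False;           // backslash + CR (or CRLF) continues the line
            if (i <= n) and (S[i] = #10) then
              Inc(i);
          end;
        #10:
          emit := False;             // backslash + LF continues the line
      end;
      // any other character, including ( ) and \, is kept as itself
    end;
    if emit then
    begin
      Inc(outLen);
      Result[outLen] := c;
    end;
  end;
  SetLength(Result, outLen);
end;

begin
  Writeln(UnescapePdfLiteral('Report \(draft\) \110\151'));
end.
```

The result is still a byte string. A value that starts with the bytes FE FF after unescaping (the PrimoPDF case, \376\377) is UTF-16BE and would need a separate decoding step.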



J.R.

xequte

38182 Posts

Posted - Jan 04 2018 :  20:45:17
Thanks JR,

I will test it out. You may not have got a lot of responses, but there were a lot of reads of your post.

Nigel
Xequte Software
www.imageen.com