Multiple XMP sections

Added by Goran Nagy about 5 years ago

I was wondering if I can get all XMP sections with Exiv2.
Using:

image->readMetadata();
Exiv2::XmpData &xmpData = image->xmpData();

gives me only one XMP section (not sure if it is first or last). Can I somehow get all of them?

I am not talking about one section spread over multiple APP1, but some images have several small XMP sections.


Replies (5)

RE: Multiple XMP sections - Added by Robin Mills about 5 years ago

I'm not sure that I understand what you mean by "one section". The version of XMPsdk embedded in Exiv2 does not support "Extended" XMP.

You can dump the structure of the JPG with the command:
$ exiv2 -pS path-to-image

You can extract the XMP with the command:
$ exiv2 -pX path-to-image > foo.xml

The code for -pX is independent of XMPsdk and extracts Extended XMP.

Normally there is only a single APP1/xmp segment in a JPG.

549 rmills@rmillsmbp:~/gnu/exiv2/trunk $ exiv2 -pS test/data/exiv2-bug1026.jpg
STRUCTURE OF JPEG FILE: test/data/exiv2-bug1026.jpg
 address | marker       |  length | data
       0 | 0xffd8 SOI  
       2 | 0xffe0 APP0  |      16 | JFIF.....,.,....
      20 | 0xffe1 APP1  |   12807 | Exif..MM.*......................
   12829 | 0xffe1 APP1  |    3448 | http://ns.adobe.com/xap/1.0/.<?x
   16279 | 0xffdb DQT   |      67 
   16348 | 0xffdb DQT   |      67 
   16417 | 0xffc2 SOF2  |      17 
   16436 | 0xffc4 DHT   |      28 
   16466 | 0xffc4 DHT   |      26 
   16494 | 0xffda SOS  
550 rmills@rmillsmbp:~/gnu/exiv2/trunk $  
With "Extended XMP", the signature of the additional segments is different:
STRUCTURE OF JPEG FILE: exiv2-bug922.jpg
 address | marker       |  length | data
       0 | 0xffd8 SOI  
       2 | 0xffe1 APP1  |     911 | Exif..MM.*.......%.........#....
     915 | 0xffe1 APP1  |     870 | http://ns.adobe.com/xap/1.0/.<x:
    1787 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
   67249 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  132711 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  198173 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  263635 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  329097 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  394559 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  460021 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  525483 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  590945 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  656407 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  721869 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  787331 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  852793 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  918255 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
  983717 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
 1049179 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
 1114641 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
 1180103 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
 1245565 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
 1311027 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
 1376489 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
 1441951 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
 1507413 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
 1572875 | 0xffe1 APP1  |   65460 | http://ns.adobe.com/xmp/extensio
 1638337 | 0xffe1 APP1  |   42907 | http://ns.adobe.com/xmp/extensio
 1681246 | 0xffe0 APP0  |      16 | JFIF............
 1681264 | 0xffdb DQT   |      67 
 1681333 | 0xffdb DQT   |      67 
 1681402 | 0xffc0 SOF0  |      17 
 1681421 | 0xffc4 DHT   |      31 
 1681454 | 0xffc4 DHT   |     181 
 1681637 | 0xffc4 DHT   |      31 
 1681670 | 0xffc4 DHT   |     181 
 1681853 | 0xffda SOS  
562 rmills@rmillsmbp:~/gnu/exiv2/trunk/test/data $ 
This file contains GImage data. Rather a lot of it.
563 rmills@rmillsmbp:~/gnu/exiv2/trunk/test/data $ exiv2 -pX exiv2-bug922.jpg | xmllint --pretty 2 - | wc
     19      23 1677470
564 rmills@rmillsmbp:~/gnu/exiv2/trunk/test/data $ 

I'm guessing that you have a file that has multiple APP1/xap segments. I'm not sure that is permitted. Perhaps you could attach a sample file that I can examine.
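The -pS listings above can be reproduced by walking the marker segments and looking at the signature at the start of each APP1 payload. The following is only a minimal sketch of that idea, not the Exiv2 implementation: the names Segment, hasSignature, classifyApp1 and scanSegments are made up for illustration, the whole file is assumed to be already loaded into memory, and the scan stops at the SOS marker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One marker segment, mirroring the columns printed by `exiv2 -pS`.
struct Segment {
    std::size_t address;  // offset of the 0xff byte that starts the marker
    std::uint8_t marker;  // second marker byte, e.g. 0xe1 for APP1
    std::size_t length;   // value of the length field (counts its own 2 bytes)
    std::string kind;     // "Exif", "XMP", "ExtendedXMP", or "" if unrecognised
};

// True if the `avail` bytes starting at `pos` begin with `sig`.
static bool hasSignature(const std::vector<std::uint8_t>& data, std::size_t pos,
                         std::size_t avail, const std::string& sig) {
    assert(pos + avail <= data.size());  // caller guarantees the payload is in range
    if (avail < sig.size()) return false;
    for (std::size_t i = 0; i < sig.size(); ++i) {
        if (data[pos + i] != static_cast<std::uint8_t>(sig[i])) return false;
    }
    return true;
}

// Classify an APP1 payload by its leading signature.
static std::string classifyApp1(const std::vector<std::uint8_t>& data,
                                std::size_t pos, std::size_t avail) {
    // The signatures include their trailing NUL bytes, hence the explicit lengths.
    static const std::string kExif("Exif\0\0", 6);
    static const std::string kXmp("http://ns.adobe.com/xap/1.0/\0", 29);
    static const std::string kExtXmp("http://ns.adobe.com/xmp/extension/\0", 35);
    if (hasSignature(data, pos, avail, kExif)) return "Exif";
    if (hasSignature(data, pos, avail, kXmp)) return "XMP";
    if (hasSignature(data, pos, avail, kExtXmp)) return "ExtendedXMP";
    return "";
}

// Walk the segments between SOI and SOS and return one entry per segment.
std::vector<Segment> scanSegments(const std::vector<std::uint8_t>& data) {
    std::vector<Segment> out;
    if (data.size() < 2 || data[0] != 0xff || data[1] != 0xd8) return out;  // no SOI
    std::size_t pos = 2;
    while (pos + 4 <= data.size() && data[pos] == 0xff) {
        const std::uint8_t marker = data[pos + 1];
        if (marker == 0xda) break;  // SOS: compressed image data follows
        const std::size_t length = (std::size_t(data[pos + 2]) << 8) | data[pos + 3];
        if (length < 2 || pos + 2 + length > data.size()) break;  // truncated or corrupt
        Segment seg{pos, marker, length, ""};
        if (marker == 0xe1) seg.kind = classifyApp1(data, pos + 4, length - 2);
        out.push_back(seg);
        pos += 2 + length;  // 2 marker bytes plus the length-counted bytes
    }
    return out;
}
```

Counting the entries whose kind is "XMP" tells you whether a file has more than one APP1/xap segment.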

RE: Multiple XMP sections - Added by Goran Nagy about 5 years ago

Thanks for the reply. Here is one of the images.

RE: Multiple XMP sections - Added by Robin Mills about 5 years ago

Thank you for sending a sample image. Now we're on the same page. I don't believe this is a legal file. To answer your original question, "Can I get all XMP sections with Exiv2?": the answer is no. However, I have been able to extract the XMP with a little bit of cunning.

I've checked both the 2010 and 2016 editions of the Adobe XMP Embedding Spec. The 2010 edition talks about "The XMP/App1 segment". It doesn't explicitly rule out the possibility of more than one; however, it always speaks of it in the singular. The 2016 edition mentions that multiple APP1 segments may follow one another so that any data (e.g. Exif) can exceed 64k. It says that consecutive segments should have a length of exactly 64k.

Turning our attention to your file:

625 rmills@rmillsmbp:~/gnu/exiv2/trunk $ exiv2 -pS http://dev.exiv2.org/attachments/download/1070/64ofainklkbe8q8fphlza9613.jpg
STRUCTURE OF JPEG FILE: http://dev.exiv2.org/attachments/download/1070/64ofainklkbe8q8fphlza9613.jpg
 address | marker       |  length | data
       0 | 0xffd8 SOI  
       2 | 0xffe0 APP0  |      16 | JFIF............
      20 | 0xffdb DQT   |      67 
      89 | 0xffdb DQT   |      67 
     158 | 0xffc0 SOF0  |      17 
     177 | 0xffc4 DHT   |      28 
     207 | 0xffc4 DHT   |      74 
     283 | 0xffc4 DHT   |      28 
     313 | 0xffc4 DHT   |      57 
     372 | 0xffe1 APP1  |    4613 | http://ns.adobe.com/xap/1.0/.<?x
    4987 | 0xffe1 APP1  |    4602 | http://ns.adobe.com/xap/1.0/.<?x
    9591 | 0xffda SOS  
626 rmills@rmillsmbp:~/gnu/exiv2/trunk $ exiv2 -pa http://dev.exiv2.org/attachments/download/1070/64ofainklkbe8q8fphlza9613.jpg
Xmp.dc.subject                               XmpBag      1  Celebs: Jenna Marbles
Xmp.xmpMM.InstanceID                         XmpText    41  uuid:e61b1e9d-b8e4-6a17-a443-74b51849baaf
627 rmills@rmillsmbp:~/gnu/exiv2/trunk $ exiv2 -pX http://dev.exiv2.org/attachments/download/1070/64ofainklkbe8q8fphlza9613.jpg | xmllint --format -
-:57: parser error : Extra content at the end of the document
<?xpacket end='w'?><?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?><x:xmpme
                                                                        ^
628 rmills@rmillsmbp:~/gnu/exiv2/trunk $ 
The XMP extracted by -pX is garbled!

However, don't be discouraged. There is something that can be done with this. The output from -pS reveals the location of the XMP/xml in the image, and the utility dd can be used to extract it.

644 rmills@rmillsmbp:~/gnu/exiv2/trunk $ curl http://dev.exiv2.org/attachments/download/1070/64ofainklkbe8q8fphlza9613.jpg > girl.jpg
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 47323    0 47323    0     0   9208      0 --:--:--  0:00:05 --:--:--  350k
645 rmills@rmillsmbp:~/gnu/exiv2/trunk $ exiv2 -pS girl.jpg 
STRUCTURE OF JPEG FILE: girl.jpg
 address | marker       |  length | data
       0 | 0xffd8 SOI  
       2 | 0xffe0 APP0  |      16 | JFIF............
      20 | 0xffdb DQT   |      67 
      89 | 0xffdb DQT   |      67 
     158 | 0xffc0 SOF0  |      17 
     177 | 0xffc4 DHT   |      28 
     207 | 0xffc4 DHT   |      74 
     283 | 0xffc4 DHT   |      28 
     313 | 0xffc4 DHT   |      57 
     372 | 0xffe1 APP1  |    4613 | http://ns.adobe.com/xap/1.0/.<?x
    4987 | 0xffe1 APP1  |    4602 | http://ns.adobe.com/xap/1.0/.<?x
    9591 | 0xffda SOS  
646 rmills@rmillsmbp:~/gnu/exiv2/trunk $ dd bs=1 skip=$((372+31+2)) count=$((4613-31)) if=girl.jpg | xmllint --pretty 1 -
4582+0 records in
4582+0 records out
4582 bytes (4.6 kB) copied, 0.013842 s, 331 kB/s
<?xml version="1.0"?>
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="uuid:e61b1e9d-b8e4-6a17-a443-74b51849baaf">
      <dc:subject>
        <rdf:Bag>
          <rdf:li>Celebs: Jenna Marbles</rdf:li>
        </rdf:Bag>
      </dc:subject>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end='w'?>
647 rmills@rmillsmbp:~/gnu/exiv2/trunk $ dd bs=1 skip=$((4987+31+2)) count=$((4602-31)) if=girl.jpg | xmllint --pretty 1 -
4571+0 records in
4571+0 records out
4571 bytes (4.6 kB) copied, 0.010801 s, 423 kB/s
<?xml version="1.0"?>
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="uuid:18afef1a-58fe-1d49-12ff-2834cf32b7e6">
      <dc:subject>
        <rdf:Bag>
          <rdf:li>Celeb: Jenna Marbles</rdf:li>
        </rdf:Bag>
      </dc:subject>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end='w'?>
648 rmills@rmillsmbp:~/gnu/exiv2/trunk $ 
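The skip and count values passed to dd above follow from the segment layout: 2 marker bytes, a 2-byte length field (which the reported length already counts), and the 29-byte signature "http://ns.adobe.com/xap/1.0/" plus its NUL terminator. As a small sketch of that arithmetic (the names Range and xmpPacketRange are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMarkerBytes = 2;         // 0xff 0xe1
constexpr std::size_t kLengthBytes = 2;         // big-endian length field
constexpr std::size_t kXmpSignatureBytes = 29;  // "http://ns.adobe.com/xap/1.0/" + NUL

// Byte range of the XMP packet inside the file, in dd terms.
struct Range {
    std::size_t skip;   // offset of the first packet byte
    std::size_t count;  // number of packet bytes
};

// `address` and `length` are the values printed by `exiv2 -pS` for an APP1/xap segment.
Range xmpPacketRange(std::size_t address, std::size_t length) {
    // The length field counts itself and the signature, so it cannot be smaller.
    assert(length >= kLengthBytes + kXmpSignatureBytes);
    return {address + kMarkerBytes + kLengthBytes + kXmpSignatureBytes,
            length - kLengthBytes - kXmpSignatureBytes};
}
```

For the two segments in girl.jpg, xmpPacketRange(372, 4613) gives skip 405 and count 4582, and xmpPacketRange(4987, 4602) gives skip 5020 and count 4571, matching the dd runs above.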
I've also dumped the file girl.jpg with ReadingXMP which is part of the 2016 XMPsdk from Adobe.
662 rmills@rmillsmbp:~/gnu/xmpsdk/XMP-Toolkit-SDK-CC201607 $ samples/target/macintosh/intel_64/Release/ReadingXMP ~/gnu/exiv2/trunk/girl.jpg 

/Users/rmills/gnu/exiv2/trunk/girl.jpg is opened successfully
dc:subject[1] = Celeb: Jenna Marbles
dc:title in English = 
dc:title in French = 

XMP dumped to XMPDump.txt
663 rmills@rmillsmbp:~/gnu/xmpsdk/XMP-Toolkit-SDK-CC201607 $ 
To answer your question "Do we read the first or last APP1/xap segment?", I edited the XMP with a Hex Editor (change the first Jenna -> JEnna):
663 rmills@rmillsmbp:~/gnu/xmpsdk/XMP-Toolkit-SDK-CC201607 $ samples/target/macintosh/intel_64/Release/ReadingXMP ~/gnu/exiv2/trunk/girl.jpg 

/Users/rmills/gnu/exiv2/trunk/girl.jpg is opened successfully
dc:subject[1] = Celeb: Jenna Marbles
dc:title in English = 
dc:title in French = 

XMP dumped to XMPDump.txt
664 rmills@rmillsmbp:~/gnu/xmpsdk/XMP-Toolkit-SDK-CC201607 $ exiv2 -pa ~/gnu/exiv2/trunk/girl.jpg 
Xmp.dc.subject                               XmpBag      1  Celebs: JEnna Marbles
Xmp.xmpMM.InstanceID                         XmpText    41  uuid:e61b1e9d-b8e4-6a17-a443-74b51849baaf
665 rmills@rmillsmbp:~/gnu/xmpsdk/XMP-Toolkit-SDK-CC201607 $ 
ReadingXMP reads the second APP1/xap segment. exiv2 reads the first APP1/xap segment.

RE: Multiple XMP sections - Added by Robin Mills about 5 years ago

I've logged a bug #1229 about the 'garbled' XMP being returned by -pX. It's not really 'garbled': exiv2 is reporting all the APP1/xap segments consecutively, and the concatenation isn't valid XML. For consistency between -pX and -pa, I now report only the first segment (r4539).

I've taken the liberty of copying the file 'girl.jpg' to test/data/exiv2-bug1229.jpg and have modified the test suite appropriately.
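If you have output in which several packets were written back to back, one way to get at every packet is to cut the text at each packet header before handing the pieces to an XML parser. This is only a sketch under the assumption that every packet starts with "<?xpacket begin", as in the dumps earlier in this thread. splitXmpPackets is a made-up helper, not an Exiv2 function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split text holding several XMP packets back to back into one string per packet.
// Anything before the first packet header is dropped.
std::vector<std::string> splitXmpPackets(const std::string& text) {
    static const std::string kBegin = "<?xpacket begin";
    std::vector<std::string> packets;
    std::size_t start = text.find(kBegin);
    while (start != std::string::npos) {
        // Look for the next header, starting just past the current one.
        const std::size_t next = text.find(kBegin, start + kBegin.size());
        assert(next == std::string::npos || next > start);  // the loop always advances
        const std::size_t count =
            (next == std::string::npos) ? std::string::npos : next - start;
        packets.push_back(text.substr(start, count));
        start = next;
    }
    return packets;
}
```

Each returned string can then be parsed on its own, which avoids the "Extra content at the end of the document" error that xmllint reports for the concatenated form.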

RE: Multiple XMP sections - Added by Goran Nagy about 5 years ago

Thank you very much for your detailed response. Much appreciated.
