Multiple XMP sections
Added by Goran Nagy about 5 years ago
I was wondering if I can get all XMP sections with Exiv2.
Using:
image->readMetadata();
Exiv2::XmpData &xmpData = image->xmpData();
gives me only one XMP section (I'm not sure whether it is the first or the last). Can I somehow get all of them?
I am not talking about one section spread over multiple APP1 segments; some images have several small, separate XMP sections.
Replies (5)
RE: Multiple XMP sections - Added by Robin Mills about 5 years ago
I'm not sure that I understand what you mean by "one section". The version of XMPsdk embedded in Exiv2 does not support "Extended" XMP.
You can dump the structure of the JPG with the command:
$ exiv2 -pS path-to-image
You can extract the XMP with the command:
$ exiv2 -pX path-to-image > foo.xml
The code for -pX is independent of XMPsdk and extracts Extended XMP.
Normally there is only a single APP1/xmp segment in a JPG.
549 rmills@rmillsmbp:~/gnu/exiv2/trunk $ exiv2 -pS test/data/exiv2-bug1026.jpg
STRUCTURE OF JPEG FILE: test/data/exiv2-bug1026.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xffe0 APP0 | 16 | JFIF.....,.,....
20 | 0xffe1 APP1 | 12807 | Exif..MM.*......................
12829 | 0xffe1 APP1 | 3448 | http://ns.adobe.com/xap/1.0/.<?x
16279 | 0xffdb DQT | 67
16348 | 0xffdb DQT | 67
16417 | 0xffc2 SOF2 | 17
16436 | 0xffc4 DHT | 28
16466 | 0xffc4 DHT | 26
16494 | 0xffda SOS
550 rmills@rmillsmbp:~/gnu/exiv2/trunk $
With "Extended XMP", the signature of the additional segments is different:
STRUCTURE OF JPEG FILE: exiv2-bug922.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xffe1 APP1 | 911 | Exif..MM.*.......%.........#....
915 | 0xffe1 APP1 | 870 | http://ns.adobe.com/xap/1.0/.<x:
1787 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
67249 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
132711 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
198173 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
263635 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
329097 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
394559 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
460021 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
525483 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
590945 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
656407 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
721869 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
787331 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
852793 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
918255 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
983717 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
1049179 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
1114641 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
1180103 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
1245565 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
1311027 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
1376489 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
1441951 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
1507413 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
1572875 | 0xffe1 APP1 | 65460 | http://ns.adobe.com/xmp/extensio
1638337 | 0xffe1 APP1 | 42907 | http://ns.adobe.com/xmp/extensio
1681246 | 0xffe0 APP0 | 16 | JFIF............
1681264 | 0xffdb DQT | 67
1681333 | 0xffdb DQT | 67
1681402 | 0xffc0 SOF0 | 17
1681421 | 0xffc4 DHT | 31
1681454 | 0xffc4 DHT | 181
1681637 | 0xffc4 DHT | 31
1681670 | 0xffc4 DHT | 181
1681853 | 0xffda SOS
562 rmills@rmillsmbp:~/gnu/exiv2/trunk/test/data $
This file contains GImage data. Rather a lot of it.
563 rmills@rmillsmbp:~/gnu/exiv2/trunk/test/data $ exiv2 -pX exiv2-bug922.jpg | xmllint --pretty 2 - | wc
19 23 1677470
564 rmills@rmillsmbp:~/gnu/exiv2/trunk/test/data $
I'm guessing that you have a file that has multiple APP1/xap segments. I'm not sure that is permitted. Perhaps you could attach a sample file that I can examine.
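As a rough illustration of the signature difference visible in the two dumps above, here is a minimal sketch that classifies an APP1 payload by its leading bytes. The names `App1Kind`, `startsWith`, `classifyApp1` and `checkAgainstDumps` are made up for this example and are not part of Exiv2. For the Extended XMP case it compares only the 32 bytes that the -pS dump actually shows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical names for illustration only; not part of the Exiv2 API.
enum class App1Kind { Exif, Xmp, ExtendedXmp, Unknown };

// True when `payload` begins with the first `n` bytes of `sig`.
// The explicit length keeps embedded NUL bytes in the comparison.
static bool startsWith(const std::string& payload, const char* sig, std::size_t n) {
    return payload.size() >= n && payload.compare(0, n, sig, n) == 0;
}

// Classify the data of an APP1 segment (the bytes after the 2-byte length).
App1Kind classifyApp1(const std::string& payload) {
    if (startsWith(payload, "Exif\0\0", 6)) return App1Kind::Exif;
    // 28 characters plus the terminating NUL shown as '.' in the dump.
    if (startsWith(payload, "http://ns.adobe.com/xap/1.0/\0", 29)) return App1Kind::Xmp;
    // Only the 32 bytes visible in the dump are compared here.
    if (startsWith(payload, "http://ns.adobe.com/xmp/extensio", 32)) return App1Kind::ExtendedXmp;
    return App1Kind::Unknown;
}

// Sanity check against the first bytes printed in the -pS dumps.
void checkAgainstDumps() {
    assert(classifyApp1(std::string("Exif\0\0MM\0*", 10)) == App1Kind::Exif);
    assert(classifyApp1(std::string("http://ns.adobe.com/xap/1.0/\0<?x", 32)) == App1Kind::Xmp);
    assert(classifyApp1("http://ns.adobe.com/xmp/extensio") == App1Kind::ExtendedXmp);
    assert(classifyApp1("JFIF") == App1Kind::Unknown);
}
```

A file with more than one segment classified as `App1Kind::Xmp` would be the case I'm guessing at here.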
RE: Multiple XMP sections - Added by Goran Nagy about 5 years ago
Thanks for the reply. Here is one of the images.
RE: Multiple XMP sections - Added by Robin Mills about 5 years ago
Thank you for sending a sample image. Now we're on the same page. I don't believe this is a legal file. To answer your original question, "Can I get all XMP sections with Exiv2?": the answer is no. However, I have been able to extract the XMP with a little bit of cunning.
I've checked both the 2010 and 2016 editions of the Adobe XMP Embedding Spec. The 2010 edition talks about "The XMP/App1 segment". It doesn't explicitly rule out the possibility of more than one; however, it always refers to the segment in the singular. The 2016 edition mentions that multiple APP1 segments may follow one another so that any data (e.g. Exif) can exceed 64k. It says that consecutive segments should have a length of exactly 64k.
Turning our attention to your file:
625 rmills@rmillsmbp:~/gnu/exiv2/trunk $ exiv2 -pS http://dev.exiv2.org/attachments/download/1070/64ofainklkbe8q8fphlza9613.jpg
STRUCTURE OF JPEG FILE: http://dev.exiv2.org/attachments/download/1070/64ofainklkbe8q8fphlza9613.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xffe0 APP0 | 16 | JFIF............
20 | 0xffdb DQT | 67
89 | 0xffdb DQT | 67
158 | 0xffc0 SOF0 | 17
177 | 0xffc4 DHT | 28
207 | 0xffc4 DHT | 74
283 | 0xffc4 DHT | 28
313 | 0xffc4 DHT | 57
372 | 0xffe1 APP1 | 4613 | http://ns.adobe.com/xap/1.0/.<?x
4987 | 0xffe1 APP1 | 4602 | http://ns.adobe.com/xap/1.0/.<?x
9591 | 0xffda SOS
626 rmills@rmillsmbp:~/gnu/exiv2/trunk $ exiv2 -pa http://dev.exiv2.org/attachments/download/1070/64ofainklkbe8q8fphlza9613.jpg
Xmp.dc.subject XmpBag 1 Celebs: Jenna Marbles
Xmp.xmpMM.InstanceID XmpText 41 uuid:e61b1e9d-b8e4-6a17-a443-74b51849baaf
627 rmills@rmillsmbp:~/gnu/exiv2/trunk $ exiv2 -pX http://dev.exiv2.org/attachments/download/1070/64ofainklkbe8q8fphlza9613.jpg | xmllint --format -
-:57: parser error : Extra content at the end of the document
<?xpacket end='w'?><?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?><x:xmpme
^
628 rmills@rmillsmbp:~/gnu/exiv2/trunk $
The XMP extracted by -pX is garbled!
However, don't be discouraged. There is something that can be done with this. The output from -pS reveals the location of the XMP/xml in the image, and the utility dd can be used to extract it.
644 rmills@rmillsmbp:~/gnu/exiv2/trunk $ curl http://dev.exiv2.org/attachments/download/1070/64ofainklkbe8q8fphlza9613.jpg > girl.jpg
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 47323 0 47323 0 0 9208 0 --:--:-- 0:00:05 --:--:-- 350k
645 rmills@rmillsmbp:~/gnu/exiv2/trunk $ exiv2 -pS girl.jpg
STRUCTURE OF JPEG FILE: girl.jpg
address | marker | length | data
0 | 0xffd8 SOI
2 | 0xffe0 APP0 | 16 | JFIF............
20 | 0xffdb DQT | 67
89 | 0xffdb DQT | 67
158 | 0xffc0 SOF0 | 17
177 | 0xffc4 DHT | 28
207 | 0xffc4 DHT | 74
283 | 0xffc4 DHT | 28
313 | 0xffc4 DHT | 57
372 | 0xffe1 APP1 | 4613 | http://ns.adobe.com/xap/1.0/.<?x
4987 | 0xffe1 APP1 | 4602 | http://ns.adobe.com/xap/1.0/.<?x
9591 | 0xffda SOS
646 rmills@rmillsmbp:~/gnu/exiv2/trunk $ dd bs=1 skip=$((372+31+2)) count=$((4613-31)) if=girl.jpg | xmllint --pretty 1 -
4582+0 records in
4582+0 records out
4582 bytes (4.6 kB) copied, 0.013842 s, 331 kB/s
<?xml version="1.0"?>
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="uuid:e61b1e9d-b8e4-6a17-a443-74b51849baaf">
<dc:subject>
<rdf:Bag>
<rdf:li>Celebs: Jenna Marbles</rdf:li>
</rdf:Bag>
</dc:subject>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end='w'?>
647 rmills@rmillsmbp:~/gnu/exiv2/trunk $ dd bs=1 skip=$((4987+31+2)) count=$((4602-31)) if=girl.jpg | xmllint --pretty 1 -
4571+0 records in
4571+0 records out
4571 bytes (4.6 kB) copied, 0.010801 s, 423 kB/s
<?xml version="1.0"?>
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="uuid:18afef1a-58fe-1d49-12ff-2834cf32b7e6">
<dc:subject>
<rdf:Bag>
<rdf:li>Celeb: Jenna Marbles</rdf:li>
</rdf:Bag>
</dc:subject>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end='w'?>
648 rmills@rmillsmbp:~/gnu/exiv2/trunk $
I've also dumped the file girl.jpg with ReadingXMP, which is part of the 2016 XMPsdk from Adobe.
662 rmills@rmillsmbp:~/gnu/xmpsdk/XMP-Toolkit-SDK-CC201607 $ samples/target/macintosh/intel_64/Release/ReadingXMP ~/gnu/exiv2/trunk/girl.jpg
/Users/rmills/gnu/exiv2/trunk/girl.jpg is opened successfully
dc:subject[1] = Celeb: Jenna Marbles
dc:title in English =
dc:title in French =
XMP dumped to XMPDump.txt
663 rmills@rmillsmbp:~/gnu/xmpsdk/XMP-Toolkit-SDK-CC201607 $
To answer your question "Do we read the first or last APP1/xap segment?", I edited the XMP with a Hex Editor (change the first Jenna -> JEnna):
663 rmills@rmillsmbp:~/gnu/xmpsdk/XMP-Toolkit-SDK-CC201607 $ samples/target/macintosh/intel_64/Release/ReadingXMP ~/gnu/exiv2/trunk/girl.jpg
/Users/rmills/gnu/exiv2/trunk/girl.jpg is opened successfully
dc:subject[1] = Celeb: Jenna Marbles
dc:title in English =
dc:title in French =
XMP dumped to XMPDump.txt
664 rmills@rmillsmbp:~/gnu/xmpsdk/XMP-Toolkit-SDK-CC201607 $ exiv2 -pa ~/gnu/exiv2/trunk/girl.jpg
Xmp.dc.subject XmpBag 1 Celebs: JEnna Marbles
Xmp.xmpMM.InstanceID XmpText 41 uuid:e61b1e9d-b8e4-6a17-a443-74b51849baaf
665 rmills@rmillsmbp:~/gnu/xmpsdk/XMP-Toolkit-SDK-CC201607 $
ReadingXMP reads the second APP1/xap segment. exiv2 reads the first APP1/xap segment.
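If you would rather do the dd step in code, something along these lines might work. It is only a sketch using the C++ standard library, not an Exiv2 API: `XmpSpan` and `findXmpSegments` are names invented for this example, and it assumes the whole JPEG has already been read into a `std::string`. It applies the same arithmetic as the dd commands: the XML starts after the 2-byte marker, the 2-byte length and the 29-byte signature, and its size is the segment length minus 31.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type: where one XMP packet sits inside the file.
struct XmpSpan {
    std::size_t offset;  // offset of the first XML byte (the dd "skip" value)
    std::size_t length;  // number of XML bytes (the dd "count" value)
};

// Hypothetical helper: walk the marker segments of a JPEG held in memory and
// collect every APP1 segment that starts with the standard XMP signature.
// Simplification: the walk stops at SOS or at the first thing that is not a
// length-prefixed marker segment.
std::vector<XmpSpan> findXmpSegments(const std::string& jpeg) {
    // 28 characters plus the terminating NUL; together with the 2-byte
    // length field this is the 31 used in the dd commands.
    static const std::string sig("http://ns.adobe.com/xap/1.0/\0", 29);
    assert(sig.size() == 29);

    std::vector<XmpSpan> found;
    auto byteAt = [&jpeg](std::size_t i) {
        return static_cast<unsigned char>(jpeg[i]);
    };
    // A JPEG starts with SOI (0xffd8); give up if it is missing.
    if (jpeg.size() < 2 || byteAt(0) != 0xff || byteAt(1) != 0xd8) return found;

    std::size_t pos = 2;
    while (pos + 4 <= jpeg.size() && byteAt(pos) == 0xff) {
        const unsigned marker = byteAt(pos + 1);
        if (marker == 0xda) break;  // SOS: compressed image data follows
        // The big-endian length counts its own 2 bytes but not the marker.
        const std::size_t len =
            (static_cast<std::size_t>(byteAt(pos + 2)) << 8) | byteAt(pos + 3);
        if (len < 2 || pos + 2 + len > jpeg.size()) break;  // malformed segment
        if (marker == 0xe1 && len >= 2 + sig.size() &&
            jpeg.compare(pos + 4, sig.size(), sig) == 0) {
            found.push_back({pos + 4 + sig.size(), len - 2 - sig.size()});
        }
        pos += 2 + len;  // jump to the next marker
    }
    return found;
}
```

For the segment at address 372 with length 4613, this arithmetic gives offset 372+4+29 = 405 and length 4613-31 = 4582, the same values as in the first dd command. Each span can then be cut out with `jpeg.substr(span.offset, span.length)` and parsed separately.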
RE: Multiple XMP sections - Added by Robin Mills about 5 years ago
I've logged bug #1229 about the 'garbled' XMP being returned by -pX. It's not really 'garbled': exiv2 is reporting all the APP1/xap segments consecutively, and the result isn't valid XML. For consistency between -pX and -pa, I now only report the first segment (r4539).
I've taken the liberty of copying the file 'girl.jpg' to test/data/exiv2-bug1229.jpg and have modified the test suite appropriately.
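If you are stuck with output that contains several XMP packets back to back (as -pX printed for this file before the change), a small splitter along the following lines might help. This is a sketch with the standard library only; `splitXmpPackets` is a name invented for this example, and it assumes every packet begins with the text "<?xpacket begin", as both packets in this file do.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a buffer holding several consecutive XMP packets
// into one string per packet. Assumes each packet starts with the text
// "<?xpacket begin"; anything before the first such header is dropped.
std::vector<std::string> splitXmpPackets(const std::string& xml) {
    static const std::string header = "<?xpacket begin";
    std::vector<std::string> packets;
    std::size_t start = xml.find(header);
    while (start != std::string::npos) {
        // The next header, if any, marks the end of the current packet.
        const std::size_t next = xml.find(header, start + header.size());
        const std::size_t count =
            (next == std::string::npos) ? std::string::npos : next - start;
        packets.push_back(xml.substr(start, count));
        // Invariant: every stored packet begins with the header.
        assert(packets.back().compare(0, header.size(), header) == 0);
        start = next;
    }
    return packets;
}
```

Each element of the result should then be well-formed on its own and can be passed to an XML parser separately.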
RE: Multiple XMP sections - Added by Goran Nagy about 5 years ago
Thank you very much for your detailed response. Much appreciated.