keywords added by exiv2 not found on getty image
I am using exiv2 v 0.23 on Ubuntu.
I export images using Lightroom, with keywords in the IPTC metadata.
I then use exiv2 to add keywords.
I then upload my images to an online website, http://www.gettyimages.fr/ . The website sees the keywords added by Lightroom, but not the keywords added by exiv2. I can see the keywords added by exiv2 in the windows or linux file explorer, so I know they have been properly embedded in the image.
Another related problem: if I change the Iptc.Application2.Headline field, gettyimages still retrieves the old headline, as if it were still stored somewhere! (Again, I can see the change myself with the command exiv2 -P I.)
What might be causing this website to be "blind" to the IPTC changes made to the files?
Thanks for getting in touch. I cannot give a simple answer as you're discussing Exiv2, Lightroom, gettyimages.fr, windows and linux at the same time.
From your description, the culprit sounds like gettyimages. In the case of Iptc.Application2.Headline it sounds as though gettyimages is caching the metadata and you are retrieving "stale" data.
I think you'll have to investigate in a step-by-step manner. For example, copy an image to A.jpg, B.jpg and C.jpg and put different data into Iptc.Application2.Headline for each image before uploading them. I'm not saying that gettyimages is to blame - the truth is I don't know what's wrong. If you'd like to send me a couple of your images and more explanation, I'll look at your files to see if there is anything unusual about the metadata.
Thanks for the reply Robin.
I also think gettyimages is the culprit as I upload to many other sites without problem, but it is unlikely they will change their website so maybe there's a trick I can apply on my side. And some behavior is quite surprising: where would this stale data be stored, as I delete it with exiv2 first?
Anyway, let's forget about this Headline problem for now, the more urgent problem is with the keywords.
Please find attached a file that was exported with the keywords "keyword1" and "keyword2" in Lightroom. I then added "keyword3" with exiv2.
When I upload this image, gettyimage only sees keyword1 and keyword2.
test.jpg (334 KB)
Thanks for your test file. I don't see anything odd. There's a debugging feature in the latest Exiv2:
531 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2 $ exiv2 -pS ~/Downloads/test.jpg
STRUCTURE OF JPEG FILE: /Users/rmills/Downloads/test.jpg
 address | marker       |  length | data
       0 | 0xffd8 SOI
       2 | 0xffe0 APP0  |      16 | JFIF.....H.H....
      20 | 0xffe1 APP1  |   12412 | Exif..MM.*.................z....
   12434 | 0xffe1 APP1  |    8453 | http://ns.adobe.com/xap/1.0/.<?x
   20889 | 0xffed APP13 |   16418 | Photoshop 3.0.8BIM.........H....  <----- Here's your IPTC data
   37309 | 0xffe2 APP2  |    3160 | ICC_PROFILE......HLino....mntrRG chunk 1/1
   40471 | 0xffdb DQT   |      67
   40540 | 0xffdb DQT   |      67
   40609 | 0xffc0 SOF0  |      17
   40628 | 0xffc4 DHT   |      31
   40661 | 0xffc4 DHT   |     181
   40844 | 0xffc4 DHT   |      31
   40877 | 0xffc4 DHT   |     181
   41060 | 0xffda SOS
532 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2 $

And I can drill down into that with:

   20889 | 0xffed APP13 |   16418 | Photoshop 3.0.8BIM.........H....
  Record | DataSet | Name             | Length | Data
       1 |      90 | CharacterSet     |      3 | .%G
       2 |       0 | RecordVersion    |      2 | ..
       2 |      55 | DateCreated      |      8 | 20171009
       2 |      60 | TimeCreated      |     11 | 180508+0000
       2 |      62 | DigitizationDate |      8 | 20171009
       2 |      63 | DigitizationTime |     11 | 180508+0000
       2 |     105 | Headline         |     15 | Test Headline 1
       2 |      25 | Keywords         |      8 | keyword1
       2 |      25 | Keywords         |      8 | keyword2
       2 |      25 | Keywords         |      8 | keyword3

Everything looks fine. I can't see any reason why gettyimages would see the Lightroom metadata and ignore "keyword3". However there will be an explanation. Perhaps gettyimages have a limit of 2 keywords! That is unlikely, however it has to be something!
As for your question about the "stale data": I suspect it's on the gettyimages server. When you upload a file, they extract the metadata and store it somewhere. Neither you nor exiv2 can change that. Perhaps you can delete the image on gettyimages and upload it again. I agree with you that it's unlikely that gettyimages will change their code/web-site for you. However, if you can raise a ticket with them (and reference this thread), I'll be happy to work with them if they wish to discuss this.
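For anyone without a recent Exiv2 build, the two dumps above can be roughly approximated in a few lines of Python. This is only a sketch, not how Exiv2 does it: the function names (jpeg_segments, iptc_datasets, keywords) are made up for illustration, it assumes the IPTC data sits in a Photoshop "8BIM" resource with ID 0x0404 inside APP13 (as in test.jpg), and it ignores extended-length datasets.

```python
import struct

def jpeg_segments(data: bytes):
    """Yield (offset, marker, payload) for each segment up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        # The length field counts its own 2 bytes but not the 2 marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield pos, marker, data[pos + 4:pos + 2 + length]
        if marker == 0xDA:  # SOS: compressed image data follows, so stop
            break
        pos += 2 + length

def iptc_datasets(app13: bytes):
    """Yield (record, dataset, value) from the IPTC resource of an APP13 payload."""
    if not app13.startswith(b"Photoshop 3.0\x00"):
        return
    pos = 14  # length of the "Photoshop 3.0\0" signature
    while pos + 12 <= len(app13) and app13[pos:pos + 4] == b"8BIM":
        (res_id,) = struct.unpack(">H", app13[pos + 4:pos + 6])
        name_len = app13[pos + 6]
        # Resource name is a Pascal string padded to an even number of bytes.
        pos += 6 + ((1 + name_len + 1) & ~1)
        (size,) = struct.unpack(">I", app13[pos:pos + 4])
        body = app13[pos + 4:pos + 4 + size]
        pos += 4 + size + (size & 1)  # resource data is padded to even size
        if res_id != 0x0404:  # 0x0404 is the IPTC-NAA resource
            continue
        i = 0
        while i + 5 <= len(body) and body[i] == 0x1C:
            record, dataset, n = struct.unpack(">BBH", body[i + 1:i + 5])
            yield record, dataset, body[i + 5:i + 5 + n]
            i += 5 + n

def keywords(data: bytes):
    """Return all IPTC keywords (record 2, dataset 25) found in APP13 segments."""
    return [value.decode("utf-8", "replace")
            for _, marker, payload in jpeg_segments(data) if marker == 0xED
            for record, dataset, value in iptc_datasets(payload)
            if (record, dataset) == (2, 25)]
```

Calling keywords(open("test.jpg", "rb").read()) on the attached file should list the same Keywords entries as the drill-down table, which is a quick way to confirm that a keyword written by exiv2 really is in the IPTC block.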
I've already opened a ticket, I'll keep you updated.
It's not a matter of number of keywords, I've tried with different numbers.
About the stale data, it's even weirder than you think. A file that gettyimage has NEVER seen (and so can't cache data about it), exported with the headline "first" in Lightroom, headline that has been erased and then set to "second" in exiv2... gettyimage will still report "first" as the headline!
I attach this file, can you find the headline "first" stored anywhere in it? This might tell us where gettyimage gets its metadata from (maybe not IPTC, but then where?).
DSC05195 (Medium).jpg (309 KB)
The "first" is in the Xmp metadata (probably inserted by lightroom).
534 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2 $ exiv2 -pa ~/Downloads/DSC05195\ \(Medium\).jpg | grep -i first
Xmp.photoshop.Headline    XmpText    5  first
535 rmills@rmillsmbp:~/gnu/github/exiv2/exiv2 $
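The same check can be done without exiv2 by pulling the XMP packet out of the JPEG and looking for photoshop:Headline. The Python below is a rough sketch under a few assumptions: the names xmp_packet and xmp_headline are invented for this example, the packet is assumed to live in a single APP1 segment that starts with the standard "http://ns.adobe.com/xap/1.0/" header, and since the exact way Lightroom serializes the property is not known here, both the attribute form and the element form are checked.

```python
import xml.etree.ElementTree as ET

XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"
PHOTOSHOP_NS = "http://ns.adobe.com/photoshop/1.0/"

def xmp_packet(data: bytes):
    """Return the raw XMP packet from a JPEG's APP1 segment, or None if absent."""
    pos = 2  # skip the SOI marker
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        # The length field counts its own 2 bytes but not the 2 marker bytes.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(XMP_HEADER):
            return payload[len(XMP_HEADER):]
        if marker == 0xDA:  # SOS: no more metadata segments after this
            break
        pos += 2 + length
    return None

def xmp_headline(data: bytes):
    """Return the value of photoshop:Headline from the XMP packet, or None."""
    packet = xmp_packet(data)
    if packet is None:
        return None
    root = ET.fromstring(packet)
    key = f"{{{PHOTOSHOP_NS}}}Headline"
    for elem in root.iter():
        if key in elem.attrib:  # attribute form on rdf:Description
            return elem.attrib[key]
        if elem.tag == key:     # element form: <photoshop:Headline>...</...>
            return elem.text
    return None
```

If xmp_headline() still returns the old value after Iptc.Application2.Headline has been changed, the file carries two copies of the headline, and a reader that prefers XMP will keep showing the old one.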
OK, so that was the problem: gettyimages is reading metadata from XMP and not IPTC. They are the only ones doing that, which got me confused.
That's good news, thanks for letting me know. If you discover something else about this, update this topic and I'll respond. Thank You for using Exiv2.