IptcData not utf8 ?

Added by G K about 9 years ago

Hello,
How do I always get the IptcData as a UTF-8 string?
Thanks :)


Replies (5)

RE: IptcData not utf8 ? - Added by Robin Mills about 9 years ago

G K

I'd like to help you; however, you'll have to explain more about your expectations. We recently had a discussion with Shawn, one of our contributors from China (http://dev.exiv2.org/issues/848). May I ask you to read that thread, please?

If you have specific test input/output and/or image files for which you believe exiv2 is not performing correctly/adequately, please document your concerns and they will be investigated.

Robin

RE: IptcData not utf8 ? - Added by G K about 9 years ago

Well, I do not know whether this is wrong behavior in exiv2; I rather think not. I read somewhere that IPTC data is always read using the encoding it was saved with, or something like that, and not as UTF-8 by default.
Take the image xmp.jpg, for example. It has IPTC data written in some unknown character set. If I read the IPTC data with

Exiv2::IptcData &iptcData = image->iptcData();
Exiv2::IptcData::const_iterator iptcEnd = iptcData.end();
for (Exiv2::IptcData::const_iterator i = iptcData.begin(); i != iptcEnd; ++i) {
    // Raw bytes of the dataset, returned without any character set conversion
    std::string value = i->value().toString();
    // ... use value ...
}

Then "value" (for the IPTC data in this image) is valid UTF-8 only if it contains no special characters.
Therefore I cannot blindly pass the value to another function that accepts only valid UTF-8.
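
One way to avoid passing the value on blindly is to check it first. The sketch below is a plain C++ well-formedness check with no exiv2 dependency; the name isValidUtf8 is hypothetical and not part of the exiv2 API. It can only tell whether the bytes are well-formed UTF-8, not which character set they were actually written in.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an exiv2 function).
// Returns true if `s` is well-formed UTF-8: rejects overlong forms,
// UTF-16 surrogates, and code points above U+10FFFF.
bool isValidUtf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len;    // total bytes in this sequence
        unsigned long cp;   // decoded code point
        unsigned long min;  // smallest code point allowed for this length
        if (c < 0x80)                { len = 1; cp = c;        min = 0; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte
        if (len > n - i) return false;  // sequence truncated at end of string
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF) return false;     // overlong or out of range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        i += len;
    }
    return true;
}
```

In the loop above, the result of i->value().toString() could be passed to isValidUtf8 before it is handed to code that requires UTF-8.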

blob (157 KB) blob

RE: IptcData not utf8 ? - Added by Robin Mills about 9 years ago

G K

Thanks for updating this. And thank you for the code in topic 1289: "Exif deletes too many tags??". I'll have a look at both of those matters on Sunday afternoon (2012-11-04).

Robin

RE: IptcData not utf8 ? - Added by Robin Mills about 9 years ago

I'd like to help you; however, I don't know much about this. As I explained to Shawn, being Scottish (and a native English speaker), I'm rather challenged by non-ASCII character sets.

Last weekend, a photo IMG_2257.jpg was taken and I added a Caption in Picasa. You can see that Picasa added the tag "Iptc.Application2.Caption".

C:\Users\rmills\Desktop>exiv2 -pi IMG_2257.jpg
Iptc.Envelope.ModelVersion                   Short       1  4
Iptc.Envelope.CharacterSet                   String      3  ←%G
Iptc.Application2.RecordVersion              Short       1  4
Iptc.Application2.Caption                    String     24  Robin and Jim Recovering

I've scaled the image, and copied the Iptc data with the metacopy sample application.

C:\Users\rmills\Desktop>pscale
Syntax PScale <infilename> <outfilename> <scale>

C:\Users\rmills\Desktop>pscale IMG_2257.jpg robin.jpg 0.1

C:\Users\rmills\Desktop>\gnu.test\exiv2\msvc\bin\Release\Metacopy
Metacopy: Read and write files must be specified

Reads and writes raw metadata. Use -h option for help.
Usage: Metacopy [-iecaph] readfile writefile

C:\Users\rmills\Desktop>\gnu.test\exiv2\msvc\bin\Release\Metacopy -i IMG_2257.jpg robin.jpg

C:\Users\rmills\Desktop>exiv2 -pi robin.jpg
Iptc.Envelope.ModelVersion                   Short       1  4
Iptc.Envelope.CharacterSet                   String      3  ←%G
Iptc.Application2.RecordVersion              Short       1  4
Iptc.Application2.Caption                    String     24  Robin and Jim Recovering
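
The three-byte CharacterSet value printed as "←%G" in the listings above is the ISO 2022 escape sequence ESC % G, which declares UTF-8; the console is showing the ESC byte (0x1B) as an arrow. A minimal sketch of testing for that declaration follows. The name declaresUtf8 is hypothetical, and it assumes the caller has already read the raw value of Iptc.Envelope.CharacterSet into a std::string. The declaration only states what the writing application claimed, so the other datasets may still need checking.

```cpp
#include <string>

// Hypothetical helper (not an exiv2 function).
// Returns true if the raw Iptc.Envelope.CharacterSet value is exactly
// ESC % G, the ISO 2022 escape sequence that selects UTF-8.
bool declaresUtf8(const std::string& characterSet) {
    return characterSet == "\x1B%G";
}
```

If the dataset is absent or holds any other value, the encoding of the string datasets has to be learned some other way.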

The IPTC data is a dictionary of key/value pairs. In the case of the Caption, it has been added as a string of 24 ASCII bytes.

0        1         2
12345678901234567890123456789
Robin and Jim Recovering

However, from the earlier thread with Shawn in China, I thought we concluded that although it is labelled "String", Iptc.Application2.Caption is simply 24 binary bytes. The bytes could be ASCII, UTF-8 or any other character set. So you can call toString() and hope for the best; however, if the bytes are in some other encoding, you'll be out of luck. You'll need to do the C++/library equivalent of running iconv:
577 rmills@rmills-laptop:~/Desktop $ iconv -?
Usage: iconv [-c] [-s] [-f fromcode] [-t tocode] [file ...]
or:    iconv -l
Try `iconv --help' for more information.
578 rmills@rmills-laptop:~/Desktop $
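
A rough sketch of that library equivalent, using the POSIX iconv(3) interface, is shown below. The function name toUtf8 is hypothetical, and the sketch assumes the caller already knows the source character set; nothing here detects it. On some platforms the program must also be linked against a separate libiconv.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert `input` from the character set named by
// `from` (for example "ISO-8859-1") to UTF-8 using POSIX iconv(3).
std::string toUtf8(const std::string& input, const char* from) {
    iconv_t cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)(-1)) {
        throw std::runtime_error("iconv_open: unsupported conversion");
    }

    std::string out;
    char buf[256];
    // iconv() takes a non-const pointer but does not modify the input bytes.
    char* src = const_cast<char*>(input.data());
    std::size_t srcLeft = input.size();

    while (srcLeft > 0) {
        char* dst = buf;
        std::size_t dstLeft = sizeof(buf);
        const std::size_t rc = iconv(cd, &src, &srcLeft, &dst, &dstLeft);
        out.append(buf, sizeof(buf) - dstLeft);  // keep whatever was converted
        // E2BIG only means the output buffer filled up; loop again for the rest.
        if (rc == (std::size_t)(-1) && errno != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv: input not valid in source character set");
        }
    }
    iconv_close(cd);
    return out;
}
```

For example, toUtf8(value, "ISO-8859-1") would convert a Latin-1 caption; the character set name here is purely illustrative and has to match whatever the writing application actually used.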

I hope the photo of me in my kilt cooling off after running 10k will please you, even though I can't give you a solid answer to your question.

Robin

robin.jpg (151 KB) robin.jpg