Bug #848
Comments and copyright are output as ASCII, but are always written in UTF-8 format
Description
Although it works fine with English, it's truly a problem with other languages.
History
Updated by Robin Mills about 9 years ago
Shawn
I'm Scottish and a native English speaker. Working with UTF-8 and other character sets is a mystery to me. I tried this on Ubuntu 12.04 (bash 4.2.24):
$ echo $'>\xE2\x98\xA0<'
>☠<
$ exiv2 -M$'set Exif.Photo.UserComment >\xE2\x98\xA0<' ~/R.jpg
$ exiv2 -pa ~/R.jpg | grep Comment
Exif.Photo.UserComment Undefined 13 >☠<
Doing the same and filtering the output with od -h (to avoid browser UTF issues):
$ echo -n $'>\xE2\x98\xA0<' | od -h
0000000 e23e a098 003c
0000005
$ exiv2 -M$'set Exif.Photo.UserComment >\xE2\x98\xA0<' ~/R.jpg ; exiv2 -pa -g Comm ~/R.jpg | od -h
0000000 7845 6669 502e 6f68 6f74 552e 6573 4372
0000020 6d6f 656d 746e 2020 2020 2020 2020 2020
0000040 2020 2020 2020 2020 2020 2020 4120 6373
0000060 6969 2020 2020 2020 3620 2020 e23e a098
0000100 0a3c
0000102
I'm using the version of exiv2 in the trunk in which the option -g specifies any substring of the name of a tag.
It certainly looks as though the UserComment is simply binary and will store almost anything you give it.
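To make the od -h output easier to read, here is a minimal Python sketch that reproduces how od -h displays the UTF-8 bytes of the test string. It assumes a little-endian machine (where od -h prints each byte pair as a 16-bit word with the bytes swapped); the helper name `od_h_words` is made up for this illustration and is not part of exiv2.

```python
def od_h_words(data):
    """Format bytes the way `od -h` shows them: 16-bit little-endian hex words."""
    if len(data) % 2:
        data += b"\x00"  # od pads an odd-length input with a zero byte
    return [
        format(int.from_bytes(data[i:i + 2], "little"), "04x")
        for i in range(0, len(data), 2)
    ]


# '>' + U+2620 (skull and crossbones) + '<', encoded as UTF-8
utf8 = ">\u2620<".encode("utf-8")

print(" ".join(format(b, "02x") for b in utf8))  # 3e e2 98 a0 3c
print(" ".join(od_h_words(utf8)))                # e23e a098 003c
```

The second line matches the od -h output for the echo command, and the same words (e23e a098) appear at the end of the exiv2 dump, which is consistent with the UTF-8 bytes being stored and printed unchanged.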
Looking in the man page, I see the following example:
exiv2 -M"set Exif.Photo.UserComment charset=Ascii New Exif comment" image.jpg
Sets the Exif comment to an ASCII string.
It appears that you can use alternative character sets if you wish, although I can't explain how to use this feature. I know you're a very good engineer and perhaps you can read the code and tell us all!
Robin
Updated by Andreas Huggel about 9 years ago
I don't understand the problem description. Shawn, can you please elaborate, preferably with a small program or a sample use of the exiv2 command-line tool, showing what you're doing in detail and what is going wrong?
Updated by Shawn Jean about 9 years ago
I should have given more detail. I've been a bit busy these days, sorry about that.
Indeed, it should not be called a bug. I ran the same test as Robin several times. Writing and reading with exiv2 turns out to be completely correct.
X.Jing@XJing-PC ~/exiv2/msvc64/bin/Win32/Debug
$ exiv2 -M 'set Exif.Photo.UserComment This is 中文测试' test.jpg && exiv2 -pa test.jpg | grep Comment
Exif.Photo.UserComment Undefined 24 This is 中文测试
$ echo 'this is 中文测试' | od -h
0000000 6874 7369 6920 2073 d0d6 c4ce e2b2 d4ca
0000020 000a
0000021
It's all right.
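As a cross-check on the od -h dump above, the following Python sketch rebuilds the byte stream and tries two decodings. It assumes od's 16-bit little-endian word view; the helper name `words_to_bytes` is hypothetical. With these particular bytes, UTF-8 decoding fails while GBK decoding succeeds, which suggests (an inference from the dump, not something stated in the thread) that the shell on this machine was passing GBK rather than UTF-8.

```python
def words_to_bytes(words, length):
    """Rebuild the byte stream from `od -h` style 16-bit little-endian words."""
    raw = b"".join(int(w, 16).to_bytes(2, "little") for w in words)
    return raw[:length]  # drop the zero byte od adds to an odd-length input


words = "6874 7369 6920 2073 d0d6 c4ce e2b2 d4ca 000a".split()
data = words_to_bytes(words, 0o21)  # final od offset 0000021 (octal) = 17 bytes

try:
    data.decode("utf-8")
    is_utf8 = True
except UnicodeDecodeError:
    is_utf8 = False

print(is_utf8)                   # False
print(repr(data.decode("gbk")))  # 'this is 中文测试\n'
```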
But when I use other software, such as Windows Explorer, to write the string, it turns out to be:
$ exiv2 -pa test.jpg | grep Comment
Exif.Image.XPComment Byte 26 this is 涓枃娴嬭瘯
This is because Windows Explorer writes the string in Unicode.
HEX:
74 00 68 00 69 00 73 00 20 00 2D 4E 87 65 4B 6D D5 8B
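The hex bytes above can be checked with a short Python sketch. It assumes the bytes are UTF-16 in little-endian order, which is what the `74 00 68 00` pattern suggests, and decodes exactly the bytes as posted.

```python
# Hex bytes exactly as posted in the comment above.
raw = bytes.fromhex("74 00 68 00 69 00 73 00 20 00 2D 4E 87 65 4B 6D D5 8B")

# Assumption: 16-bit code units, least significant byte first (UTF-16LE).
text = raw.decode("utf-16-le")
print(text)  # this 中文测试

# The same text re-encoded as UTF-8 has a different length and byte layout.
utf8 = text.encode("utf-8")
print(len(raw), len(utf8))  # 18 17
print(" ".join(format(b, "02x") for b in utf8))
```

A tool that expects one encoding and is handed the other will show garbled characters, even though the stored text itself is intact.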
;-) Guys, weird, right? I don't know whether I made it clear. Glad to answer any questions.
Updated by Robin Mills about 9 years ago
- Category set to metadata
- Status changed from New to Resolved
- Assignee set to Robin Mills
Shawn
I think this has something to do with Windows Explorer converting the user-entered string to UTF-16 (or something). You may be able to use iconv to print the string correctly:
$ exiv2 -pa test.jpg | grep Comment | iconv -f UTF-16LE -t UTF-8
However, my old/Scottish eyes (and brain) have never understood bamboo characters!
I'm going to update this issue to "Resolved". You may reopen and/or assign it to me if you have additional information.
Robin