Textcoding UCS2 Exif.Image.XP

Added by Ge Bu over 11 years ago

The Exif.Image.XP keys (Author, Keywords, ...) use UCS2 text encoding. Is this handled in exiv2? Is there a switch to set it? Is it possible to use non-ASCII characters such as š, ž, ш, ч, ...?


Replies (8)

RE: Textcoding UCS2 Exif.Image.XP - Added by Robin Mills over 11 years ago

Ge

I believe exiv2 will handle Unicode. However, I'm going to leave Andreas to answer this (and your other question). I'm the Windows (and Mac) build engineer for the project. Andreas lives in South East Asia and I'm sure he'll give you a reply when he gets out of bed.

RE: Textcoding UCS2 Exif.Image.XP - Added by Ge Bu over 11 years ago

I think that exiv2 can correctly read UCS2 in these keys, because I can see (display, print) what I set in Windows with Windows Explorer. But I'm not able to write them correctly.

RE: Textcoding UCS2 Exif.Image.XP - Added by Andreas Huggel over 11 years ago

Ge Bu wrote:

I think that exiv2 can correctly read UCS2 in these keys, because I can see (display, print) what I set in Windows with Windows Explorer. But I'm not able to write them correctly.

That's exactly how it is. There is a pretty-print function to convert the UCS2 text in these fields to Unicode. But there is no help when it comes to writing the fields with the command-line exiv2 tool. Since the default type is Byte, it expects a series of numbers in the set command, which is admittedly not very user-friendly. You could of course write your own program and convert the text to UCS2 before setting these tag values. I can point you to some existing library code that may serve as an example if you like.

Andreas
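
A minimal sketch of the conversion Andreas describes, assuming that "UCS2" here means UTF-16 little-endian followed by a two-byte NUL terminator. That assumption matches the byte pattern reported further down this thread, but it is not confirmed by the exiv2 documentation here. The helper name xp_tag_bytes is hypothetical and is not part of exiv2.

```python
def xp_tag_bytes(text: str) -> str:
    """Return `text` as space-separated decimal byte values.

    Assumption: the XP tags hold UTF-16LE text ended by two zero bytes.
    """
    raw = text.encode("utf-16-le") + b"\x00\x00"
    return " ".join(str(b) for b in raw)


print(xp_tag_bytes("Hi"))  # → 72 0 105 0 0 0
print(xp_tag_bytes("š"))   # → 97 1 0 0
```

The resulting numbers are the kind of series the set command expects for a Byte value, for example something along the lines of exiv2 -M"set Exif.Image.XPAuthor Byte 72 0 105 0 0 0" image.jpg. Treat that exact command line as an assumption and check it against the exiv2 manual before relying on it.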

RE: Textcoding UCS2 Exif.Image.XP - Added by Ge Bu over 11 years ago

Thanks for the answer, but I don't think I'm able to code it :(.
It is possible to write these tags with Windows Explorer.
I tried exiv2 because it can manage tags (Exif, IPTC, XMP) in one file and apply them to all pictures. I use only author and keywords. Many other tools are too complicated.
Now I have found Zoner Media Explorer, where I can manage these things too through one user-friendly dialogue. I will use it. But this software clears the Exif.Image.XP tags :(. It works under Linux through wine.

RE: Textcoding UCS2 Exif.Image.XP - Added by Steve Wright over 11 years ago

As I noted in my reply in Ge Bu's other thread on this topic, the UCS2 Windows wants is actually composed of the corresponding ASCII code numbers for each character in your string, interspersed with zeros and spaces, and trailed off by three zeros (which tells Windows where the end of the string is, I presume). Believe me, I Googled under every rock and around every tree in the Unicode forest before it occurred to me to start from the other end: write something into a file using Explorer, then parse the unformatted string using Exiv2 on the command line, and try to match the numbers with some known code pattern. It turned out it was ASCII. But ASCII without spaces produced garbage text in the Properties > Summary window for whichever XP tag you care to name, so I went back to my Exiv2 output and noticed there were zeros and spaces in between the ASCII numbers XP had produced. I re-formatted a string to suit, pasted it into an exiv2 -M"set Exif.Image.XPComment" command, and it worked: no garbage text and the full string in plain alphabetical English (or Latin1 if you prefer).

The rest of my investigations are in the other thread.

Steve Wright

Attached you will find, as a text file in a .zip archive, the still-buggy BASH script I mentioned in passing in the other thread.
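
The pattern Steve reports is what one would expect if the numbers are UTF-16LE bytes: each ASCII letter appears as its code followed by 0, and the three trailing zeros would be the last letter's high byte plus a two-byte terminator. The sketch below decodes such a number list back into text under that assumption. The function name xp_tag_text is hypothetical, and this is not the attached BASH script.

```python
def xp_tag_text(byte_list: str) -> str:
    """Decode space-separated decimal byte values into text.

    Assumption: the bytes are UTF-16LE with trailing NUL padding.
    """
    raw = bytes(int(token) for token in byte_list.split())
    # Decode first, then drop the NUL terminator character(s).
    return raw.decode("utf-16-le").rstrip("\x00")


print(xp_tag_text("72 0 105 0 0 0"))  # → Hi
print(xp_tag_text("72 4 0 0"))        # → ш
```

Read this way, the second byte of each pair is not always zero: for a Cyrillic letter such as ш it is 4, which is why a plain ASCII-plus-zeros scheme would only cover English text.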

RE: Textcoding UCS2 Exif.Image.XP - Added by Robin Mills over 11 years ago

Steve

Thanks for doing this detective work. I must admit to two thoughts, which totally contradict each other. One is skepticism at what you've said and the other (like most native English speakers) is a phobia for almost anything to do with Unicode! Maybe one day I'll overcome my fears, read your script and do a little bit of googling and thinking, and then maybe this will all make sense. Until then, however, I'll take your word for this and thank you for taking the time to come back and report your findings.

Thank you very much.

Robin

RE: Textcoding UCS2 Exif.Image.XP - Added by Steve Wright over 11 years ago

A sharp eye often catches things it can't explain or adequately describe.

Robin: maybe you can tell me how it is that the C++ library files for iconv that Andreas worked into the tag-parsing code in Exiv2 have CS1259 text encoding in them, while the executable for BASH and other shells definitely doesn't. This is one of the things that reminds me of how much time I wasted learning CLI BASIC and AppleScript in Classic Mac (the former on the Apple IIes at my high school as well) instead of C++ at the very least.

If the binary could be re-compiled to include CS1259, which every indication tells me is the encoding M$ uses for storing these "XP tags" in files, dead-on and full-stop, then a script to "translate" plain text to the ASCII codes needed would, likely as not as I see it, need to involve the specific and peculiar formatting that Windows requires when writing these tags via Exiv2.

Evidently Mr Harvey's "competitive" exif/iptc editor, ExifTool, does the "translation" on the fly. I'd like to think it an oversight on Andreas' part not to wrap one into his project, but if he had, then this thread might never have been necessary. It's nice to find things out about folks, however it might happen, even if the inspiration is a shared head-scratcher like the "Byte" field type for the XP tags.

Steve Wright

RE: Textcoding UCS2 Exif.Image.XP - Added by Robin Mills over 11 years ago

Goodness, I don't know - I've already confessed my phobia for Unicode. However, you've thrown down the gauntlet and I think I'll do a little digging about on this. I'm off running this weekend - so it's likely to be one night next week after work. I've often wondered "how do you use Unicode from bash/terminal?" It's time for me to figure that out. A trip to LinuxQuestions and forums.macosxhints.com seems in order.

Whatever you were doing with Macs at school I'm sure has put you on the road to somewhere. I believe Classic Mac (OS 7 and earlier) to be almost as good as Windows 3.0. I started using the Mac in 1998 at 8.1 and always called it MacOS 3.1. Snow Leopard (Mac OS X 10.6) is totally awesome, especially the 64-bit, rock-solid Unix. I also use Windows 7 and Ubuntu 10.04 every day; however, my desktop at home and in the office is a Mac.

Naturally, I deny all knowledge of Phil Harvey and would never confess to occasionally using ExifTool. He's done a good job - and so of course has our friend Andreas. Remarkably, I've hardly investigated Andreas' code - which is beautifully laid out and documented. My contribution to this project has been to provide the MSVC build environment - and to answer questions on the forum.

If you'd like to talk Unicode/UCS/iconv stuff in detail, let's take the conversation off-line and I'll report a summary back here later. Discussing my lack of knowledge of Unicode is something I'd prefer to do privately!
