Slow embed times

Added by Scott Renton 11 months ago

Hi folks

I just wondered, does exiv2 open and close a file each time it embeds an item of data, or does it keep it open?

I am running the following in a .bat file, and it takes about 3 minutes. The 2 files in question are about 200 MB each, so obviously if it's opening and closing with each line, that's a huge amount of IO, but if it's only doing it once, we just have to take the hit. I couldn't find anything in the manual about that. If there's any other optimisation I could add to it, great. It may not need all the logging either.

Cheers
Scott

cd /D T: > "T:\diu\Worksheets\error\connection.txt" 2>&1
cd "T:\diu\Worksheets\exiv2" >> "T:\diu\Worksheets\error\connection.txt" 2>>&1
echo "directory is: "%cd% >> "T:\diu\Worksheets\error\connection.txt" 2>>&1
exiv2 -M "set Iptc.Application2.Headline String 5000005" "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" > "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Iptc.Application2.Copyright String Digital Image: Copyright The University of Edinburgh. Original: Copyright The University of Edinburgh. Free use." "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Xmp.dc.creator XmpSeq Digital Imaging Unit" "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Iptc.Application2.Caption String Collection: CRC Gallimaufry; Persons: Christie; Event: ; Place: ; Category: ; Description: " "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Iptc.Application2.ObjectName String Title: Frankly Miserable Screenshot 6; Author: Fredericks, Frankie; Page No: p.15; Shelfmark: V.1235; Date: 1948" "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrCity XmpText Edinburgh" "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrPcode XmpText EH8 9LJ" "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrExtadr XmpText Centre for Research Collections, The University of Edinburgh, George Square" "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrCtry XmpText UK" "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiTelWork XmpText 0131 650 8379" "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiEmailWork XmpText is-crc@ed.ac.uk" "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiUrlWork XmpText http://www.lib.ed.ac.uk/resources/collections/crc/index.html" "T:\diu\Crops\5000000-5000999\Process\5000005m.tif" >> "T:\diu\Worksheets\error\5000005.txt" 2>>&1
exiv2 -M "set Iptc.Application2.Headline String 5000006" "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" > "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Iptc.Application2.Copyright String Digital Image: Copyright The University of Edinburgh. Original: Copyright The University of Edinburgh. Free use." "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Xmp.dc.creator XmpSeq Digital Imaging Unit" "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Iptc.Application2.Caption String Collection: CRC Gallimaufry; Persons: Christie; Event: ; Place: ; Category: ; Description: " "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Iptc.Application2.ObjectName String Title: Frankly Miserable Screenshot 7; Author: Fredericks, Frankie; Page No: p.16; Shelfmark: V.1236; Date: 1949" "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrCity XmpText Edinburgh" "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrPcode XmpText EH8 9LJ" "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrExtadr XmpText Centre for Research Collections, The University of Edinburgh, George Square" "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrCtry XmpText UK" "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiTelWork XmpText 0131 650 8379" "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiEmailWork XmpText is-crc@ed.ac.uk" "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1
exiv2 -M "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiUrlWork XmpText http://www.lib.ed.ac.uk/resources/collections/crc/index.html" "T:\diu\Crops\5000000-5000999\Process\5000006m.tif" >> "T:\diu\Worksheets\error\5000006.txt" 2>>&1


Replies (6)

RE: Slow embed times - Added by Robin Mills 11 months ago

Scott

Happy New Year. Long time no speak!

Here's what the exiv2(.exe) application does:

1) It parses the commands into memory
2) For each file
...a) It opens the file and reads the metadata into memory
...b) For every command, it modifies the in-memory metadata
...c) It writes the modified memory data to file
...d) It closes the file
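
To make the cost concrete, here is a small sketch that counts how often a file goes through the open/modify/write/close cycle described above. The function name is made up for illustration, and it only counts cycles; how much data exiv2 actually rewrites per cycle is not stated here.

```python
def open_write_cycles(num_files: int, commands_per_file: int, batched: bool) -> int:
    """Count how many times files are opened, updated and written back.

    Per the steps above, one exiv2 invocation opens a file once, applies
    every command it was given in memory, then writes the file once.
    """
    invocations_per_file = 1 if batched else commands_per_file
    return num_files * invocations_per_file


# The .bat file in the question: 2 files, 12 "set" commands each.
print(open_write_cycles(2, 12, batched=False))  # 24
print(open_write_cycles(2, 12, batched=True))   # 2
```

So one exiv2 call per "set" line means 24 cycles over the two TIFFs, while one call per file needs only 2.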

You can execute several commands in a single invocation of exiv2. For example:

exiv2 -M"set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiEmailWork XmpText is-crc@ed.ac.uk" -M"set Iptc.Application2.Copyright String Digital Image: Copyright The University of Edinburgh. Original: Copyright The University of Edinburgh. Free use." file1.tif file2.tif
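
If the command line is generated by a script, one way to assemble such a call is sketched below in Python. The helper name build_modify_args is hypothetical; the only exiv2 feature it relies on is the -M option used in this thread, repeated once per command.

```python
from typing import List, Sequence


def build_modify_args(commands: Sequence[str], files: Sequence[str]) -> List[str]:
    """Build one exiv2 argument list with a -M option per modify command."""
    args = ["exiv2"]
    for command in commands:
        # Each command is its own list element, so no shell quoting is needed.
        args += ["-M", command]
    args += list(files)
    return args


args = build_modify_args(
    [
        "set Xmp.dc.creator XmpSeq Digital Imaging Unit",
        "set Iptc.Application2.Headline String 5000005",
    ],
    ["file1.tif"],
)
print(args)
```

The resulting list can be handed to subprocess.run(args) as is. Passing a list rather than a single string keeps spaces and colons in the values from being split by the shell.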

Another approach is to put the commands in a file and process it with the -m argument. For example:

> type commands.txt
set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiEmailWork XmpText is-crc@ed.ac.uk
set Iptc.Application2.Copyright String Digital Image: Copyright The University of Edinburgh. Original: Copyright The University of Edinburgh. Free use.

> exiv2 -mcommands.txt file1.tif file2.tif
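
The same command-file approach could be driven from a script. This is only a sketch: the function names and file names are illustrative, and it assumes exiv2 is on the PATH. Values that differ per image (such as the Headline in the question) would need a separate command file per image, or an extra -M option.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def write_command_file(path: Path, commands: Sequence[str]) -> Path:
    """Write one exiv2 modify command per line, as in commands.txt above."""
    path.write_text("\n".join(commands) + "\n", encoding="utf-8")
    return path


def run_exiv2(command_file: Path, files: Sequence[str]) -> subprocess.CompletedProcess:
    """Run exiv2 once for all files, so each file is opened and written once."""
    args = ["exiv2", "-m", str(command_file), *files]
    # Output is captured so the caller can log it per image if required.
    return subprocess.run(args, capture_output=True, text=True)


shared_commands = [
    "set Xmp.dc.creator XmpSeq Digital Imaging Unit",
    "set Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrCity XmpText Edinburgh",
]
# Example call (not executed here):
# write_command_file(Path("commands.txt"), shared_commands)
# result = run_exiv2(Path("commands.txt"), ["file1.tif", "file2.tif"])
```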

I also notice you're using the T: drive. If this is a network drive, you might get better performance by copying the 200 MB TIFF to local storage, processing it there and copying the result back to the network drive.
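
A minimal sketch of that copy, process, copy-back pattern in Python follows. The helper process_on_local_copy and its process callback are hypothetical; the callback stands in for whatever exiv2 call you use, and the temporary directory is assumed to be on a local disk.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def process_on_local_copy(remote_file: Path, process: Callable[[Path], None]) -> None:
    """Copy a file to local temp storage, process it there, then copy it back."""
    with tempfile.TemporaryDirectory() as tmp:
        local_file = Path(tmp) / remote_file.name
        shutil.copy2(remote_file, local_file)
        # If process() raises, the original on the network drive is untouched.
        process(local_file)
        shutil.copy2(local_file, remote_file)
```

Whether this is actually faster depends on the network and on how many times the file would otherwise be rewritten in place, so it is worth timing both ways on a single image first.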

And be aware that anti-virus software could also be causing contention issues on the disk. It would be best to process the data on a local drive on which the anti-virus software is not active.

RE: Slow embed times - Added by Scott Renton 11 months ago

Thanks Robin- I mistakenly thought you'd left the world of exiv2, so that's great that you're still around to answer these schoolboy questions. Happy new year to you too.

I'll begin by giving the first one a whirl- I didn't realise it could be done. That sounds like it could massively improve things, fingers crossed!

Cheers
Scott

RE: Slow embed times - Added by Robin Mills 11 months ago

Scott

I have announced that I'm going to retire and give up ownership. I've recruited 3 wonderful new guys to help me. I'll help users, mentor team members and fix bugs. However I'm not going to do feature development and project management. Exiv2 has been about 40 hours/week of effort since I retired from work in 2014. I have other stuff on my bucket list. http://clanmills.com/Homepages/2018/plan.shtml

Two questions:
1) Which version of exiv2(.exe) are you using?
2) How large are your tiff files?

There was an issue reported on Christmas Day concerning updating large TIFF files (100 MB) with Exiv2 v0.26 on Windows. If you're using v0.25, you will not be impacted. https://github.com/Exiv2/exiv2/issues/200

Robin

RE: Slow embed times - Added by Robin Mills 11 months ago

Another thought about this. Is it possible to do the image processing on Linux (or a Mac)? I like the user experience on Windows. However, on a dual boot machine, building Exiv2 with Cygwin is 3 times slower than on Linux (with the same version of GCC). MinGW is even slower. I suspect NTFS (the file system) in Windows is so slow, slow, slow...

I can't remember what you're doing for your Kiwi Photographer Lady, however you can probably do it much more quickly on a Linux box. And bash (which you can use on Linux, MacOS-X, Cygwin and MinGW) is a much nicer scripting environment than cmd.exe/.bat files. I encourage you to use Python for cross-platform scripting. One of my new team-mates has been rewriting the Exiv2 test harness in Python.

RE: Slow embed times - Added by Scott Renton 11 months ago

Hi Robin

Thanks again, but I think I've got all I need. I went with the commands file option, and I believe we're running about 10-15 times faster than we were, so thanks very much for the advice. The words from my favourite New Zealander ("you're a genius") were enough for me.

Just to remind you, we currently fire off Exiv2 from a DOS window triggered by an Excel macro, because that is where they do all their processing. I had been giving them a window for each image and an exiv command for each line of data, and they frankly hated me. I think changing the looping to one window processing everything (and faster) is great, but I also think that, because each line is waiting for the previous to complete, we won't see so much of another issue we've had, of DOS windows not closing, and occasional file corruption.

Agree with you about Linux and Python though- I do everything else in such environments because it's so much faster.

Cheers again Robin- very grateful for this. We're getting a lot of heat about digitising faster, so this will be a success story to report back!

Thanks
Scott

RE: Slow embed times - Added by Robin Mills 11 months ago

Ah yes, the Excel Spreadsheet. I had forgotten about that, although I remembered there was a Kiwi Lady involved.

Thanks for the update. Happy Users, that's what we want, and, it sounds like that's what we have! Good start to 2018.
