Performance when retrieving a PreviewImage / EXIF thumbnail
Added by Andreas Kolbow over 12 years ago
Hello!
Is there a way to speed up retrieving an EXIF thumbnail?
This is what I am doing:
// Open the image and parse its metadata
Exiv2::Image::AutoPtr FullImage = Exiv2::ImageFactory::open(Filename);
FullImage->readMetadata();
// Look up all known preview images and fetch the first one
Exiv2::PreviewManager PreviewImageManager(*FullImage);
Exiv2::PreviewPropertiesList PreviewImagesList = PreviewImageManager.getPreviewProperties();
Exiv2::PreviewImage Thumbnail = PreviewImageManager.getPreviewImage(PreviewImagesList[0]);
// Copy the thumbnail bytes into a block on the process heap
hHeapMemoryBlock = HeapAlloc(GetProcessHeap(), 0, Thumbnail.size());
memcpy(hHeapMemoryBlock, Thumbnail.pData(), Thumbnail.size());
This takes 106 ms on average, tested with 260 images that all actually contained an EXIF thumbnail.
When I do the same with GDI+, that is, simply create an Image instance and use the property list to copy the thumbnail data into a memory block, I need only about 2 ms per image on average to retrieve the thumbnail. (That is also about 2.5 to 3 times faster than the GDI+ GetThumbnail method on my machine, but that is just an aside.)
For comparison I also tried EXIV2 0.17, where I of course had to use different code to retrieve the thumbnail:
Exiv2::Image::AutoPtr FullImage = Exiv2::ImageFactory::open(Filename);
FullImage->readMetadata();
Exiv2::ExifData &exifData = FullImage->exifData();
Exiv2::DataBuf Thumbnail = exifData.copyThumbnail();
hHeapMemoryBlock = HeapAlloc(GetProcessHeap(), 0, Thumbnail.size_);
memcpy(hHeapMemoryBlock, Thumbnail.pData_, Thumbnail.size_);
On my machine this takes 38 ms per image on average.
Does anyone have an idea how reading the thumbnail with EXIV2 could be made faster?
Replies (6)
RE: Performance when retrieving a PreviewImage / EXIF thumbnail - Added by Andreas Huggel over 12 years ago
Hi Andreas,
I'll answer in English, since the post is also interesting for those who don't understand German.
PreviewImageManager.getPreviewProperties() checks for the existence of any known preview image. For that, it performs several searches on the Exif metadata. (Can you provide profiling info to verify where in your sample the time is spent?)
Instead, try ExifThumbC::copy(), which is specialized for the Exif thumbnail. Its functionality is the same as that of the old ExifData::copyThumbnail(). The performance may still be very different, since the whole TIFF parser was replaced in 0.18; it will be interesting to see.
For the comparison with GDI+ (which I am not familiar with): the sample Exiv2 code includes reading the metadata from the file and decoding it. That would also need to be the case for comparable GDI+ code.
General questions (maybe I can run a comparable test on Linux): what images are you running this on? JPEG? If they are not JPEG, how large are they on average? Note that reading TIFF-like images on Windows is not optimized: on Linux Exiv2 uses a memory-mapped file, on Windows the whole file is read. If you're familiar with memory mapping on Windows, it would probably be a small task to add this to Exiv2.
Andreas
PS: Why do you need to copy the Thumbnail again in the last two lines of both examples? DataBuf already owns a copy of the thumbnail and belongs to the caller.
RE: Performance when retrieving a PreviewImage / EXIF thumbnail - Added by Andreas Kolbow over 12 years ago
First of all, sorry for posting my initial question in German.
Let me explain my situation:
We develop a Windows application written in Delphi. In order to use EXIV2 we built a small C++ DLL. We started using Exiv2 with version 0.16, when no native Windows DLL was available, and since we don't really need the full functionality we just wrote our own DLL. This is also the reason why I copy the thumbnail data onto the heap afterwards: I have to make the data persistent and pass it back to the Delphi application.
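The hand-over described here can be sketched without any Windows-specific calls. The snippet is only an illustration under assumptions: the function names `CopyThumbnailToHeap` and `ReleaseThumbnail` are hypothetical, and it uses `malloc` with a matching release function instead of the `HeapAlloc(GetProcessHeap(), ...)` call from the posts. Unlike the posted snippets, it also checks for empty input and a failed allocation.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical C-style export: copies `size` bytes into a block that outlives
// the library-owned buffer (for example an Exiv2::DataBuf).
// Returns nullptr for empty input or if the allocation fails.
extern "C" void* CopyThumbnailToHeap(const unsigned char* data, std::size_t size) {
    if (data == nullptr || size == 0) return nullptr;
    void* block = std::malloc(size);
    if (block != nullptr) std::memcpy(block, data, size);
    return block;
}

// Hypothetical counterpart: the caller hands the block back to the module that
// allocated it, so allocation and release use the same runtime.
extern "C" void ReleaseThumbnail(void* block) {
    std::free(block);
}
```

Exporting a release function next to the copy function avoids freeing memory with a different allocator than the one that created it, which matters when the caller is written in another language.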
About the GDI+ comparison: I am not really sure how Microsoft accomplishes this, but I don't have to explicitly parse the file. I just load it and ask GDI+ for the property item containing the EXIF thumbnail data. The whole process takes about 2 ms on my computer and I am puzzled how it can be that fast. But the fact is, I can run the test multiple times, even after a fresh reboot, and it always comes out as about 2 ms per file.
Unfortunately I can't add profiling info because my performance measuring routines are written in Delphi and I just measure the time it takes the DLL method to return.
I just tested the following code:
FullImage = Exiv2::ImageFactory::open(lpszFilename);
FullImage->readMetadata();
Exiv2::ExifData &exifData = FullImage->exifData();
Exiv2::ExifThumbC ExifThumb(exifData);
Exiv2::DataBuf DataBuffer = ExifThumb.copy();
hHeapMemoryBlock = HeapAlloc(GetProcessHeap(), 0, DataBuffer.size_);
memcpy(hHeapMemoryBlock, DataBuffer.pData_, DataBuffer.size_);
This takes about 84 ms per image on average on my computer. So it's faster than using the PreviewManager in 0.18 but slower than the old functionality in 0.17.
I am running this test on 260 JPEG images that were made by different digital cameras, Casio EX-Z1000, Sony DSC-T30, Nikon E3200, Canon Digital IXUS 50, Pentax Optio S, Panasonic DMC-FX9 & Canon EOS 20D. The average size is 6 Megapixel.
So far it seems that GDI+ is unbeatable in speed when it comes to retrieving EXIF thumbnails but is far inferior to EXIV2 if one has to manipulate the EXIF information in some way. I will probably end up using both GDI+ and EXIV2, each for the parts it is best at. Since I need GDI+ for drawing operations anyway, this is no problem for me. I was just curious whether I could speed up the EXIV2 thumbnail retrieval in some way.
RE: Performance when retrieving a PreviewImage / EXIF thumbnail - Added by Andreas Huggel over 12 years ago
FullImage = Exiv2::ImageFactory::open(lpszFilename);
FullImage->readMetadata();
Exiv2::ExifData &exifData = FullImage->exifData();
Exiv2::ExifThumbC ExifThumb(exifData);
Exiv2::DataBuf DataBuffer = ExifThumb.copy();
hHeapMemoryBlock = HeapAlloc(GetProcessHeap(), 0, DataBuffer.size_);
memcpy(hHeapMemoryBlock, DataBuffer.pData_, DataBuffer.size_);
This takes about 84 ms per image on average on my computer. So it's faster than using the PreviewManager in 0.18 but slower than the old functionality in 0.17.
This is about as fast as it gets with Exiv2 without tuning the library itself, and even then I believe it will always remain many times slower than GDI+. 2 ms sounds really fast, although of course in order to get just the thumbnail, one can bypass all of the decoding complexities. Exiv2, on the other hand, will always parse the whole Exif block, including IFD0, its sub-IFDs and the Makernote.
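To illustrate what "bypassing the decoding" could look like, here is a rough sketch that locates the thumbnail bytes by reading only the IFD1 tags JPEGInterchangeFormat (0x0201) and JPEGInterchangeFormatLength (0x0202). It is not how Exiv2 or GDI+ work internally; the names `ThumbRange`, `readUInt` and `findExifThumbnail` are hypothetical, and the input is assumed to be the TIFF block that follows the `Exif\0\0` header of the APP1 segment.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: byte range of the thumbnail inside the TIFF block.
struct ThumbRange {
    bool found;
    std::uint32_t offset;  // relative to the start of the TIFF header
    std::uint32_t length;
};

// Reads an n-byte unsigned integer (n = 2 or 4) at pos; the caller checks bounds.
static std::uint32_t readUInt(const std::vector<std::uint8_t>& b, std::uint64_t pos,
                              int n, bool littleEndian) {
    std::uint32_t v = 0;
    for (int i = 0; i < n; ++i) {
        // Always consume the most significant byte first.
        std::uint64_t idx = littleEndian ? pos + (n - 1 - i) : pos + i;
        v = (v << 8) | b[static_cast<std::size_t>(idx)];
    }
    return v;
}

ThumbRange findExifThumbnail(const std::vector<std::uint8_t>& tiff) {
    const ThumbRange none = {false, 0, 0};
    const std::uint64_t size = tiff.size();
    if (size < 8) return none;

    bool le;
    if (tiff[0] == 'I' && tiff[1] == 'I') le = true;        // little-endian TIFF
    else if (tiff[0] == 'M' && tiff[1] == 'M') le = false;  // big-endian TIFF
    else return none;
    if (readUInt(tiff, 2, 2, le) != 42) return none;

    // IFD0 is skipped: only its entry count and its link to IFD1 are read.
    const std::uint64_t ifd0 = readUInt(tiff, 4, 4, le);
    if (ifd0 + 2 > size) return none;
    const std::uint64_t link = ifd0 + 2 + 12ULL * readUInt(tiff, ifd0, 2, le);
    if (link + 4 > size) return none;
    const std::uint64_t ifd1 = readUInt(tiff, link, 4, le);
    if (ifd1 == 0 || ifd1 + 2 > size) return none;

    const std::uint64_t count = readUInt(tiff, ifd1, 2, le);
    if (ifd1 + 2 + 12 * count > size) return none;

    ThumbRange r = {false, 0, 0};
    bool haveOffset = false, haveLength = false;
    for (std::uint64_t i = 0; i < count; ++i) {
        const std::uint64_t entry = ifd1 + 2 + 12 * i;
        const std::uint32_t tag = readUInt(tiff, entry, 2, le);
        // The 4-byte value field of a 12-byte IFD entry starts at byte 8.
        if (tag == 0x0201) { r.offset = readUInt(tiff, entry + 8, 4, le); haveOffset = true; }
        if (tag == 0x0202) { r.length = readUInt(tiff, entry + 8, 4, le); haveLength = true; }
    }
    r.found = haveOffset && haveLength && r.length > 0 &&
              static_cast<std::uint64_t>(r.offset) + r.length <= size;
    return r;
}
```

The sketch never looks at the IFD0 entries, the Exif sub-IFD or the Makernote, which is the part of the work a general-purpose metadata parser cannot skip.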
I'll build a similar experiment on Linux and get some profiling data to see if there is any potential for a quick improvement, but it will only be later this week.
Andreas
RE: Performance when retrieving a PreviewImage / EXIF thumbnail - Added by Andreas Huggel over 12 years ago
I've run a similar test on Linux, using Exiv2 from SVN (r1773). It extracts the Exif thumbnail using ExifThumbC and copies it into a DataBuf, just like the code above (without the last two lines), for all test files in a loop. The test was run on a 3.5-year-old Dell Dimension 9100 PC with a 3GHz Pentium D processor, 3GB of memory and 2 mirrored SATA disks (Linux software RAID), running Debian Linux.
The test data consisted of 208 JPEG images (43 Canon PowerShot A430, 52 Canon PowerShot S40, 34 Panasonic DMC-FZ7, 27 Sony DSC-W7, 52 NIKON D70). The average image size is 2MB, and all images have an Exif thumbnail.
The results are very different from those reported from Windows. The good news is that the performance on Linux is much better. But it's not clear why there is such a large difference between Windows and Linux.
In this test, thumbnail extraction took 16ms per image on average. About 80% of that time was spent reading the file from the disk: subsequent runs without flushing the filesystem cache took only 3ms per image. After flushing, the time is reproducibly back to about 16ms.
Note: Exiv2 reads JPEG files only up to the Exif APP segment. It doesn't read the image data. The Exif segment is usually located within the first 100kB of the image. So for reading JPEG images, the image size is not relevant.
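As an illustration of why only the start of the file matters, the following sketch walks the JPEG segment headers in a buffer holding the first bytes of a file and stops at the Exif APP1 segment. It is a simplified, hypothetical helper (`ExifSegment`, `findExifSegment`), not the Exiv2 implementation: fill bytes and markers without a length field are not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical result type: where the TIFF block of the Exif segment starts
// within the buffer and how many bytes it spans.
struct ExifSegment {
    bool found;
    std::size_t offset;  // first byte after the "Exif\0\0" header
    std::size_t size;    // segment length minus length field and header
};

// `head` holds the first bytes of a JPEG file. The reported range may extend
// past the end of `head` if too little of the file was read.
ExifSegment findExifSegment(const std::vector<std::uint8_t>& head) {
    const ExifSegment none = {false, 0, 0};
    if (head.size() < 4 || head[0] != 0xFF || head[1] != 0xD8) return none;  // no SOI

    std::size_t pos = 2;
    while (pos + 4 <= head.size()) {
        if (head[pos] != 0xFF) return none;  // lost sync with the segment structure
        const std::uint8_t marker = head[pos + 1];
        if (marker == 0xDA || marker == 0xD9) return none;  // SOS or EOI: no Exif before image data

        // The big-endian segment length includes the two length bytes themselves.
        const std::size_t len = (static_cast<std::size_t>(head[pos + 2]) << 8) | head[pos + 3];
        if (len < 2) return none;

        if (marker == 0xE1 && len >= 8 && pos + 10 <= head.size() &&
            std::memcmp(&head[pos + 4], "Exif\0\0", 6) == 0) {
            const ExifSegment hit = {true, pos + 10, len - 8};
            return hit;
        }
        pos += 2 + len;  // skip marker bytes plus the whole segment
    }
    return none;
}
```

Since a segment length is a 16-bit value, a single APP1 segment cannot exceed 64kB, so the scan ends long before the compressed image data is reached.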
Profiling data is attached (use e.g. kcachegrind to view it). A considerable amount of time is spent allocating and freeing small amounts of heap memory. Is memory management slower on Windows? Or is the Delphi overhead causing the difference?
-ahu.
RE: Performance when retrieving a PreviewImage / EXIF thumbnail - Added by Andreas Kolbow over 12 years ago
Thanks a lot for your feedback. Your tests seem to confirm that the thumbnail extraction per se is quite fast but that the filesystem overhead on Windows seems quite high. 3ms is about the time I got with the GDI+ method on my computer. There is still a 1ms difference, but that may be due to hardware differences.
My tests were done on a Lenovo 3000 N200 (Intel Core2Duo T7300 @ 2.00GHz, 3GB RAM, SATA (no raid) HDD, Windows Vista)
Unfortunately I have some other tasks I need to finish this week, so I can't really test any further, but I'll see if I can do some more evaluation on the weekend, as I am interested in the reason for the difference as well.
RE: Performance when retrieving a PreviewImage / EXIF thumbnail - Added by Andreas Huggel over 12 years ago
Your tests seem to confirm that the thumbnail extraction per se is quite fast but that the filesystem overhead on Windows seems quite high.
Yes, that's also the conclusion I arrived at in the meantime. I've re-run the same experiment on a new laptop running Windows Vista and the results really seem to indicate that
- parsing the metadata and extracting the thumbnail is fast
- most of the time is spent reading the file, if it is not in the filesystem cache yet
- Windows NTFS is much slower than Linux EXT3
| Environment | First run (ms) | Subsequent runs (ms) |
| old PC / Linux / g++ | 16 | 3 |
| new Laptop / Vista / g++ (MinGW) | 94 | 2 |
| new Laptop / Vista / MSVC 9 | 91 | 4 |
"First run" is the first time the test program was run after re-booting (Windows) / clearing the filesystem cache (Linux), subsequent runs obviously don't read the files from the disk anymore.
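A cold-versus-warm comparison like this can be reproduced without Exiv2 by timing only the file reads. The following is a minimal sketch, assuming a hypothetical helper `averageReadMillis` that reads at most `maxBytes` from each file and reports the average wall-clock time per file. Run it once right after a reboot or cache flush and again immediately afterwards.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads up to `maxBytes` from the start of each file and
// returns the average wall-clock time per file in milliseconds.
// Files that cannot be opened still count towards the average.
double averageReadMillis(const std::vector<std::string>& files, std::size_t maxBytes) {
    if (files.empty()) return 0.0;

    std::vector<char> buffer(maxBytes);
    const auto start = std::chrono::steady_clock::now();
    for (const std::string& name : files) {
        std::ifstream in(name.c_str(), std::ios::binary);
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(files.size());
}
```

Calling it with something like `averageReadMillis(files, 100 * 1024)` limits each read to roughly the part of a JPEG where the Exif segment usually lives. If the first-run number is close to the full extraction time, the disk rather than the parser is the bottleneck.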
As far as the GDI+ methods are concerned, they must also be using the filesystem cache and then the results are similar and make sense.
If you have any ideas how reading from the file could be optimized, particularly for Windows filesystems, I'd be interested to hear.
Andreas
Just for the record, this is the test program I used:
// ***************************************************************** -*- C++ -*-
// thumbs.cpp, $Rev$
// JPEG parsing and thumbnail retrieval performance testing

#include <exiv2/image.hpp>
#include <exiv2/exif.hpp>
#include <iostream>

int main(int argc, char* const argv[])
try {
    if (argc < 2) {
        std::cout << "Usage: " << argv[0] << " file [...]\n";
        return 1;
    }
    for (int i = 1; i < argc; ++i) {
        Exiv2::Image::AutoPtr image = Exiv2::ImageFactory::open(argv[i]);
        image->readMetadata();
        Exiv2::ExifThumbC exifThumb(image->exifData());
        Exiv2::DataBuf thumb = exifThumb.copy();
        // Exiv2::writeFile(thumb, std::string(argv[i]) + "-thumb.jpg");
    }
    return 0;
}
catch (Exiv2::AnyError& e) {
    std::cout << "Caught Exiv2 exception '" << e << "'\n";
    return -1;
}