mmap and exiv2

Added by Robin Mills about 1 year ago

I received this email from Terence Tay at AlienSkin software:

We ran into an exiv2 issue that I thought I'd run by you. The issue is the use of mmap in exiv2. I understand that this improves performance and that's great.

But we've noticed that when the file is on a network-mounted drive (like a Windows share), memory mapping is not reliable because the network is not reliable. Typically, users may put their laptops to sleep, or pull out a network cable, or the drive may disconnect, or any number of things could happen that cause the connection between the network drive and the program to break. We noticed that when this happens, mmap generates SIGBUS, which then crashes the program. Typical C-style FILE* operations will fail too, but they return an error code and do not crash the program.

As a command-line tool, most exiv2 users probably never see this problem. But we've incorporated it into our products and we expect the program to be running continuously for hours and days. Under these stressful conditions, we experience this mmap crash a few times a week.

Do you have any thoughts about this? Have you heard of this from other users?


Replies (11)

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

My Reply:

This mmap issue is interesting. As you say, servers and other processes can be alive for long durations during which the network can disappear and result in signals. There is a build switch EXV_HAVE_MMAP which is set automatically by the build environment (CMake or AutoTools) in the file exv_conf.h (or exv_msvc.h for MSVC).

You can unset it explicitly in include/exiv2/config.h. Our code in include/exiv2/config.h reads the generated file exv_conf.h which defines EXV_HAVE_MMAP. Go to the bottom of include/exiv2/config.h and add the code:

#ifdef EXV_HAVE_MMAP
#undef EXV_HAVE_MMAP
#endif

My other suggestion is to avoid keeping files open for longer than necessary. Open a file, read the metadata and close it again. When you want to update the metadata, open/write/close. We can discuss this in more detail if you wish to pursue this idea.

I can only recall a couple of discussions about mmap and remarkably they arrived on the same day almost 2 years ago. There was a very long discussion about mmap from Ubuntu on a Samba/Windows Server. Thomas (a team member) and Thoralf (the user) discussed it for ages. http://dev.exiv2.org/issues/1043 Rebooting the server fixed it. Something similar was discussed here: http://dev.exiv2.org/issues/1042

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

Terence replied:

I read somewhere that when EXV_HAVE_MMAP is unset, the entire file gets read into memory and there is a major speed problem?

We used a similar workaround in our code to read the entire image to RAM before calling exiv2. This fixes the crashes but speed was awfully slow. Not to mention lots of RAM used since our program multithreads and reads multiple images simultaneously.

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

My Reply:

In general, we never read the whole file. We use FILE* efficiently and use seek()/tell() to avoid unnecessary I/O. The most common file format is JPEG and usually the metadata is in the first 100k bytes. When only reading metadata, the I/O per file is usually less than that.

Of course, writing is another matter. To rewrite a file, when mmap is not available, we must read/write every byte.
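As an illustration of the seek-instead-of-read idea for JPEG, here is a minimal sketch. It is not the exiv2 implementation; the names Segment and scanJpegSegments are hypothetical and it uses only the C standard library. It reads the 4-byte header of each segment, seeks over the payload, and stops at the start of the compressed image data, so the bytes read stay small regardless of the file size. It assumes every marker before SOS carries a length field, which is a simplification.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical record describing one JPEG segment found before the image data.
struct Segment {
    std::uint8_t  marker;  // second marker byte, e.g. 0xE1 for APP1 (Exif/XMP)
    long          offset;  // file offset of the segment payload
    std::uint16_t size;    // payload size in bytes (length field minus 2)
};

// Walk the segment headers up to SOS (0xFFDA), seeking over each payload.
// bytesRead reports how many bytes were actually read from the file.
std::vector<Segment> scanJpegSegments(std::FILE* f, long& bytesRead)
{
    std::vector<Segment> result;
    bytesRead = 0;
    unsigned char b[4];

    if (std::fseek(f, 0, SEEK_SET) != 0) return result;
    if (std::fread(b, 1, 2, f) != 2) return result;
    bytesRead += 2;
    if (b[0] != 0xFF || b[1] != 0xD8) return result;  // no SOI: not a JPEG

    while (std::fread(b, 1, 4, f) == 4) {
        bytesRead += 4;
        if (b[0] != 0xFF) break;   // lost sync with the marker stream
        if (b[1] == 0xDA) break;   // SOS: compressed image data follows
        unsigned length = (static_cast<unsigned>(b[2]) << 8) | b[3];
        if (length < 2) break;     // the length field includes its own 2 bytes

        Segment s;
        s.marker = b[1];
        s.offset = std::ftell(f);
        s.size   = static_cast<std::uint16_t>(length - 2);
        result.push_back(s);

        // Skip the payload without reading it.
        if (std::fseek(f, s.size, SEEK_CUR) != 0) break;
    }
    return result;
}
```

A caller would then fseek() to the offset of the segments it cares about (APP1, APP13 and so on) and read only those payloads.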

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

Terence replied:

We use exiv2 on a lot of raw images. There are lots of non-standard locations where raw images stash their metadata and exiv2 is good at extracting them. Raw files can be very large too. Is exiv2 able to extract metadata without reading the entire file?

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

My Reply:

Exiv2 doesn’t read the entire file. Most raw images are variants of the TIFF format. So we have to read all the directories in the TIFF. However we only dereference tags that deal with metadata. TIFF directories are chained together and can be anywhere in the file. When we go from one directory to another, we seek().

In a typical raw image there are about 200 tags, at 12 bytes per tag: 2400 bytes. The IPTC, ICC, Exif, XMP blocks are typically about 30k in total. The total I/O on a 20 MB .NEF is usually less than 100k. That’s all. Total memory used by an open file is similar. On my MacBook Pro, exiv2 reads about 1000 JPEGs/second and 500 Nikon raw images/second. Here’s a directory with 100 .NEFs (2 GB in total).

692 rmills@rmillsmbp:~/temp/Raw $ du -m .
2131    .
693 rmills@rmillsmbp:~/temp/Raw $ ls -1 *.NEF | wc
    100     100     703
694 rmills@rmillsmbp:~/temp/Raw $ time exiv2 -pa --grep Software *.NEF | wc
    100     500    9200

real    0m0.223s
user    0m0.201s
sys    0m0.027s
695 rmills@rmillsmbp:~/temp/Raw $ 
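
The directory walk described in this reply can be sketched as follows. This is an illustration only, not the TiffParser code: TiffWalk and walkTiffDirectories are hypothetical names, only little-endian ("II") files are handled, and only the main directory chain is followed (no sub-IFDs or maker notes). Each directory costs 2 bytes for the entry count, 12 bytes per tag and 4 bytes for the link to the next directory, which is where the estimate of 200 x 12 = 2400 bytes comes from.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical summary of a walk over the main TIFF directory chain.
struct TiffWalk {
    unsigned      directories = 0;
    unsigned      tags        = 0;
    unsigned long bytesRead   = 0;
};

// Decode an n-byte little-endian unsigned integer.
static std::uint32_t le(const unsigned char* p, int n)
{
    std::uint32_t v = 0;
    for (int i = n - 1; i >= 0; --i) v = (v << 8) | p[i];
    return v;
}

TiffWalk walkTiffDirectories(std::FILE* f)
{
    TiffWalk w;
    unsigned char buf[12];

    // Header: byte order (2), magic number 42 (2), offset of first directory (4).
    if (std::fseek(f, 0, SEEK_SET) != 0 || std::fread(buf, 1, 8, f) != 8) return w;
    w.bytesRead += 8;
    if (buf[0] != 'I' || buf[1] != 'I' || le(buf + 2, 2) != 42) return w;

    std::uint32_t next = le(buf + 4, 4);
    while (next != 0 && w.directories < 64) {  // cap guards against cyclic chains
        // Directories can be anywhere in the file, so seek to each one.
        if (std::fseek(f, static_cast<long>(next), SEEK_SET) != 0) break;
        if (std::fread(buf, 1, 2, f) != 2) break;
        w.bytesRead += 2;
        std::uint32_t count = le(buf, 2);

        bool ok = true;
        for (std::uint32_t i = 0; i < count; ++i) {
            if (std::fread(buf, 1, 12, f) != 12) { ok = false; break; }
            w.bytesRead += 12;  // one 12-byte tag entry
            ++w.tags;
        }
        if (!ok) break;

        if (std::fread(buf, 1, 4, f) != 4) break;
        w.bytesRead += 4;
        ++w.directories;
        next = le(buf, 4);      // 0 marks the end of the chain
    }
    return w;
}
```

A real reader would additionally dereference the value offsets of the metadata tags it needs, which accounts for the remaining I/O.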

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

Asdiel replied:

Terence asked me to look into recompiling Exiv2 without mmap support to fix our crashes.

I looked around the code in basicio.cpp and it looks from inspection that EXV_HAVE_MMAP does not play a role in Windows. There seems to be no alternative to using file mapping in Windows. By default EXV_HAVE_MMAP is not defined in Windows (but file mapping is used in Windows).

Also, looking at what happens if I undef EXV_HAVE_MMAP in OSX, it will go thru a code path that reads the whole file into a dataBuf, right under the line:
// Workaround for platforms without mmap: Read the file into memory

I can see that there are some files that don't use io.mmap(), JPEGs, BMPs etc, but most of the file types will call io_->mmap() when calling XYZParser::decode(..) which as I said before, will load the whole entire file. This includes Tiffs, CR2s, RAFs, ORFs etc.

Am I seeing this wrong? It looks like some decoders can efficiently seek around the file, while others use mmap, which either uses file mapping, or read the whole image into a buffer.

Could you please comment, maybe there is something that I'm missing.

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

My Reply:

In general, I/O is abstracted into the BasicIo interface. Reading/writing images is abstracted into the Image interface. So most images (TiffImage, JPEGImage etc) have no idea if the source of the data is a FILE* (FileIo), a remote connection (RemoteIO) or a memory mapped file or an in-memory byte buffer (MemIo).

I think the simple way forward is to disable EXV_HAVE_MMAP (and EXV_HAVE_MUNMAP) and build it. I suggest you instrument the code. In class FileIo, add member variables such as uint32_t totalRead_ and set it to zero in the c'tor. In FileIo::read, totalRead_ += read; In FileIo::close, std::cout << "totalRead = " << totalRead_ << std::endl;
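
To show the shape of that instrumentation outside the exiv2 sources, here is a minimal stand-alone sketch. CountingFile is a hypothetical wrapper around FILE*, not the real FileIo class, and it uses a 64-bit counter rather than uint32_t so that large totals cannot wrap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <iostream>

// Hypothetical stand-in for an instrumented FileIo: counts every byte read.
class CountingFile {
public:
    // Takes ownership of fp; the counter starts at zero.
    explicit CountingFile(std::FILE* fp) : fp_(fp), totalRead_(0) {}
    ~CountingFile() { close(); }
    CountingFile(const CountingFile&) = delete;
    CountingFile& operator=(const CountingFile&) = delete;

    // Read up to n bytes and add the number actually read to the counter.
    std::size_t read(void* buf, std::size_t n)
    {
        std::size_t got = fp_ ? std::fread(buf, 1, n, fp_) : 0;
        totalRead_ += got;
        return got;
    }

    // Seeking moves the file position but costs no read I/O in this count.
    int seek(long offset, int whence)
    {
        return fp_ ? std::fseek(fp_, offset, whence) : -1;
    }

    std::uint64_t totalRead() const { return totalRead_; }

    // Report the total once, when the file is closed.
    void close()
    {
        if (!fp_) return;
        std::cout << "totalRead_ = " << totalRead_ << std::endl;
        std::fclose(fp_);
        fp_ = nullptr;
    }

private:
    std::FILE*    fp_;
    std::uint64_t totalRead_;
};
```

Comparing the reported total with the file size shows at a glance whether a code path seeks around the file or reads all of it.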

I am confident that we are not doing excessive I/O. I read the metadata from 100 files of 20mb (Nikon NEF files) in 0.2 seconds on my laptop this morning.

The settings for EXV_HAVE_MMAP for MSVC are in the file include/exiv2/exv_msvc.h. I think EXV_HAVE_MMAP is normally set because Windows has supported this forever. However, you can change it.

I attach a patch (relative to the trunk at r4630) in which I have disabled EXV_HAVE_MMAP. I have discovered that we are both right!

You’re right. Without MMAP, basicio does read the whole file.
I’m right. Without MMAP, we can read all of the metadata in a 20Mb .NEF in 150,000 bytes.

Here’s the evidence (on MacOS-X 10.12) in which I am right.

511 rmills@rmillsmbp:~/gnu/exiv2/trunk/build $ bin/Debug/exiv2 -pR /Volumes/iMacHD/Users/rmills/Pictures/Photos/2016/Raw/DSC_0002.NEF
totalRead_ = 0
totalRead_ = 51
totalRead_ = 59
totalRead_ = 59
STRUCTURE OF TIFF FILE (II): /Volumes/iMacHD/Users/rmills/Pictures/Photos/2016/Raw/DSC_0002.NEF
...
totalRead_ = 156783
$ 

Here’s the evidence in which you are right.

$ exiv2 -pa --grep Software /Volumes/iMacHD/Users/rmills/Pictures/Photos/2016/Raw/DSC_0002.NEF
totalRead_ = 0
totalRead_ = 51
totalRead_ = 59
totalRead_ = 59
totalRead_ = 23538963
totalRead_ = 23538963
Exiv2 exception in print action for file /Volumes/iMacHD/Users/rmills/Pictures/Photos/2016/Raw/DSC_0002.NEF:
/Volumes/iMacHD/Users/rmills/Pictures/Photos/2016/Raw/DSC_0002.NEF: Call to `FileIo::mmap' failed: Undefined error: 0 (errno = 0)
$ 

Several obvious points:

1) It threw an exception!
2) It read the whole file twice
3) It allocated a large memory buffer to hold the file

I ran the command in the debugger. The exception is from TiffImage.readMetadata(), which uses TiffParser::decode, which in turn requires the call to mmap() to succeed.

I can’t get sucked into this at the moment because I’m working flat out to finish v0.26. If you want to discuss this in the next few weeks, please contact Andreas as I believe he wrote the TiffXxxxx family of classes.

asdiel.patch (1.4 KB)

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

I have investigated the exception. I am to blame. I changed the semantics of eof() during the last year. BasicIo.cpp:Line 528 should not test eof().

        if (error() /* || eof()*/ ) {

That gets the code running; however, we are indeed reading the whole file into memory (twice) and have allocated a large memory buffer.

However, it makes libexiv2 independent of the network resilience which is the point of this discussion.
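
For readers who want to see what the read-into-memory fallback amounts to, here is a minimal sketch using only the standard library. It is not the code in basicio.cpp, and readWholeFile is a hypothetical name. It treats only a genuine I/O error or a short read as failure, since being at the end of the file after reading all of it is the expected outcome. The cost is a buffer as large as the file, but a lost network connection shows up as a return value rather than a signal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Read the complete contents of f into out. Returns false on any I/O failure.
bool readWholeFile(std::FILE* f, std::vector<unsigned char>& out)
{
    out.clear();

    // Find the file size by seeking to the end.
    if (std::fseek(f, 0, SEEK_END) != 0) return false;
    long size = std::ftell(f);
    if (size < 0 || std::fseek(f, 0, SEEK_SET) != 0) return false;

    // The whole file is held in memory: this is the cost of the fallback.
    out.resize(static_cast<std::size_t>(size));
    std::size_t got = out.empty() ? 0 : std::fread(out.data(), 1, out.size(), f);

    // Reaching the end of the file is normal here; only errors count as failure.
    if (std::ferror(f) || got != out.size()) {
        out.clear();
        return false;
    }
    return true;
}
```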

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

I've reported issue #1244 about not setting EXV_HAVE_MMAP and getting exceptions. Fix submitted r4633.

I've successfully run the test suite with EXV_HAVE_MMAP unset in include/exiv2/config.h:

#undef EXV_HAVE_MMAP
#undef EXV_HAVE_MUNMAP
//
// That's all Folks!
#endif // _CONFIG_H_

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

I have written a proposal to reduce the Memory and I/O demands of class FileIo when EXV_HAVE_MMAP is not set. http://dev.exiv2.org/issues/1244#note-2

My proposal cannot be implemented for v0.26. It is too much work and risk to undertake such a change so late in the v0.26 project. The proposal has the desirable feature that no changes will be required in any TiffXxxxx class code.

RE: mmap and exiv2 - Added by Robin Mills about 1 year ago

Discussion with Asdiel Echevarria

We really like your idea and implementation for reading only the metadata blocks while still using File I/O and we are thinking to try to back port it to 0.26 once 0.26 is released. We will of course share it back in the repository in case you guys do a release between 0.26 and 0.27.

My reply:

I’ve backported the necessary code from v0.26 to v0.25. The changes to make that happen are mostly in src/*image.cpp and src/basicio.cpp (and their .hpp companions). It’s not as trivial as I say because you have to update the build and other consequential magic. There are new files in v0.26 (src/webpimage.cpp, src/ini.cpp). However I’ve done everything in about two hours. It builds and executes the v0.25 test suite without crashing. The test suite reports various matters which have been fixed in v0.26. The formatted output from the command exiv2 -pS is slightly different in v0.26. For certain this is sufficient to be sent to your test/QE people. http://clanmills.com/exiv2/exiv2-0.25+.tar.gz and I attach a patch for v0.25.

It reads TIFFs over the internet very efficiently. I added instrumentation to HttpIo to see the blocks being fetched. 11 blocks of 1024 bytes.

1052 rmills@rmillsmbp:~/gnu/exiv2 $ ssh secret@clanmills.com ls -alt www/files/Reagan.tiff
-rw-r--r-- 1 secret secret 8628164 Oct 16 10:45 www/files/Reagan.tiff
1053 rmills@rmillsmbp:~/gnu/exiv2/v0.25/build $ bin/Debug/exiv2 -pa --grep Software http://clanmills.com/files/Reagan.tiff
HttpIo::HttpImpl::getDataByRange: 0,0
HttpIo::HttpImpl::getDataByRange: 8416,8416
HttpIo::HttpImpl::getDataByRange: 8417,8417
HttpIo::HttpImpl::getDataByRange: 8418,8418
HttpIo::HttpImpl::getDataByRange: 8419,8422
HttpIo::HttpImpl::getDataByRange: 8423,8425
Exif.Image.Software                          Ascii      29  Adobe Photoshop CS Macintosh
1054 rmills@rmillsmbp:~/gnu/exiv2/v0.25/build $

If/When you make the changes for the network drive, I will be very happy to accept a patch. I’ll review and test it, then put it on the trunk after v0.26 has shipped. From my point of view, there is no hurry at all with this.

Incidentally, I pulled down all the raw images yesterday from here: https://www.rawsamples.ch/index.php/en/ Exiv2 reads all 322 without a single stumble when they are on local storage. The 2017 project to enhance our raw image support and testing will investigate whether every image can be read efficiently over the internet. I’m planning to recruit a Google Summer of Code student for that project. So it would be good to have your patch by May 2017.

529 rmills@rmillsmbp:~/gnu/exiv2/trunk $ time build/bin/Debug/exiv2 -pa -g Software http://clanmills.com/files/Reagan.tiff
Exif.Image.Software                          Ascii      29  Adobe Photoshop CS Macintosh

real    0m1.582s
user    0m0.014s
sys    0m0.012s
530 rmills@rmillsmbp:~/gnu/exiv2/trunk $ time curl -O http://clanmills.com/files/Reagan.tiff
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 8425k  100 8425k    0     0   787k      0  0:00:10  0:00:10 --:--:-- 1601k

real    0m10.745s
user    0m0.074s
sys    0m0.319s
531 rmills@rmillsmbp:~/gnu/exiv2/trunk $ ls -alt Reagan.tiff
-rw-r--r--+ 1 rmills staff 8628164 Oct 21 12:00 Reagan.tiff
532 rmills@rmillsmbp:~/gnu/exiv2/trunk $
