Failed to rename file to
|Status:||Assigned||Start date:||06 Sep 2016|
|Assignee:||Robin Mills||% Done:|
I have a small problem creating JPG files and then immediately updating their metadata. That is, I first create the image file and then update the metadata. I do this in a loop, processing 100 or more files. At some point (unfortunately, at a random file), the resulting file size is zero. I found this issue in the forum:
While my version already has all patches, the problem continues. I captured the exception, and this is the message:
ABC.JPG7888: Failed to rename file to ABC.JPG. The created file is ABC.JPG, but I think the library adds a temporary suffix and later renames the file back to ABC.JPG, and that may be where the problem lies. It is possible that ABC.JPG was not deleted at all because of the antivirus or something else. I am working with large files (for example, 20k*20k image resolution).
If you have any suggestions, I can try something. For example, use copy (with the replace option) instead of rename, then delete the temp file. But it seems I would have to read the code in depth to know where to put it without affecting other functionality. That scares me a bit.
Thanks in advance
#1 Updated by Robin Mills over 2 years ago
- Status changed from New to Assigned
- Assignee set to Robin Mills
- Target version set to 0.28
There's not much information here with which to make progress. When this matter was discussed in 2014 we didn't really solve it then and I haven't heard from George for some time. There were other changes made around the same time that are relevant: #984
1 Platform, build and version?
2 Have you tried the recent trunk builds available here: http://exiv2.dyndns.org:8080/userContent/builds/Categorized/
3 Have you run this with/without your virus checker running?
4 Are you using the exiv2.exe application or something you have written?
5 Are your files and temporary files on local storage or a network device?
6 Are your files and temporary files on the same storage (or network) device?
7 Is your application multi-threaded?
Let me guess that you are using Windows and building with MSVC. There's important code in the function FileIo::transfer(BasicIo& src) in src/basicio.cpp concerned with renaming files. We're going to have to get that instrumented (peppered with trace/output code) to understand this issue.
You've seen the previous discussion. It's very difficult to make progress with an intermittent fault that I cannot reproduce. I think it's going to take a while to get this properly identified and isolated.
#2 Updated by Ritz Ahead over 2 years ago
Thanks for your feedback. Yes, I think it is hard to reproduce. I cannot reproduce it on my machine; it happens on a remote machine. They are using a SAN (something related to a network filesystem). They also disabled the antivirus and the problem persists. I think it can be solved by using copy instead of rename.
I am using exiv2-0.25, via the generated DLL (libexiv2.dll). My app is multi-threaded, but one thread works with the files while the other does nothing (it just waits for the stop button to be pressed). The platform is Windows (I don't know the version because the machine is not mine; I can ask later), and the code was compiled with Visual Studio 2015 C++.
#3 Updated by Robin Mills over 2 years ago
ABC.JPG7888: Failed to rename file to ABC.JPG
Yes. The 7888 is the process ID. To minimise the risk of corrupting your file, the library makes a temporary copy using the file name and the process ID. That copy is updated and if everything's fine, your original file is deleted and the new copy renamed (or something like that).
Something in your system is occasionally breaking this approach. Virus checkers are an obvious possibility. They monitor the file system for changes and go to inspect new files. They can lock the file in the process. If you have a cloud sharing system such as Dropbox monitoring your files, that could be another culprit. There are very good file system analysis tools for Windows from SysInternals.com (now part of Microsoft). FileMon.exe monitors one (or more) directories and logs the file I/O being performed by the operating system, including the requesting process. It might be quite easy to identify the bandit.
Another way to deal with this is to do your image processing in local storage on your computer and push the file to the server when you are done. You'll have a lot more control of events and tools.
A 20k*20k file is a whopper. Maybe your script has to be defensive. Before you update any file, make a copy. If exiv2 hammers your file, sleep for 10 seconds, restore the original and try again. I know that's not a fix - however it enables you to safely process your images AND you can log these events for analysis.
You're going to need a lot of patience. Intermittent faults are usually difficult to isolate.
#4 Updated by Niels Kristian Bech Jensen over 2 years ago
It seems to me that this issue, issue #1227 and this bug report from UFRaw: https://sourceforge.net/p/ufraw/bugs/409/ might be related. The common denominator seems to be spawning processes/multi-threading on the Windows platform. Could it be some sort of file lock mechanism playing havoc? I do not know anything about the Windows internals.
Niels Kristian Bech Jensen
#5 Updated by Robin Mills over 2 years ago
When I investigated #1207 for Gilles on MacOS-X, he asked me to use "a lot more than 3 threads". So I did. I got messages from the OS that I've never seen before about "resource not available". I've subsequently learned that MacOS-X has limits of about 256 open files per process. Those limits can be changed by the user, however that is not a useful fix. It's important that multi-threaded applications control/limit the number of threads opening files. I think we'll learn that Windows requires similar treatment.
The Windows platform is also challenged by invisible processes such as Virus Checkers, Dropbox and other agents who may be locking the files. We are not alone. Something's going on.
Regrettably, I am flat out at present getting finished with v0.26. The test program samples/mt-test.cpp may be a useful starting point from which to investigate this issue.
#8 Updated by Robin Mills over 2 years ago
Thanks for the update.
As you know the world is not homogenous. I'm a compiler/interpreter engineer, so I like things to be engineered to work with 100% accuracy and every fault to be 100% reproducible. However, this issue is more like medicine. You take a pill and if you feel better you say "ah that fixed it". We will get to the bottom of this when the issue moves from "randomly reproducible on a user's machine" to "frequently happens on our buildserver". We have a project in the plan for v0.27 to improve our raw image support and testing. We have ExifTool's 7000 test files and my own 60,000 digital photos and there's a website in Switzerland with thousands of raw images from many camera manufacturers. We'll get the buildserver to work with them for 2 or 3 hours every day on Mac/Linux/Windows. We'll smoke him out.