[jcifs] Curious "race" condition

Michael B Allen ioplex at gmail.com
Fri Oct 22 09:36:05 MDT 2010


On Thu, Oct 21, 2010 at 5:45 AM, André Warnier <aw at ice-sa.com> wrote:
> Hi.
> I don't know if this is the right list for this, but I figure that there are
> enough Samba experts here to maybe give me some pointers (or tell me I'm
> wrong and need to look somewhere else).
>
> Here is the issue :
>
> Program A runs on a Windows system.  It creates new files on a network
> drive, which is actually situated on a Solaris machine and shared via Samba.
> The creation sequence is as follows :
> - open the new file for output, with a name
> "//servername/sharename/xxxxx.dat.tmp" (where "xxxxx" is guaranteed to be
> unique each time)
> - write data to the file
> - close the file
> - only if no errors occurred, rename the output file from "xxxxx.dat.tmp" to
> "xxxxx.dat"
>
> At the same time, program B runs on the Solaris machine.
> It regularly scans the same (for it, local) directory, for files ending in
> ".dat".
> When it finds one, it opens it and reads it.
>
> This happens thousands of times per month without problems.
>
> But once in a great while (2-3 times per year, no more), program B reports
> an error and crashes.  The reported error leads me to believe that it finds
> a "xxxxx.dat" file that is either empty or only partially written.
> If we restart program B, it processes that same file properly.
>
> Considering the sequence of operations above, my understanding is that
> program B should never be able to find a "xxxxx.dat" file that is empty or
> partially written.

So which is it? What is the condition of the file after the failure?
Is it empty, or does it contain some of the data?

I think it is more likely that there is a code path where the file
creation step looked successful to your program A when in fact there
was some kind of network or server failure.

You could add a step to open the .tmp file a second time, seek to the
end and check that the last 16 bytes are what you wrote. Then rename it.
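
A rough sketch of that extra step, in Python only for illustration
(I don't know what program A is written in). The names publish,
tail_matches and TAIL_LEN are made up for this example, and it assumes
program A still has the written bytes in memory to compare against.
It also compares the file size, which goes slightly beyond the tail
check. Whether the flush really forces the data to the server's disk
over SMB depends on the client and server settings, so treat that part
as an assumption too.

```python
import os

TAIL_LEN = 16  # number of trailing bytes to compare


def tail_matches(path, data):
    """Reopen path and compare its size and last bytes against data."""
    expected_tail = data[-TAIL_LEN:]  # whole of data if shorter than TAIL_LEN
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if size != len(data):
            return False  # empty or truncated file
        f.seek(size - len(expected_tail))
        return f.read(len(expected_tail)) == expected_tail


def publish(tmp_path, final_path, data):
    """Write data to tmp_path, verify it, then rename to final_path.

    Returns True if the file was verified and renamed, False otherwise.
    """
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out before closing
    if not tail_matches(tmp_path, data):
        return False  # leave the .tmp file in place for inspection
    # os.replace overwrites an existing target; since the names are
    # unique each time, that should never happen here.
    os.replace(tmp_path, final_path)
    return True
```

If publish() returns False, the .tmp file is never renamed, so program B
never sees it, and program A can log the failure or retry.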

> But my question is : considering that this happens on a network share shared
> via Samba, is it possible due to some race condition or configuration issue,
> that the above may nevertheless "sometimes" occur ?

It shouldn't. But you probably should ask on the samba users list.

Mike

-- 
Michael B Allen
Java Active Directory Integration
http://www.ioplex.com/