avoiding stat() races

John E. Malmberg wb8tyw at qsl.net
Sat Nov 11 19:13:47 GMT 2000


I have been following this for a bit, but not too closely, as this issue
also affects OpenVMS quite a bit.

In OpenVMS, the ino_t is a 48 bit value, normally expressed as three 16 bit
words.

Because this does not map well to the existing SAMBA UNIX code, I made
adjustments so that only the first 32 bits are compared.  This seems to be
good enough.

The missing word holds a serial number that counts the times that the rest
of the "inode" number has been reused.  This was implemented in OpenVMS to
possibly eliminate the race conditions that Okuyamak refers to.

To do the compare correctly would require a macro similar to a memcmp() that
can be defined in config.h for the platform.
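
A minimal sketch of what such a macro might look like, assuming the 48 bit
value is held as an array of three 16 bit words.  The names vms_ino_t,
INO_EQUAL, INO_EQUAL32 and ino_same_file are hypothetical; they are not taken
from the SAMBA source or from the OpenVMS headers, and the position of the
serial number word is an assumption as well.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 48-bit file identifier: three 16-bit words.
 * Assumption: the last word is the reuse serial number described above. */
typedef struct {
    uint16_t word[3];
} vms_ino_t;

/* What a platform's config.h might define: compare all 48 bits. */
#define INO_EQUAL(a, b) \
    (memcmp((a)->word, (b)->word, sizeof((a)->word)) == 0)

/* The workaround described above: compare only the first 32 bits. */
#define INO_EQUAL32(a, b) \
    (memcmp((a)->word, (b)->word, 2 * sizeof(uint16_t)) == 0)

/* Convenience wrapper so callers need not know the representation. */
static int ino_same_file(const vms_ino_t *a, const vms_ino_t *b)
{
    assert(a != NULL && b != NULL);
    return INO_EQUAL(a, b);
}
```

With this layout, two identifiers that differ only in the last word compare
equal under INO_EQUAL32 but not under INO_EQUAL, which is exactly the case
of a reused "inode" number.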

I do not know if the use of VFS would affect this; I have not looked into
it.

-John
wb8tyw at qsl.net


----- Original Message -----
From: <okuyamak at dd.iij4u.or.jp>
> Dear Timothy
>
> >>>>> "CTD" == Cole, Timothy D <timothy_d_cole at md.northgrum.com> writes:
> >>
> CTD> This is reasonable.
> CTD> I still think keeping a separate 'reference counter' regardless of
> CTD> the available time precision would be preferable.  It's a nice hedge
> CTD> against access times passing timestamp resolution (or more important,
> CTD> accuracy), which _will_ keep happening.
>
> ... Maybe we are talking about slightly different "REFERENCE COUNTER".
>
> You are talking about 'reference counter per file', individual one, right?
>
> I'm talking about a 'reference counter per system': a single global
> reference counter for one entire operating system. Any action against
> the filesystem will increase the reference counter, until the 'time' changes.
>
> Let's think about this example case:
>
> Suppose we have a system which manages timestamps with 32 bits.
>
> You made a file named './afo' at time 0x00001111.
> Then you changed './afo' while we are still at time 0x00001111.
> Currently we have no way of finding out from the timestamp whether
> './afo' has changed or not.
>
> Let's add a 'reference counter per file' to the system.
> Now we can find out that the first './afo' has time 0x00001111 and
> counter 0x00000000. The second './afo' has time 0x00001111 and counter
> 0x00000001.
>
> But what will happen if you
> 1) create './afo'
> 2) delete './afo'
> 3) create './afo' again
> within the same timestamp? And, unluckily, the system assigned the
> same i-node in the 1st step and the 3rd step ( this can happen, though it
> is a very rare case ). The only clues we have are the timestamp and the
> reference counter.
>
> 1) create './afo' : time = 0x00001111, counter = 0x00000000
> 2) delete './afo'
>   < we lost all information about ./afo now >
> 3) create './afo' : time = 0x00001111, counter = 0x00000000
>
> Now we have no way of finding the difference between the 1st and 3rd './afo'.
>
>
> If we add a 'reference counter per system' instead, the story differs.
> The system will count up the reference counter while we're in the same
> timestamp. Now we'll have
>
> 1) create './afo' : time = 0x00001111, counter = 0x00000000
> 2) delete './afo'
>   < this action was counted as 0x00000001 >
> 3) create './afo' : time = 0x00001111, counter = 0x00000002
>
> As a result, for the 3rd change, we'll get a reference counter different
> from that of the 1st file.
>
> This way, we can keep the 'order' of filesystem manipulations in
> the timestamp. Every file has the correct change order.
>
> And once this is kept, we can merge the timestamp and the reference counter
> into one field, so that any comparison works correctly, regardless of
> timestamp accuracy, without deep thinking.
>
>
> If I remember right, this was first found by..... I'm sorry I forgot
> the name, the person who made LaTeX ( Lamport ... ? ).
>
> best regards,
> ----
> Kenichi Okuyama at Tokyo Research Lab. IBM-Japan, Co.
>
>





More information about the samba-technical mailing list