avoiding stat() races

John E. Malmberg wb8tyw at qsl.net
Sat Nov 11 21:27:53 GMT 2000


"Kenichi Okuyama" <okuyamak at dd.iij4u.or.jp> wrote:

> >>>>> "JEM" == John E Malmberg <wb8tyw at qsl.net> writes:
> JEM> The missing word contains a serial number that contains a
> JEM> count of the times that the rest of the "inode" number has
> JEM> been used.  This was implemented in OpenVMS to possibly
> JEM> eliminate the race conditions that Okuyamak refers to.
>
> So, there already exists system. Very interesting.
>
>
> JEM> To do the compare correctly would require a macro similar
> JEM> to a memcmp() that can be defined in config.h for the platform.
>
> Since I don't know about OpenVMS, I might be mistaking, but
> according to what I remember, gcc have memcmp() as built-in function
> too. And if given length of memcmp() is fixed value, and is very
> small, gcc will inline the comparison.

For platforms where the SMB_INO_T is a natural integer size, the macro would
just define a normal compare.
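
A minimal sketch of what such a config.h macro pair could look like. The
names SMB_INO_EQUAL, SMB_INO_EQUAL_SCALAR, SMB_INO_EQUAL_BYTES and the
feature-test symbol HYPOTHETICAL_INO_T_IS_PACKED are invented for
illustration and are not taken from the Samba source:

```c
#include <string.h>

/* Platforms where the inode type is a natural integer: plain compare. */
#define SMB_INO_EQUAL_SCALAR(a, b) ((a) == (b))

/* Platforms where the inode type is an array or packed object:
 * compare the raw bytes. Both arguments must be the objects themselves,
 * not pointers that an array has decayed into, or sizeof is wrong. */
#define SMB_INO_EQUAL_BYTES(a, b) \
    (memcmp(&(a), &(b), sizeof(a)) == 0)

/* Assumed selection logic; the symbol below is hypothetical and would
 * be set by the platform's configuration. */
#ifdef HYPOTHETICAL_INO_T_IS_PACKED
#define SMB_INO_EQUAL(a, b) SMB_INO_EQUAL_BYTES(a, b)
#else
#define SMB_INO_EQUAL(a, b) SMB_INO_EQUAL_SCALAR(a, b)
#endif
```

Callers would then write SMB_INO_EQUAL(st1.st_ino, st2.st_ino) instead of
comparing with == directly, so the platform decides how the compare is done.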

Not all gcc variants have the ability to inline standard functions.  GCC for
VAX is notorious for adding in function calls for things you would expect to
be done with inline code, particularly when there are assembly instructions
to do the task.

> So, if you're having smart enough compiler, calling memcmp() will
> not require any big overhead. Just by making field bigger, and
> compare them using memcmp(), will give more trustable status cache
> against file system.

The current DEC C compiler for ALPHA can do this.  There are also VAX and
ALPHA specific macros that could be built into the config.h to accomplish
the same task.

> ( If I understand your explain right, this timestamp will
>   tell 'validness'. It will truly give you that cache is valid,
>   while unix's time stamp will only tell about 'invalidness'.
>   'it was not proven to be invalid' only means "we don't know" )

It is not a timestamp, just a serial number for what would be the equivalent
of the UNIX inode.

> # or, how about making SMB_INO_T as 64bit, and convert the time
> # value from 48bits to 64bits. I think this also well work good, for
> # we will not control timestamp of file anyway... by the way, is
> # that 48bit file timestamp unsigned? or signed??  I'm interested in
> # design of the system.

The ino_t on OpenVMS contains the following:

Relative volume number, in case more than one physical disk is bound into a
volume set.

Starting block in the master bitmap index where the file information starts.

Revision number of how many times that starting block in the master bitmap
index has been used.
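
As a rough illustration of why the revision number matters for a cache,
here is a sketch that models the three parts as 16-bit fields. The type
name, field names, field order, and the even 16/16/16 split are
assumptions made for the example, not the actual OpenVMS definition:

```c
#include <stdint.h>

/* Illustrative model only; not the real OpenVMS ino_t layout. */
typedef struct {
    uint16_t rel_volume;   /* relative volume number in the volume set */
    uint16_t index_block;  /* starting block in the master bitmap index */
    uint16_t revision;     /* how many times that block has been used */
} vms_file_id;

/* Two IDs name the same file only if all three parts match. If the
 * revision were dropped, a deleted file and a new file that reuses the
 * same index block would look identical to a stat cache. */
static int vms_file_id_same(const vms_file_id *a, const vms_file_id *b)
{
    return a->rel_volume == b->rel_volume &&
           a->index_block == b->index_block &&
           a->revision == b->revision;
}
```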

> ## If it was signed, first make 32bit signed value from first 2 words.
> ## Then, expand it to 64bit signed value, then shift 16bit to left,
> ## and let last 1 word fit to the lsb 16bits.
> ## If it was unsigned, story is even more simpler.

It is not only unsigned, it is a packed structure.

The problem with expanding it to 64 bits is that it is frequently embedded
in structures, most prominently the stat_t structure.

The underlying routines that fill in the values expect 48 bits there.
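
For illustration only, one way to get a 64-bit value without touching the
platform structures would be to build a separate key from a copy of the
three words. This is a sketch under the assumption that the value can be
read as three unsigned 16-bit words; the function name ino48_to_key and the
word order are hypothetical:

```c
#include <stdint.h>

/* Pack three unsigned 16-bit words into the low 48 bits of a 64-bit
 * key. The source words are only read, so the 48-bit field inside the
 * platform's own structures stays exactly as the system routines
 * expect it. */
static uint64_t ino48_to_key(const unsigned short words[3])
{
    return ((uint64_t)(words[0] & 0xFFFFu) << 32) |
           ((uint64_t)(words[1] & 0xFFFFu) << 16) |
            (uint64_t)(words[2] & 0xFFFFu);
}
```

Because the fields are unsigned, no sign extension is involved; the top 16
bits of the key are always zero.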

-John
wb8tyw at qsl.net






