avoiding stat() races (Was: RE: Samba login)
Cole, Timothy D.
timothy_d_cole at md.northgrum.com
Fri Nov 10 16:54:00 GMT 2000
> -----Original Message-----
> From: Kenichi Okuyama [SMTP:okuyamak at dd.iij4u.or.jp]
> Sent: Thursday, November 09, 2000 21:54
> To: samba-technical at samba.org
> Subject: Re: avoiding stat() races (Was: RE: Samba login)
> Dear Timothy,
> >>>>> "CTD" == Cole, Timothy D <timothy_d_cole at md.northgrum.com> writes:
> CTD> The essential problem here is that the verification of the statcache
> CTD> entry and the intended action really ought to be atomic (as you point
> CTD> -- but to do that, the statcache is going to need to "know" what
> CTD> needs to be done with the file once it is found.
> Why try to keep the interface so small?
> "statcache" is really some sort of OBJECT. It has data of its
> own, and so has multiple ways of accessing it. What you're
> trying to do is give a parameter to the object so that the interface
> will call a different method. That's what you're really doing.
This is a good point; I was thinking of the problem inside-out.
> and let each parameter be same as that of dos_open etc. that we're
> using now ( well, you can add one extra parameter, "pointer to
> statcache" as means of object, if you wish ).
Okay, so something to the effect of:
scache *scache_new(int flags);
int scache_open(scache *cache, const char *path, int flags, int
int scache_openi(scache *cache, const char *path, int flags, int
int scache_stat(scache *cache, const char *path, SMB_STAT_STRUCT
int scache_stati(scache *cache, const char *path, SMB_STAT_STRUCT
void scache_close(int fd);
void scache_destroy(scache *cache);
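
One way to read the "find and act atomically" requirement, as a minimal
sketch: open the file first, then take the stat information from the
resulting descriptor with fstat(), so that both describe the same object
even if the path is renamed or replaced in between. The name
sketch_open_and_stat is hypothetical and is not one of the scache_*
functions listed above; only the POSIX calls open(), fstat() and close()
are assumed.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open `path` and fill `st` from the descriptor.
 * Returns the open descriptor, or -1 with errno set on failure. */
int sketch_open_and_stat(const char *path, int flags, mode_t mode,
                         struct stat *st)
{
    int fd = open(path, flags, mode);
    if (fd < 0)
        return -1;

    /* fstat() works on the descriptor, not the path, so the result
     * cannot refer to a different file than the one just opened. */
    if (fstat(fd, st) != 0) {
        int saved = errno;   /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

A cache built this way would hand out the descriptor and the stat data
together, rather than validating a path and acting on it in two steps.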
> Let "StatCache" take care of all the validity. Let it only return
> valid information (like a valid file descriptor, valid stat
> information, etc.). StatCache can now do lazy closing, share
> information among processes, etc., without affecting what's outside.
> CTD> Incidentally, regarding the need you indicated for increased
> CTD> resolution timestamps -- increasing timestamp resolution would only
> CTD> serve to "shrink" the window wherein the stat information can be
> CTD> erroneously identified as still valid.
> You should rather say that the current timestamp only serves to give you
> information about "INVALIDNESS", like a hash function.
I ... think that's what I said, isn't it? Mmm.. wait, we're looking
at 'valid' from different directions. Maybe 'not stale' would have been
better than 'valid' in this case.
> CTD> Since this would be mucking about with the kernel and filesystem
> CTD> layout anyway, I think an e.g. 32-bit "generation count" (not in the
> CTD> sense) on the inode, incremented with every modification would be a
> CTD> preferable (although still not ideal) solution.
> Well, what I meant by pico-sec is the same thing. If you have accuracy of
> pico-seconds, and if access to the file is being serialized somewhere,
> and as long as we do not have HDDs accessible at Peta-Hz order, we'll not
> have the same value for any two accesses, at least for changes.
Eh, I still like the idea of having a generation count on files,
really. Even with picosecond timestamps and adequate serialization, the
method used to generate the timestamps (particularly in the picosecond
realm) may not be quite that accurate _or_ necessarily properly synchronized
-- consider weird situations with multiple.
It is academic to some extent, though, since once you get down to
picoseconds the window for races is small enough that they become impossibly
> What I believe is, that we should have 256bits for timestamp. 128
> for describing over dot seconds, 128bit for under dot second. If
> system time does not have accuracy of 128bits, like ... 30 bits for
> example ... use 128-30=98bits for reference counter within that time
This is reasonable.
I still think keeping a separate 'reference counter' regardless of
the available time precision would be preferable. It's a nice hedge against
access times passing timestamp resolution (or more important, accuracy),
which _will_ keep happening.
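
To illustrate the hedge, here is a minimal sketch assuming a hypothetical
in-memory "inode" that carries both a one-second mtime and a 32-bit
generation count. No existing kernel or filesystem interface is implied;
all of the sketch_* names are made up for this example. Two modifications
inside the same timestamp tick leave the mtime unchanged, but still bump
the counter.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical file metadata: timestamp plus modification counter. */
struct sketch_inode {
    time_t   mtime;       /* coarse timestamp (one-second ticks here) */
    uint32_t generation;  /* incremented on every modification */
};

/* What a cache would remember about the file at lookup time. */
struct sketch_snapshot {
    time_t   mtime;
    uint32_t generation;
};

/* Simulate a modification happening at time `now`. */
void sketch_modify(struct sketch_inode *ino, time_t now)
{
    ino->mtime = now;
    ino->generation++;    /* unsigned, so wraparound is well defined */
}

struct sketch_snapshot sketch_capture(const struct sketch_inode *ino)
{
    struct sketch_snapshot snap = { ino->mtime, ino->generation };
    return snap;
}

/* Staleness as judged by timestamp alone. */
int sketch_stale_by_mtime(const struct sketch_snapshot *snap,
                          const struct sketch_inode *ino)
{
    return snap->mtime != ino->mtime;
}

/* Staleness as judged by the generation count. */
int sketch_stale_by_generation(const struct sketch_snapshot *snap,
                               const struct sketch_inode *ino)
{
    return snap->generation != ino->generation;
}
```

If a snapshot is captured and the file is then modified again within the
same second, sketch_stale_by_mtime() reports the entry as fresh while
sketch_stale_by_generation() reports it as stale.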
> #32bits was not enough for over dot seconds. nano-sec is within our
> # hand. So, we need at least 64 bits for over dots, and 64bits for
> # under dots, this is minimum. Biggest problem is that,
> # accuracy of time is increasing in 10bits every 15 year or so.
> # ( not as accurate as moore's law though )
> # So, if we are to face that fact, 128bit as total for time
> # treatment is not enough.
Well, that's the thing, though. A 'generation count' would be
less sensitive to increases in timekeeping precision -- a 32 bit generation
count is probably enough for the century, at least.
Even in a fast, fairly heavily used system, it seems to me that 2^32
operations on a given file would take a considerable amount of time;
certainly longer than whatever the unit of timestamp resolution is.
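
As a rough back-of-the-envelope check of that claim, the sketch below
computes how long 2^32 modifications of a single file would take at a
given sustained rate. The rate is purely an assumption chosen for
illustration, and the function names are hypothetical.

```c
#include <stdint.h>

/* Seconds needed for 2^32 modifications at `ops_per_second`.
 * Returns -1.0 for a non-positive rate. */
double sketch_seconds_to_wrap(double ops_per_second)
{
    const double wrap = 4294967296.0;   /* 2^32 */
    if (ops_per_second <= 0.0)
        return -1.0;
    return wrap / ops_per_second;
}

/* Same figure expressed in days (86400 seconds per day). */
double sketch_days_to_wrap(double ops_per_second)
{
    double seconds = sketch_seconds_to_wrap(ops_per_second);
    if (seconds < 0.0)
        return -1.0;
    return seconds / 86400.0;
}
```

At an assumed 1000 modifications per second to one file, the counter
takes about 4.29 million seconds (roughly 49.7 days) to come back around,
and a cached entry would only be misjudged if the count advanced by an
exact multiple of 2^32 between capture and check.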