[Samba] Netware CIFS nlm - linux samba

Glen Davison glen at maths.unsw.edu.au
Wed Sep 10 07:49:54 GMT 2003


I posted this question on 15/8/03 - almost a month ago.  Since I've had no 
response, I assume that very few people have seen the problem.  So I'll 
describe what we have discovered in the meantime.

Here's the original post:
-------------------------------------------------------------------------------------------
From: Glen Davison (glen at maths.unsw.edu.au)
Subject: [Samba] Netware CIFS nlm - linux samba 
Date: 2003-08-15 01:40:06 PST 

Dear Gurus,

We're having bizarre problems/behaviour.  Admittedly we have an unusual 
set-up:

 - users on linux desktops (RedHat/KDE) mounting files over SMB using 
samba-2.2.5-10 -client and -common rpms. 
 - files are on a SAN, clustered behind 2 netware servers (6.5), which run the 
cifs.nlm (netware guy has gone home - can't tell you the version just now)

Files are spontaneously changing modification timestamps - anywhere between 
about 1930 and 2040 (AD, not hours).  Or more likely *presenting* those 
timestamps most of the time, though now and then the correct timestamp will 
swim into view briefly.

example:
[glen@haiku glen]$ ls -l home/test333*
-rwxr-xr-x    1 glen     users           0 Aug 12 16:44 home/test3332
-rwxr-xr-x    1 glen     users          11 Aug 12 16:43 home/test3333

[glen@haiku glen]$ ls -l home/test3333
-rwxr-xr-x    1 glen     users          11 Sep  2  1992 home/test3333

[glen@haiku glen]$ ls -l home/test3332
-rwxr-xr-x    1 glen     users           0 Oct 26  1992 home/test3332

[home is the mount-point, or rather a symlink down thru the mount a little 
way]

Most newly created files seem to have the problem straight away.  (But the 
bulk of the files were rsync'd across from a Tru64 filesystem a month ago)

We have tried versions 2.2.8a and 3.0.0beta of smbclient / smbmount.  2.2.8a 
behaved the same.  3.0.0beta started with promising results, but it eventually 
did the same timestamp trick (maybe less frequently??), and it also dies 
somehow after about 30 mins and has to be remounted.

We believe we have narrowed this behaviour down to only linux samba clients 
talking to the netware cifs nlm.

To add to the pot: we have also had a handful of files apparently change 
filename spontaneously, so that they start with '..'.  In most cases, they 
started as .xyz and became ..xyz.  The only processes which touched those 
files should have been reads - no writes.  This may be a red herring - may 
not be samba-related.

The timestamp issue is wide-spread, the filename problem is rare.

So, has anyone seen anything like this?  Can you explain what causes it? 
 And is there a solution?


TIA
Glen
-------------------------------------------------------------------------------

What we have been able to discover since then, by experiment, research and 
guesswork, follows.  Note: a lot of this was worked out by a colleague, and 
about half of this email is lifted from that colleague's notes.

It seems that netware NSS stores modification (& other?) time-stamps in the 
directory-file (the file which *is* the containing directory) like windows 
does, whereas unix filesystems store this info in the file's inode.  When the 
CIFS nlm on netware receives a file query, it looks at the file itself, which 
doesn't have the timestamp, and hence it returns some sort of 
bogus/semi-random/null result, and we see the stupid timestamp.

But if samba (or cifs.nlm) gets a directory query, then a file query, within a 
time-window smaller than the time that it caches results for (1 second by 
default I think) then the file query gets the correct time, remembered from 
the directory query.

This explains the behaviour seen above - `ls -l x*` does a directory query for 
the glob expansion, then a file query on each resulting file, and hence the
correct time!

More examples of successes and failures:
ls -l file*                          -- correct timestamp
ls -l file1                          -- wrong timestamp
ls -l $(echo file*)                  -- correct timestamp
ls > /dev/null; ls -l file1          -- correct timestamp
ls > /dev/null; sleep 2; ls -l file1 -- random timestamp

Extending smbmount's ttl (length of cache) option is no cure; the results are 
obviously poor:
-bash-2.05b# ls -l ?
-rwx------    1 root     root            6 Aug 20 16:10 x
-bash-2.05b# ls -l x
-rwx------    1 root     root            6 Aug 20 16:10 x
-bash-2.05b# touch x
-bash-2.05b# ls -l x
-rwx------    1 root     root            6 Nov 24  1922 x

So here a file was modified, the kernel smb cache forgot the
timestamp (because of the modification), and another random
timestamp was assigned.  This is not good at all.

Further, when first you mount a file system and try to access a file,
you may well get the message that the file does not exist!

In C, unix uses the stat() system call for the file query.  To get correct 
timestamps, do an opendir() and readdir() beforehand.  Attached is C source to 
see the difference.  The opendir code is currently commented out, so it will 
give bad timestamps as is.
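
For illustration, here is a minimal sketch of the same idea.  It is not the 
attached source: the function names prime_dir and stat_mtime are made up for 
this example, and it assumes that a full directory scan immediately before 
stat() is enough to land inside the cache window described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: read every entry of dir so that the client has
 * just issued a directory query.  The entries themselves are discarded. */
int prime_dir(const char *dir)
{
    DIR *dh = opendir(dir);

    if (dh == NULL)
        return -1;
    while (readdir(dh) != NULL)
        ;   /* only the side effect of the scan matters */
    closedir(dh);
    return 0;
}

/* Hypothetical helper: stat() dir/name and store its mtime in *mtime.
 * If prime is non-zero, scan the directory first.  Returns 0 on
 * success and -1 on any error. */
int stat_mtime(const char *dir, const char *name, int prime, time_t *mtime)
{
    char path[4096];
    struct stat st;
    int n;

    assert(dir != NULL && name != NULL && mtime != NULL);

    n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || n >= (int)sizeof path)
        return -1;                  /* path would not fit */
    if (prime && prime_dir(dir) != 0)
        return -1;
    if (stat(path, &st) != 0)
        return -1;
    *mtime = st.st_mtime;
    return 0;
}
```

Calling stat_mtime(dir, name, 0, &t) and then stat_mtime(dir, name, 1, &t) on 
the same file, and printing both values with ctime(), should show the 
difference on an affected mount.  On a local disk the two values are simply 
equal.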

To get correct NSS timestamps in perl:
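local *DH;
opendir(DH, $dir) or die "opendir $dir: $!";
$ff = readdir(DH);
# make sure this happens within ttl (1 sec) of opendir:
# (the parentheses are needed to take element 9 of lstat's result list)
$mtime = (lstat("$dir/$ff"))[9];
closedir(DH);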

Unfortunately, once we worked out how to see the true mod. timestamps, we 
discovered that there is a SECOND timestamp issue: when a file on NSS is 
touched in some way from linux thru samba & CIFS, the 'true' (NSS) timestamp 
can sometimes change in a random way!  [This may even happen when the file 
has not been touched!!  Circumstantial evidence points to this happening to 
some files which haven't been written to at all!]

I'm guessing this is also related to the directory file vs inode 
incompatibility, but can't provide much detail.  I do know that doing a touch 
on the same file any number of times within about 1 hour keeps giving the 
same NSS timestamp.  Whether that's because it keeps mangling the timestamp 
in the same way, or because it loses interest in changing the dir-file 
(another caching issue perhaps?) I don't know.  It seems to happen to about 1 
file in 20.  Later on, the same files will touch correctly, and a different 1 
in 20 files will misbehave.
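
One rough way to put a number on that "1 file in 20" would be a survey like 
the sketch below.  It is only an illustration under assumptions: the function 
name count_bad_touches is hypothetical, the tolerance has to be wide enough to 
cover any clock difference between client and server, and a file that the 
caller is not allowed to touch will be counted as bad too.  It rewrites the 
timestamps of every regular file in the directory, so it is only suitable for 
a scratch directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <utime.h>

/* Hypothetical survey: touch every regular file in dir, then re-read the
 * directory and stat each entry right after readdir() returns it.
 * Returns the number of files whose mtime is more than tolerance seconds
 * away from the end of the touch pass, or -1 if dir cannot be opened.
 * *checked receives the number of regular files examined. */
int count_bad_touches(const char *dir, long tolerance, int *checked)
{
    char path[4096];
    struct dirent *de;
    struct stat st;
    DIR *dh;
    time_t now;
    int n, bad = 0;

    assert(dir != NULL && checked != NULL);
    *checked = 0;

    /* Pass 1: set atime and mtime of each regular file to the current time. */
    if ((dh = opendir(dir)) == NULL)
        return -1;
    while ((de = readdir(dh)) != NULL) {
        n = snprintf(path, sizeof path, "%s/%s", dir, de->d_name);
        if (n < 0 || n >= (int)sizeof path)
            continue;               /* skip names that do not fit */
        if (stat(path, &st) == 0 && S_ISREG(st.st_mode))
            utime(path, NULL);      /* failures show up as "bad" in pass 2 */
    }
    closedir(dh);
    now = time(NULL);

    /* Pass 2: directory query, then the file query straight away. */
    if ((dh = opendir(dir)) == NULL)
        return -1;
    while ((de = readdir(dh)) != NULL) {
        n = snprintf(path, sizeof path, "%s/%s", dir, de->d_name);
        if (n < 0 || n >= (int)sizeof path)
            continue;
        if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
            continue;
        (*checked)++;
        if (labs((long)difftime(st.st_mtime, now)) > tolerance)
            bad++;
    }
    closedir(dh);
    return bad;
}
```

Running it a few times, some minutes apart, and comparing the ratio of the 
return value to *checked would show whether the misbehaving fraction really 
stays near 1 in 20.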

In the scenario where smbd is file-serving from linux, and linux clients are 
smbmounting it, this problem doesn't seem to occur.  
<wild speculation mode> Our best guess is that samba finds some way to squeeze 
the proper inode info, like timestamps, thru the CIFS protocol.  Perhaps only 
if it knows that the client is linux/samba too. </spec...>

I hope this might be useful to others at some point.  We are almost certainly 
going to go a completely different way to solve our problems, anyway.

Glen
:)

-- 
Glen Davison			glen at maths.unsw.edu.au
Computer System Administrator	phone: +61 2 9385 7018
Maths, UNSW			fax:   +61 2 9385 7192

