samba 3.0.20 vs. 3.0.14a dbench failure

Steve French smfrench at austin.rr.com
Sun Aug 21 23:19:52 GMT 2005


I am seeing a possible problem with Samba 3.0.20 under stress.

Running dbench (70 processes, current Linux cifs mounted against smbd 
on localhost) against Samba 3.0.20 smbd failed with a few read failures 
(I did not capture debug information to see what the failures were) and 
also what appear to be unexpected sharing violations during the cleanup 
phase. The same test works on the same system when running smbd from 
Samba 3.0.14a (instead of 3.0.20). I also tried with smbfs, which seemed 
to have problems during the cleanup phase. I also saw one run against 
3.0.20 in which a command timed out (ReadX, I think), which would 
indicate a response taking longer than 15 seconds, something I had not 
seen against 3.0.14a.

This is not enough information to point to the exact failure, but 
snapshots of the mid queue taken at random intervals on the client side 
(with cifs vfs mounted to 3.0.20) show various in-flight requests which 
had already taken longer than 2 seconds, and some even longer than 
4 seconds, which is far longer than I would expect. These slow 
responses seem to come in bursts, with more than one slow response 
perhaps holding up others. The requests that I have logged taking more 
than 4 seconds against Samba 3.0.20 are ReadX (typically for 16K), 
OpenX, and Transact2 (I do not know which subcommand). These tests were 
run against localhost on a 1.2GHz Pentium 3 system, which should be able 
to handle 70 processes for dbench (and has done fine with Samba 3.0.14a).
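
To make the snapshot check above concrete, here is a rough sketch of the 
idea in Python. This is not the cifs code: the InFlight record, its 
field names, and the sample values are made up for illustration. Only 
the 2 and 4 second thresholds come from what I observed.

```python
from dataclasses import dataclass


@dataclass
class InFlight:
    # Hypothetical stand-in for one entry in the client's mid queue.
    mid: int        # multiplex id of the outstanding request
    command: str    # e.g. "ReadX", "OpenX", "Transact2"
    sent_at: float  # time the request was sent, in seconds


def slow_requests(queue, now, threshold):
    """Return (command, age) for requests outstanding longer than threshold seconds."""
    return [
        (req.command, now - req.sent_at)
        for req in queue
        if now - req.sent_at > threshold
    ]


# Made-up snapshot of the queue, taken at t = 10.0 seconds.
snapshot = [
    InFlight(1, "ReadX", 5.5),
    InFlight(2, "OpenX", 7.0),
    InFlight(3, "Transact2", 9.5),
]
over_2s = slow_requests(snapshot, now=10.0, threshold=2.0)
over_4s = slow_requests(snapshot, now=10.0, threshold=4.0)
```

Taking such a snapshot repeatedly and comparing the two lists is what 
makes the bursts of slow responses visible.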

I do not have enough information yet, but it would really help if there 
were a way to log response time averages and the high water mark 
response time for various types of SMBs on the smbd server side 
(ethereal can do this, but it might affect the response times). Any 
ideas how to get what smbd thinks the response times for the various 
request types are?
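
The kind of bookkeeping I have in mind is roughly the following, 
sketched in Python rather than as an smbd patch. The ResponseStats 
class and its method names are hypothetical and are not an existing 
Samba interface. It keeps a count, a running total, and a high water 
mark per request type.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class ResponseStats:
    """Per-command response time statistics (hypothetical helper)."""

    def __init__(self):
        # command -> [count, total seconds, high water mark in seconds]
        self._stats = defaultdict(lambda: [0, 0.0, 0.0])

    def record(self, command, elapsed):
        """Account for one completed request of the given type."""
        entry = self._stats[command]
        entry[0] += 1
        entry[1] += elapsed
        if elapsed > entry[2]:
            entry[2] = elapsed

    @contextmanager
    def timed(self, command):
        """Time the enclosed block and record it under command."""
        start = time.monotonic()
        try:
            yield
        finally:
            self.record(command, time.monotonic() - start)

    def summary(self):
        """Return {command: (count, average seconds, high water mark)}."""
        return {
            cmd: (count, total / count, high)
            for cmd, (count, total, high) in self._stats.items()
        }


stats = ResponseStats()
stats.record("ReadX", 0.25)
stats.record("ReadX", 4.75)
with stats.timed("OpenX"):
    pass  # the request handler would run here
report = stats.summary()
```

Keeping only three numbers per request type should be cheap enough that 
the logging itself does not distort the response times being measured, 
though that is an assumption I have not tested.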


More information about the samba-technical mailing list