technique for logging (on the client) slow server responses
sfrench at us.ibm.com
Thu Oct 20 18:10:40 GMT 2005
Some of you may find this technique useful for finding out when the server
is sending slow responses under load - and on which requests.
I added the corresponding code in the Linux cifs client version 1.38 source
and tried to make it reasonably easy to enable for people doing performance
analysis of the client or server or network.
If the client is compiled with CONFIG_CIFS_STATS (one of the kernel
configure options in the cifs kconfig menu) and if you enable the
additional checks at compile time by defining CONFIG_CIFS_STATS2
(e.g. at the beginning of a cifs header such as fs/cifs/cifsglob.h), then
the following extra information will be displayed in
/proc/fs/cifs/DebugData for each server you are mounted to:
1) the number of application threads "In Send" (i.e. requests that the
client is currently trying to write to the tcp socket) - if this number is
high, it indicates that the tcp socket is not accepting data fast enough.
2) the number of application threads "In MaxReq Wait" - if this number is
frequently non-zero, consider increasing the "cifs_max_pending" parm when
you load the cifs module on the client (assuming your server, like Samba,
can handle more than the default of 50 simultaneous requests).
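Putting those two steps together, a rough sketch of the workflow might look
like the following (the cifs_max_pending value of 256 is just an
illustrative guess; pick something your server can actually handle):

```shell
# Sketch: enable the extra checks and raise the pending-request ceiling.
# Assumes CONFIG_CIFS_STATS is already set in the kernel config.

# 1) Add "#define CONFIG_CIFS_STATS2" near the top of fs/cifs/cifsglob.h,
#    then rebuild and reinstall the cifs module.

# 2) Load cifs with a larger pending-request ceiling (default is 50):
modprobe cifs cifs_max_pending=256

# 3) Watch the per-server counters while the workload runs:
cat /proc/fs/cifs/DebugData
```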
In addition there is new data that can appear in dmesg which I am finding
useful as I try to figure out what is causing Samba to respond a little
slower than I expected in loopback configurations.
With CONFIG_CIFS_STATS2 enabled, cifs will log a debug message when a
response to a cifs request takes longer than 1 second (locking requests are
not logged as it is ok for them to block on the server). The entry is
of the form
"CIFS slow rsp: cmd <smb command code> mid <smb multiplex id>
A: <time since the request began on the client>
S: <time since the socket write returned, indicating that the
request was actually sent to the server>
R: <time since the response was received by the demultiplex thread>"
The times are expressed in jiffies (the standard kernel unit of time).
If A is large but S and R are not, that might indicate that the tcp stack
is having trouble sending the request (the write to tcp is slow, perhaps
problems with loss/retries on the network or slow link, or tcp buffers too
small or problems with the server not acking fast enough). If S is large
that indicates that the server is slow (the response is not arriving to
cifsd on the client fast enough) - this might indicate an overloaded server
disk. If R is large then the data got to the client but then could not be
delivered fast enough to the right application, perhaps because the client
is overloaded and the application that issued the file api is not getting
scheduled - and therefore the client cpu may be overloaded. If the 1
second threshold is too short (I hit this many times running dbench with 50
threads or more even from just one client to Samba 3) then it can be
changed by altering line 96 of fs/cifs/transport.c from HZ (one second's
worth of jiffies) to a larger or smaller value.
If we use this in conjunction with server performance analysis we should be
able to optimize Samba better for the testcases like dbench, fstress etc.
which launch lots of threads on the client hitting the smb server at one
time.
Senior Software Engineer
Linux Technology Center - IBM Austin
email: sfrench at-sign us dot ibm dot com