[linux-cifs-client] difficult "bug" and kernel/network misbehavior with system freeze

Christian Hartmann cornbob at web.de
Fri Mar 24 13:57:19 GMT 2006


Hello list,

I have searched this list, the kernel mailing list, and many other boards and
forums for about 3 months now and have not found an answer to the following
problem:

We have Samba servers running kernel 2.6.15 and some (more than two)
clients that have run into problems.

These clients are using CIFS shares from a Samba server with the following version:

cat /proc/fs/cifs/DebugData
Display Internal CIFS Data Structures for Debugging
---------------------------------------------------
CIFS Version 1.39
Active VFS Requests: 0
Servers:
1) Name: 192.168.x.x  Domain: EEIS Mounts: 1 OS: Unix
        NOS: Samba 3.0.9-2.6-SUSE       Capability: 0x80e3fd


A reproducible error occurs that freezes the Linux kernel on the client.
Only ping to the affected CIFS client still works. The machine is up and running,
but it has no system resources left (they are used up by the CIFS VFS):
there are no more available file descriptors,
so all processes hang, waiting (system wait is about 80 to 99%) for
freed FDs.
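
For anyone who wants to watch the descriptor count while reproducing this, here
is a minimal sketch in Python. It assumes a Linux client where
/proc/sys/fs/file-nr holds three integers (allocated handles, unused handles,
file-max). The function names, the one-second interval and the factor of 5 are
my own choices for illustration, not anything taken from the cifs code.

```python
import time


def parse_file_nr(text):
    """Parse the contents of /proc/sys/fs/file-nr.

    The file holds three integers: allocated file handles,
    allocated-but-unused handles, and the system-wide maximum (file-max).
    """
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def is_spike(previous, current, factor=5):
    """Return True if the allocated count grew by more than `factor` times.

    The factor is an arbitrary threshold chosen for this sketch.
    """
    return previous > 0 and current > previous * factor


def watch(path="/proc/sys/fs/file-nr", interval=1.0, factor=5):
    """Poll the counter and print a warning on a sudden jump.

    Runs until interrupted (Ctrl-C).
    """
    previous = None
    while True:
        with open(path) as handle:
            allocated, _unused, maximum = parse_file_nr(handle.read())
        if previous is not None and is_spike(previous, allocated, factor):
            print(f"spike: {previous} -> {allocated} (file-max {maximum})",
                  flush=True)
        previous = allocated
        time.sleep(interval)
```

Calling watch() in a terminal that was opened before the freeze might at least
give a timestamped trace of the jump, provided the process itself still gets
scheduled.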


But it MUST be an infinite loop in the cifs-client code.

I have to hard reboot the hosts, because all services such as ssh stop
responding.

To find the faulty part I have tried compiling a new kernel
(2.6.15.4), changing the debug options for the kernel and/or for cifs,
setting file-max to a higher value, testing with smbfs (not working for us),
trying cifs version 1.40, and so on...

The system freeze happens at random intervals (somewhere between one day and a week).

I discovered some things:

1) If host (a) runs /usr/bin/convert over cifs on the server and the other
host (b) reads/writes the same file on the server at the same time: crash.

2) If I enable cifsFYI (echo 1 > /proc/fs/cifs/cifsFYI) the clients do NOT
crash anymore (uptime now over 18 days !!!). Disabling it will end up in a crash
SOON.

3) In some freezes I find the following message (/var/log/messages):
VFS: file-max reached !
Hint: I strace'd, gdb'd and valgrind'd the PHP applications (which do a
lot of file operations over CIFS)
and the only thing I could find was the collision of two simultaneous accesses
to the same file on the cifs server: one of the cifs clients ends up in a
freeze (the host must be rebooted).
In two of these cases I could watch the number of used file descriptors, which
is normally about 2200 to 3500, rise within 1 or 2 seconds from the normal
level (ca. 3000) up to 102786 !!!!
I could not kill or delete anything as fast as it was using up all FDs.

4) If I run /bin/du -h /path/to/cifs/share
sooner or later I get no response anymore; the kernel log on the samba server shows:
<snip>
fs/cifs/connect.c: rfc1002 length 0x85000004)
 fs/cifs/connect.c: rfc1002 length 0x85000004)
 fs/cifs/connect.c: rfc1002 length 0x85000004)
 fs/cifs/connect.c: rfc1002 length 0x85000004)
 ....
</snip>
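
To make the logged value a bit easier to read, here is a small Python sketch
that splits such a 32-bit word the way RFC 1002 lays out a NetBIOS session
header: one type byte, then a 17-bit length (extension bit plus 16 bits). This
assumes the logged number is the header as received; whether the cifs code
adds an offset before printing it is something I have not verified, so the
length part should be taken with caution. The function name is made up for
this example.

```python
# Session packet types as defined in RFC 1002, section 4.3.1.
SESSION_PACKET_TYPES = {
    0x00: "SESSION MESSAGE",
    0x81: "SESSION REQUEST",
    0x82: "POSITIVE SESSION RESPONSE",
    0x83: "NEGATIVE SESSION RESPONSE",
    0x84: "RETARGET SESSION RESPONSE",
    0x85: "SESSION KEEP ALIVE",
}


def decode_rfc1002_word(word):
    """Split a 32-bit session header word into (type name, length).

    Assumption: `word` is the raw header in host integer form, with the
    type in the top byte and the 17-bit length in the low bits.
    """
    packet_type = (word >> 24) & 0xFF
    length = word & 0x1FFFF  # extension bit + 16-bit length field
    return SESSION_PACKET_TYPES.get(packet_type, "UNKNOWN"), length


if __name__ == "__main__":
    print(decode_rfc1002_word(0x85000004))
```

Under that assumption the top byte 0x85 is the session keep-alive type, so the
repeated lines would be keep-alive packets rather than SMB responses.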


So I am currently running the hosts with cifsFYI enabled and have had no crashes so far.

Any help would be appreciated.

Sincerely and thx-in-advance
Chris



-- 
"With sufficient thrust, pigs fly just fine."  -- RFC 1925
:wq!

