Samba + e2compr = problems?
Paulo Afonso Graner Fessel
pafessel at netsol.com.br
Sat Dec 18 13:50:53 GMT 1999
By chance, I've managed to reproduce at my workstation the
problem that prevented us from migrating from Novell 4.11 to Samba on RH6.0.
AFAIK, the problem is loosely related to the one described on these lists
where one gets a bunch of smbd daemons running, while certain files remain
locked and can only be unlocked by rebooting the system.
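For anyone wanting to check for the same symptom on their own server, here is a minimal sketch that lists smbd processes and their scheduler state by reading /proc directly. It assumes a Linux /proc filesystem with the usual "pid (comm) state ..." layout in /proc/<pid>/stat; the helper names parse_stat and smbd_states are made up for this illustration and are not part of Samba.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from the text of one /proc/<pid>/stat file.

    The command name sits between the first "(" and the last ")",
    so it is located by position rather than by splitting on spaces.
    """
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state


def smbd_states(proc="/proc"):
    """Return a sorted list of (pid, state) for every process named smbd."""
    found = []
    if not os.path.isdir(proc):
        return found  # not a Linux-style system; nothing to report
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited or entry unreadable
        if comm == "smbd":
            found.append((pid, state))
    return sorted(found)


if __name__ == "__main__":
    for pid, state in smbd_states():
        print(pid, state)
```

A daemon that stays in state "R" across repeated runs while the client is idle would match the behaviour described in this report.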
I was able to verify details of the problem by running the NAI sniffer at my
workstation while making the problem happen. It's reproducible 100% of the
times I try it, and it happens under the following conditions:
1) I open the file (a ~800KB PowerPoint presentation) from a
local disk and save it to the network. The sniffer shows that
everything goes all right, with SMB commands sent to open and close files. I
notice that the NBT data blocks are 1460 bytes in size, and the window size of
the TCP connection is 8760 bytes.
2) When I try to save the file again 2-3 minutes later, regardless
of whether it has been modified, the workstation fails to save the file.
Additionally, the NT WKS shows a "System Process Dialog" stating that there
was a problem while saving data buffers to the file and that it may be
corrupted. (My NT WKS is Brazilian Portuguese, so I don't know the
exact message in English.)
3) When I get this failure, the smbd that was serving the
connection goes haywire and enters the "R" state, becoming unkillable
and un-reniceable, and locking the file that was being saved and the
/usr/sbin/smbd file itself. Also, 2 or 3 more smbd daemons are
launched, and nevertheless the workstation cannot contact the Samba server
anymore. The only way to make things work again is to reboot either the
workstation or the server (I mean the entire machine, not only the daemons).
4) Looking at the network traffic captured by the sniffer, I note
that the initial stages of the second save complete without problems.
But there is a difference: the window size of the connection is larger
than in the first, successful connection (14600 bytes for a
2.2.13 kernel without the "Allow large windows" parameter active, and even
30660 bytes with the same parameter active!)
5) In the second save, data is written OK to the network until a
certain point, when in some ACK the window size of the connection begins
to shrink until it reaches 0! This is a frozen (zero) window problem, and
afterwards all sorts of problems happen, until the workstation sends a RST
to the server, which is immediately honored, leading to the
establishment of a new NBT session with the workstation.
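The shrinking pattern in step 5 can be spotted mechanically once the advertised window values have been exported from a trace. The following is only a sketch: it assumes the window field of each ACK has already been pulled out into a plain list of integers, and the function name find_shrink_to_zero and the sample values are hypothetical, not taken from the actual capture.

```python
def find_shrink_to_zero(windows):
    """Locate a run of strictly shrinking advertised windows that ends in 0.

    Returns (start, zero) where zero is the index of the first ACK that
    advertises a zero window and start is the index of the last ACK
    before the window began to drop. Returns None if no zero window
    appears in the list.
    """
    try:
        zero = windows.index(0)
    except ValueError:
        return None
    start = zero
    # Walk backwards for as long as each earlier window was strictly larger.
    while start > 0 and windows[start - 1] > windows[start]:
        start -= 1
    return start, zero


if __name__ == "__main__":
    # Illustrative values only; they are not the real trace.
    sample = [14600, 14600, 11680, 5840, 1460, 0]
    print(find_shrink_to_zero(sample))
```

Comparing the timestamps of the two returned ACKs against the server's disk and memory activity might help show what the server was doing when it stopped draining its receive buffer.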
The problem is locking-independent, as the trace done with the
sniffer has already shown. Conservative options like "use strict locking"
make no difference.
Does anyone have a clue why the window size shrinks only
the *second time* the file is saved? Also, why is the window size
set to 8760 bytes in the first save (the default value of the window size in
MS's TCP stack) and to 14600 or 30660 in the second save?
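As a quick sanity check on these numbers: all three window sizes are whole multiples of the 1460-byte block size seen in the trace. The sketch below assumes that 1460 is the TCP MSS on this Ethernet segment (1500-byte MTU minus 40 bytes of IP and TCP headers); the function name segments_in_window is made up for the example.

```python
MSS = 1460  # assumed TCP maximum segment size, matching the 1460-byte blocks


def segments_in_window(window_bytes, mss=MSS):
    """Return (full_segments, leftover_bytes) for an advertised window."""
    return divmod(window_bytes, mss)


if __name__ == "__main__":
    for window in (8760, 14600, 30660):
        full, leftover = segments_in_window(window)
        print(f"{window} bytes = {full} segments, {leftover} bytes left over")
```

So the three windows correspond to 6, 10 and 21 full segments respectively, which suggests the values are chosen by the TCP stacks as MSS multiples rather than being arbitrary.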
Could it be related to some bug in e2compr? (This message is
being sent to the e2compr list for obvious reasons.) I'm asking
this because I know that you need to build a table to gzip the file
on the fly, and so this could limit the memory available to the TCP/IP
stack of the server when big files are being saved simultaneously. One
possibility I'm considering is turning off compression in my home
directory and running the same tests again.
Does anybody on the list use Samba and e2compr together? I'm
running Red Hat 6.0 w/latest patches, the Samba 2.0.6 prebuilt RPM, and a 2.2.13
vanilla-customized kernel with the latest e2compr patch. I'm running this
stuff on a NetFinity 3500 server with a Mylex PCI RAID controller.
Thanks for any help,