[Samba] Re: Severe problem with Samba

Jeremy Allison jra at samba.org
Wed Feb 6 21:45:03 GMT 2002

On Wed, Feb 06, 2002 at 01:05:41PM -0000, Martin Rootes wrote:
> Jeremy,
> 	further to this e-mail, last Wednesday evening (30/01/02) we moved the Samba service to a 
> more powerful machine (a dual 750MHz processor V880, running Solaris 2.8), running the 2.2.3-
> CVS version that I'd downloaded around the 18/12/01. At first everything seemed OK and was 
> running smoothly, however this was during an 'inter semester gap', and so not much real work 
> was going on. Then on Monday (04/02/02), when the students started work again all hell broke 
> loose. We were seeing the same symptoms as previously described but happening far more 
> often. I had to stop and restart Samba about 4 or 5 times to clear the processes out; after 
> restarting, the service would be OK for about half an hour and then we'd start getting the build up 
> again. I attempted various configuration changes, removing quota support, removing strict 
> locking (not sure why that was set anyway), setting keepalives to 30, all to no avail. Eventually I 
> compiled and re-installed an old version of Samba (2.0.7) that we had been using successfully in 
> the past and started up the service using this. Since then we have not had any problems, 
> yesterday the number of connections peaked at about 1140, the load on the server being a very 
> respectable 0.4. 

What I really need to see is what the processes are doing when the system is in
this state. So far you are the only site reporting this problem with the 2.2.3
codebase, and there are several using the 2.2.3 code for just this (Solaris) platform.

DaveCB, can you get some resources assigned to look into this, as we need
to know why Solaris seems to be prone to this problem (could just be the 
fact Solaris is running larger sites, but I don't think so - I know IBM and
HP also have similarly sized large Samba sites and no such problems have
been reported).

When you talk about "the same symptoms", does this mean runaway fcntl
calls? The 2.2.3 codebase now avoids any traversals of tdbs that could
cause this problem.
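
One way to check for runaway fcntl calls is to trace the busy smbd
processes and count system calls per process. The following is a minimal
sketch in Python, not a Samba tool. It assumes the trace was captured on
Solaris with something like `truss -f -o <file> -p <pid>`, and that each
line has the shape "<pid>:<whitespace><syscall>(...)". The helper names,
the 0.5 threshold and the sample lines are all hypothetical, made up purely
for illustration.

```python
import re
from collections import Counter

# Assumed line shape from `truss -f`: "<pid>:<whitespace><syscall>(args) = result".
# An optional "/<lwp>" after the pid is tolerated.
LINE_RE = re.compile(r"^\s*(\d+)(?:/\d+)?:\s+(\w+)\(")


def count_syscalls(lines):
    """Return {pid: Counter({syscall_name: count})} for recognised lines."""
    per_pid = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip signals, continuation lines, anything unrecognised
        pid, name = int(m.group(1)), m.group(2)
        per_pid.setdefault(pid, Counter())[name] += 1
    return per_pid


def fcntl_heavy(per_pid, threshold=0.5):
    """Return the PIDs whose traced calls are more than `threshold` fcntl."""
    heavy = []
    for pid, counts in sorted(per_pid.items()):
        total = sum(counts.values())
        if total and counts["fcntl"] / total > threshold:
            heavy.append(pid)
    return heavy


# Made-up sample lines, for illustration only (not from the affected server).
sample = [
    "1201:\tfcntl(7, F_SETLKW, 0xFFBEF0A8)\t\t= 0",
    "1201:\tfcntl(7, F_SETLKW, 0xFFBEF0A8)\t\t= 0",
    "1201:\tread(5, 0x000F4240, 4)\t\t= 4",
    "1305:\tpoll(0xFFBEE8B0, 1, 30000)\t\t= 1",
]

if __name__ == "__main__":
    for pid in fcntl_heavy(count_syscalls(sample)):
        print("pid", pid, "is dominated by fcntl calls")
```

A process that this flags is only a candidate for a closer look; the
threshold is arbitrary and would need tuning against a healthy server.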

> 	I noticed on the Samba pages that version 2.2.3 has now been officially released, has there 
> been any changes to this version since December 18? Is there a known problem with any of the 
> versions post 2.0.7 that might explain these symptoms? Is it simply that each version requires 
> more resources and so increases load on the system to a point that normal operation breaks 
> down?

Unlikely. We need more data on what the system is doing when it
gets into this state. I'm BCC:ing my old friend Andy Bowers at Sun UK
to see if he has any ideas.

> 	Finally, I've just noticed today that you'll actually be in Sheffield on the 21/02/02 for a Linux 
> seminar at Hillsborough and then later this very establishment! I would really appreciate it if we 
> could get together for a chat about the problems we are experiencing, would it be possible to get 
> an hour or so of your time on that day?

I'd love to help out with this problem - *Especially* for a Sheffield
University. See my other message on my current commitments over this period.


