Another showstopper in 2.2.5

Pascal pascal at vmfacility.fr
Tue Aug 13 00:44:01 GMT 2002


On Monday 12 August 2002 21:17, Fredrik Ohrn wrote:
> On Mon, 12 Aug 2002, Fredrik Ohrn wrote:
> > Ok, here's another showstopper I have started to get a lot over the last
> > week.
> >
> > Tonight I'll see if I can figure out a way to force the problem and use
> > the panic action to get more clues. Does anyone have a good recipe for
> > 'panic action' that just dumps a backtrace into a logfile?
>
> OK, I've done some more tinkering.
>
> I haven't found a better way to trigger this than to open a Word doc, type
> some, save, sit around, type some, save, sit around. It takes patience...
>
> After a while the save fails and Word pops up the following message:
>
>   The save failed due to out of memory or disk space.
>   (F:\foobar\~WRL0005.TMP)
>
>
>
> In this example the initial smbd has pid 19957. I have raised its debug
> level to 10 with smbcontrol; see the attached logfile.
>
> I did the save-a-Word-document dance. Suddenly it freezes for about 30
> seconds, then it pops up the message.
>
> A quick look with smbstatus:
>
> [root@olivia bin]# ./smbstatus | grep ohrn
> ohrn         ohrn     staff    20033   canine   (129.16.214.84) Mon Aug 12
> 14:43:28 2002 19957  DENY_NONE  0x2019f     RDWR       NONE            
> /olivia/home3/sys/ohrn/foobar/~WRL0005.tmp   Mon Aug 12 14:43:26 2002
>
>
> Notice that I am now served by pid 20033 and pid 19957 is dead.
>
> Apparently it's pid 20033 that refuses the save. Unfortunately I don't
> have any logs from it, since the debug level is reset back to 0. Tomorrow
> night I'll change it directly in smb.conf and redo the experiment.
>
>
> The next observation is that pid 19957 didn't crash, at least not in a way
> that is caught by 'panic action'. I have double checked my gdb recipe
> by manually doing kill -SEGV and that works.
>
>
> Options of interest are: oplocks = no, deadtime = 0, keepalive = 30
>
>
> A correction to the subject: it's v2.2.6-pre1, I forgot that it was still
> up and running.
>
>
> This is on a Linux box running kernel v2.4.19, in case it's of any
> importance.
>
>
> The client is running Windows XP.
>
>
> So, what do I do next?
>
>
> Regards,
> Fredrik

I'd run [strace -p PID] on the smbd processes at the moment Word freezes,
in order to see what each process is doing,
then [gdb -p PID] and the command bt (backtrace) to find out more precisely
where it is in the code.
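
As a rough sketch of the two steps above, something like the following
could be used. The function name smbd_diag_cmds and the /tmp output path
are made up for illustration, and the -batch and -ex options assume a
reasonably recent gdb; with an older one, attach with gdb -p PID and type
bt by hand. The helper only prints the commands so they can be checked
before being run as root.

```shell
#!/bin/sh
# Sketch only: print (do not run) the two diagnostic commands for one smbd PID.
# smbd_diag_cmds and the /tmp output path are hypothetical names chosen here.
smbd_diag_cmds() {
    pid="$1"
    # strace: -tt adds timestamps, -o writes the system call trace to a file.
    echo "strace -tt -p $pid -o /tmp/smbd.$pid.strace"
    # gdb: attach, print a backtrace, then exit (which detaches again).
    echo "gdb -p $pid -batch -ex bt"
}

# Example: show the commands for the smbd that took over in the report above.
smbd_diag_cmds 20033
```

Run the strace command in its own terminal before starting the Word save
loop, since it keeps tracing until interrupted; the gdb command is best
run during the 30 second freeze itself, once for each smbd PID involved.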

If it is a timing problem, running at debug level 10 may fail to reproduce
the freeze because of the logging overhead involved.

Pascal



More information about the samba-technical mailing list