fix to util_sock.c

Kenichi Okuyama okuyamak at dd.iij4u.or.jp
Tue Nov 14 01:57:27 GMT 2000


Dear Jeremy,

>>>>> "JA" == Jeremy Allison <jeremy at valinux.com> writes:
>> And HOW was it broken? Does it simply not work?
>> Or does it not work in specific cases? Were some of the flags simply
>> undefined?
JA> Both. Just didn't work on Solaris, missing flags on other
JA> UNIXes (that's not a problem of course, you just #define MSG_WAITALL
JA> to be zero).
JA> Once we tested the performance implications of the change
JA> and found it made no measurable difference, we went with
JA> the simplest working code using read().

You can't find any difference because you are looking at the wrong
thing. As I told you before, this will not affect Samba's
performance. It will affect the system's load.

Here is what I did, and what I found (not something that I guessed).


It is true that Samba will behave just as if no change had been
made. I already explained this, but to repeat, it's because
Samba does not start processing until the very last packet arrives.
# This design itself needs to be tuned by using out-of-order
# processing, but since that is another story, I won't go further.

Because most of the cost is paid at kernel level (because it's a
system call optimization), user time will not change either. Because
this is an "AT REALTIME" effect, average system usage also looks as
if no change occurred, even if you look at vmstat results.


But even so, this code is a lot ... I don't really know a good word
for this ... "GENTLER"? ... on your system. The realtime requirements
of device drivers have a much better chance of being met. To find
this out, I had to use my patched Linux.

For the server I used a 500MHz Pentium-III with 256Mbyte of RAM,
which was a Netfinity. I also used 16 Compaq machines as clients, and
ran NetBench. The 16 Compaq machines were connected to a Laneed
EtherHub with 100base-TX lines, and the server was connected with
100base-TX too.
# The Netfinity uses AMD's on-board 100base chip.
# And as you may know, this chip causes a tremendous number of
# interrupts if you use the device driver that comes with Linux
# 2.4.0-test8.

The Linux version was 2.4.0-test8, but I saw the same thing on
2.3.99-pre9 and 2.2.16, too.



What happened was that with the original Samba, "TX FIFO ERROR!"
occurred very often; I lost count, but it was at least more than 20
times within 11 minutes. This error occurs when Linux's kmalloc()
fails to create a new recv buffer inside the interrupt handler. This
means the receive buffer creator creates buffers faster than the
system can pull them out and load them into the TCP buffer.

With this patched Samba, "TX FIFO ERROR!" never occurs. With the
'MSG_WAITALL' option, the recv() system call no longer has to go back
and forth between user and kernel space; instead, most of the
processing stays in kernel space. For this reason, context switch
overhead (not only process<->kernel, but also the switches that occur
when a hardware interrupt arrives while the process is running) is
reduced. And because of that, we can now pull out packets at the
required speed.
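
To make the difference concrete, here is a minimal sketch (not the
actual util_sock.c patch) of the two ways to read exactly n bytes from
a blocking socket. The function names read_exact_loop() and
recv_exact_waitall() are made up for illustration, and the fallback
#define of MSG_WAITALL to zero follows the suggestion quoted earlier
in this thread.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef MSG_WAITALL
#define MSG_WAITALL 0   /* fallback for systems that lack the flag */
#endif

/* Hypothetical helper: read exactly n bytes using read().
 * Each chunk that arrives costs one return to user space. */
static ssize_t read_exact_loop(int fd, void *buf, size_t n)
{
    size_t got = 0;

    while (got < n) {
        ssize_t r = read(fd, (char *)buf + got, n - got);
        if (r == 0)
            break;                  /* peer closed the connection */
        if (r < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, try again */
            return -1;
        }
        got += (size_t)r;
    }
    return (ssize_t)got;
}

/* Hypothetical helper: read exactly n bytes using recv(MSG_WAITALL).
 * On a blocking socket the kernel normally waits until all n bytes
 * have arrived, so the loop body usually runs only once. The loop is
 * kept because recv() may still return short (signal, or the flag
 * being defined to zero above). */
static ssize_t recv_exact_waitall(int fd, void *buf, size_t n)
{
    size_t got = 0;

    while (got < n) {
        ssize_t r = recv(fd, (char *)buf + got, n - got, MSG_WAITALL);
        if (r == 0)
            break;                  /* peer closed the connection */
        if (r < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, try again */
            return -1;
        }
        got += (size_t)r;
    }
    return (ssize_t)got;
}
```

Both helpers return the same data to the caller; the difference
described above is only in how many times the process crosses between
user and kernel space while the packets trickle in.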


As you can see, what you gain from this patch is not reflected in
Samba's performance in any sense. As long as you focus on the Samba
process, you'll see nothing.

It affects system load, and it does so at the critical moment. If
your system has enough CPU resources, you won't be able to detect
this. Likewise, if you don't have lots of clients with heavy load,
you won't detect it either. But this does the system good, and at the
most critical moment. That's where you'll get the 'last drop' from
your system.

And for this reason, I recommend using this patch.


>> By the way, if the buggy target is KNOWN, or if the non-buggy target
>> is KNOWN, then we can always switch it back to read() by using
>> something like:
>> 
>> #ifdef HAVE_WORKING_RECV
>> #define   RECV(a,b,c,d)  recv(a,b,c,d)
>> #else
>> #define RECV(a,b,c,d)   read(a,b,c)
>> #endif
>> 
>> kind of macro.
>> 
>> And by defining HAVE_WORKING_RECV as default, we can always
>> find the ones that are not working.
JA> This is only worthwhile if we have evidence that changing to
JA> use recv() gives us an advantage (maybe performance) of some kind.

I do think this is worthwhile, as I described.
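
As a rough sketch of how the macro quoted above could look in a
header, under the assumption that HAVE_WORKING_RECV would normally be
set by the build configuration (here it is simply defined on, as
proposed). The parameter names are mine. Note that the read() fallback
silently drops the flags argument, so callers would still need their
own loop on such platforms.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef MSG_WAITALL
#define MSG_WAITALL 0   /* fallback for systems that lack the flag */
#endif

/* Assumption: on by default, so that broken platforms show up and
 * can be switched off one by one. */
#define HAVE_WORKING_RECV 1

#ifdef HAVE_WORKING_RECV
#define RECV(fd, buf, len, flags)  recv((fd), (buf), (len), (flags))
#else
/* read() has no flags argument; MSG_WAITALL behaviour is lost here. */
#define RECV(fd, buf, len, flags)  read((fd), (buf), (len))
#endif
```

A call site would then read something like
RECV(fd, buf, len, MSG_WAITALL) regardless of the platform.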



>> # I have experienced read() not working on sockets more often than
>> # recv() not working; that is the reason.
JA> The experience of Samba on all the supported platforms
JA> has been the reverse. Can you be specific on which platforms
JA> read() on sockets is broken ? I have never come across such
JA> a platform.

1) Very old VxWorks ( I lost the version number ).
2) Phoenix OS that I tested 4 years ago.
   ( was it because it was a demonstration version? )
3) Several IBM internal embedded OSes.
   # Sad to report, this includes one that I made too ... (T-T)
   # Don't worry, none of them is being sold.
4) AIX for i386; I lost the version number too, but I think it was
   around 1.2. I don't think this exists now as a server. It was 9
   years ago, and it is no longer supported by IBM.
   I'd recommend using FreeBSD for machines that have this OS.

All of this information is about classics, but this kind of ghost
comes back to you whenever you forget its existence, you know (^^;).

best regards,
---- 
Kenichi Okuyama at Tokyo Research Lab. IBM-Japan, Co.



