fix to util_sock.c

okuyamak at dd.iij4u.or.jp okuyamak at dd.iij4u.or.jp
Mon Nov 13 07:30:41 GMT 2000


Dear Jeremy,

>>>>> "JA" == Jeremy Allison <jeremy at valinux.com> writes:
JA> We actually used to use this instead of read() on sockets,
JA> but one of the earlier versions of Solaris I recall has a
JA> horrible bug in recv which causes this to fail. I think
JA> FreeBSD at some point had a problem with this also.

Can you tell me what that bug really was, or where I can read about
it? All I have heard is that there was no support for MSG_WAITALL.

And if that is the only problem, then all we have to do is keep


received	= 0;
while( required - received ) {
       ssize_t	newrecved;

       newrecved	= recv( ....., MSG_WAITALL );
       if ( newrecved < 0 ) {
	  /* do the error handling */
       }
       received	+= newrecved;
}


kind of while() loop around recv(), which I did not change in
my code.

# Since SSL_read() can't take MSG_WAITALL anyway, I left this
# while() routine as is.
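
To make the idea concrete, here is a minimal, self-contained sketch
of such a loop (the function name recv_all and its parameters are my
own invention for illustration, not Samba's actual code):

#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: keep calling recv() with MSG_WAITALL until
 * `required' bytes have arrived, the peer closes, or a real error
 * occurs.  Returns the byte count actually read, or -1 on error. */
static ssize_t recv_all(int fd, void *buf, size_t required)
{
	size_t received = 0;

	while (required - received) {
		ssize_t newrecved = recv(fd, (char *)buf + received,
					 required - received, MSG_WAITALL);
		if (newrecved < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: retry */
			return -1;		/* real error */
		}
		if (newrecved == 0)
			break;			/* peer closed connection */
		received += newrecved;
	}
	return (ssize_t)received;
}

With MSG_WAITALL the kernel already tries to fill the whole buffer,
so the loop normally runs once; it only iterates again on signals or
short reads.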


JA> Whilst these systems are still being used out there it is
JA> safer to use read(), as when we tested this change for performance
JA> it did not improve us noticeably, and caused many users with
JA> the broken systems many problems.

I can't really understand that, for read() on a socket calls the
same kernel routine as recv() does. So the bug can't have anything
to do with the actual TCP/IP packet handling etc.; it must be
something to do with the interfaces.
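
To spell out the equivalence my argument rests on (the wrapper names
below are hypothetical, purely for illustration):

#include <unistd.h>
#include <sys/socket.h>

/* On a socket, these two calls are specified to behave identically:
 * recv() with flags == 0 is exactly read(). */
ssize_t read_bytes(int fd, void *buf, size_t len)
{
	return read(fd, buf, len);		/* same as ... */
}

ssize_t recv_bytes(int fd, void *buf, size_t len)
{
	return recv(fd, buf, len, 0);		/* ... this */
}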


Can someone give me the following information?

1) What was this recv() bug, really?
2) Does the same bug occur for the recvmsg() call?
   If it doesn't, then all I have to do is switch from
   recv() to recvmsg(), as sketched below.
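
If only recv() is broken, the switch would be mechanical; this is my
assumption of the mapping (an untested sketch, hypothetical wrapper
name):

#include <string.h>
#include <sys/uio.h>
#include <sys/socket.h>

/* Hypothetical equivalent of recv(fd, buf, len, MSG_WAITALL),
 * expressed through recvmsg() instead. */
static ssize_t recv_via_recvmsg(int fd, void *buf, size_t len)
{
	struct iovec  iov;
	struct msghdr msg;

	iov.iov_base = buf;
	iov.iov_len  = len;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov    = &iov;
	msg.msg_iovlen = 1;

	return recvmsg(fd, &msg, MSG_WAITALL);
}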


JA> However, for performance fixes, rather than internal code
JA> cleanup, I'd much prefer people to do actual profiling and
JA> report "with this patch I see XX% speedup on these actions",
JA> rather than guessing what will improve things. I've done that
JA> myself in the past, and my guesses like that have usually
JA> been wrong :-) :-). With the actual report we will have at
JA> least an idea of what to expect from the changes rather than a
JA> theoretical speedup.

I disagree with this opinion.

In many cases, measuring a speedup is quite hard. I know there are
many changes that produce a speedup for a small number of clients,
but cause terrible system instability as the number of clients
rises. This happens when the speedup is bought with more CPU
resources.

If we make changes against this kind of system instability, what
we get is SYSTEM STABILITY. Performance does not improve, and we
usually lose some, but stability is also important.

# I had to use my 'patched' Linux kernel to count the number of
# system calls, because smbd fork()s so much (the fork duplicates
# the call-count information too, you know).


By defining a metric, optimization becomes clearer. But we should
always consider optimizations whose freed resources are then spent
by the rest of the system. Giving more CPU/memory resources back to
the system raises TOTAL stability, but this is hard to analyze
quantitatively.


I agree that if you have a metric, it is nice to have results
measured against it, but that does not mean we should make it a
MUST. Also, the very fact that we have a quantitative result for a
patch does not mean we should select that patch.

So, rather than relying on quantity alone, we should focus on THE
INTENT OF THE PATCH.


best regards,
---- 
Kenichi Okuyama at Tokyo Research Lab, IBM-Japan, Co.



