Eliminating gettimeofday from construct_reply

Thu Aug 1 08:18:43 GMT 2002

On Thu, 1 Aug 2002, Green, Paul wrote:

> Andrew Bartlett [mailto:abartlet at samba.org] wrote:
> > Richard Sharpe wrote:
> > > 
> > > Hi,
> > > 
> > > I looked at this issue, and it looks possible to accumulate 
> > > the timeouts that have occurred in receive_message_or_smb and
> > > count those up.
> > > 
> > > Given that the resolution of the dead time parameter is in 
> > > minutes, this would seem to not get too far out of whack.
> > 
> > Sounds like a fair optimization to me - assuming that's what 
> > it is for.
> 
> Pardon my skepticism, but is there any evidence that calling gettimeofday
> from this location in the code is actually contributing in any material way
> to the performance of Samba?  Any measurements?  If there isn't, then you
> are just optimizing by guess and by golly, and could be (almost
> certainly will be) introducing a maintenance headache, or an unwitting
> platform dependency, by trying to second-guess them.  You could also make
> operating Samba from within a debugger (or during profiling) rather touchy,
> since your "accumulating" probably won't work in the face of breakpoints.
> If this code was mine, I'd insist that you prove to me that this change
> would result in at least a 3% or more gain in performance. I seriously doubt
> whether you could reach this bar. Why?  Well, in all probability the TCP
> stack does multiple time calls all on its own for every packet, and the cost
> of sending the data probably far outweighs the cost of reading the clock, so
> I think this time call is lost in the noise.

OK, I think you are wise to be skeptical :-)

I would only point out that while I have not profiled Samba in that area, 
and it would not be hard to do (perhaps a project for today as I will be 
adding trace points for our internal tracing tool), there are two salient 
points to be added:

1. The time calls in the stack are in the kernel and, while I haven't 
checked, on FreeBSD they possibly just read the processor's internal cycle 
counter and scale it, which is a much cheaper operation.

2. We converted a lot of tracing code to using cycle counts recently 
rather than calling gettimeofday, because, guess what, all those system 
calls were having a big impact.

3. (Nobody needs a Spanish Inquisition) When I was profiling the sendfile 
patches recently, the stat call in userland (instead of being in sendfile) 
made a measurable difference, despite the fact that the vnode/inode should 
already have been cached! I would claim that gettimeofday would introduce 
a comparable cost to stat'ing a file that was recently opened and thus had 
its vnode/inode cached.
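
To put numbers on the claim in point 3, something like the following 
rough micro-benchmark could be used. It is only a sketch: the function 
names (now_seconds, bench_gettimeofday, bench_time, bench_stat, report) 
are made up for illustration and are not anything in Samba, and a tight 
loop like this says nothing about cache effects inside a real smbd.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <time.h>

/* Wall-clock seconds as a double; used only to time the loops below. */
double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Average cost, in microseconds, of one gettimeofday() call. */
double bench_gettimeofday(long iterations)
{
    struct timeval tv;
    assert(iterations > 0);
    double start = now_seconds();
    for (long i = 0; i < iterations; i++)
        gettimeofday(&tv, NULL);
    return (now_seconds() - start) * 1e6 / (double)iterations;
}

/* Average cost, in microseconds, of one time() call. */
double bench_time(long iterations)
{
    volatile time_t sink = 0;   /* keeps the call from being optimized away */
    assert(iterations > 0);
    double start = now_seconds();
    for (long i = 0; i < iterations; i++)
        sink = time(NULL);
    (void)sink;
    return (now_seconds() - start) * 1e6 / (double)iterations;
}

/* Average cost, in microseconds, of one stat() on a path that should
 * already be cached after the first call. Returns -1.0 if stat() fails. */
double bench_stat(const char *path, long iterations)
{
    struct stat st;
    assert(iterations > 0);
    double start = now_seconds();
    for (long i = 0; i < iterations; i++) {
        if (stat(path, &st) != 0)
            return -1.0;
    }
    return (now_seconds() - start) * 1e6 / (double)iterations;
}

/* Print all three per-call costs side by side. */
void report(const char *path, long iterations)
{
    printf("gettimeofday: %.3f us/call\n", bench_gettimeofday(iterations));
    printf("time:         %.3f us/call\n", bench_time(iterations));
    printf("stat(%s): %.3f us/call\n", path, bench_stat(path, iterations));
}
```

Calling report(".", 1000000) from a main() on the platform in question 
would show whether gettimeofday really is in the same ballpark as a 
cached stat, and whether time() is any cheaper there.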

However, I agree with your comments below on how to go about 
optimizations.
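
For reference, the accumulation idea quoted at the top of this message 
could look roughly like the sketch below. The struct and function names 
(idle_tracker and friends) are hypothetical and not existing Samba code, 
and it assumes the receive loop knows the length of the timeout it just 
waited out and that a dead time of zero means "never disconnect".

```c
#include <assert.h>
#include <stdbool.h>

/* Idle time tracked by summing expired timeouts, not by reading the clock. */
struct idle_tracker {
    long idle_ms;      /* idle time accumulated since the last request */
    long deadtime_ms;  /* limit; 0 means no automatic disconnect (assumed) */
};

/* Set the limit from a dead time given in minutes. */
void idle_tracker_init(struct idle_tracker *t, int deadtime_minutes)
{
    assert(deadtime_minutes >= 0);
    t->idle_ms = 0;
    t->deadtime_ms = (long)deadtime_minutes * 60L * 1000L;
}

/* Call when the wait for a message timed out after timeout_ms. */
void idle_tracker_on_timeout(struct idle_tracker *t, long timeout_ms)
{
    assert(timeout_ms >= 0);
    t->idle_ms += timeout_ms;
}

/* Call when a request arrives: the connection is no longer idle. */
void idle_tracker_on_request(struct idle_tracker *t)
{
    t->idle_ms = 0;
}

/* True once the accumulated idle time has reached the dead time. */
bool idle_tracker_expired(const struct idle_tracker *t)
{
    return t->deadtime_ms > 0 && t->idle_ms >= t->deadtime_ms;
}
```

Note that this only counts time spent in waits that actually timed out, 
so time spent stopped at a breakpoint is never counted, which is exactly 
the debugger concern raised above.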

> <soapbox>
> 
> (1) Operating system engineers know that their time routines are going to
> get heavily called and generally try to optimize them.  We certainly try on
> our OS.  (2) This routine has fairly high resolution.  If you don't need
> this level of resolution, you might use time(), which is probably cheaper,
> and certainly no more expensive (and POSIX-compliant, whereas gettimeofday()
> is not in POSIX-96).  (3) Optimize based on measurements not reading code.
> 
> </soapbox>
> 
> Let me just add that I've wasted all too many working days in my career by
> trying to optimize code by inspection.  When I actually take the time to run
> a benchmark and then optimize the hot spots, I get much better results in
> much less (human) time.
> 
> Thanks.
> PG
> --
> Paul Green, Senior Technical Consultant, Stratus Computer, Inc.
> Voice: +1 978-461-7557; FAX: +1 978-461-3610; Video on request.
> 
> 

-- 
Regards
-----
Richard Sharpe, rsharpe at ns.aus.com, rsharpe at samba.org, 
sharpe at ethereal.com