Eliminating gettimeofday from construct_reply
rsharpe at ns.aus.com
Thu Aug 1 08:18:43 GMT 2002
On Thu, 1 Aug 2002, Green, Paul wrote:
> Andrew Bartlett [mailto:abartlet at samba.org] wrote:
> > Richard Sharpe wrote:
> > >
> > > Hi,
> > >
> > > I looked at this issue, and it looks possible to accumulate
> > > the timeouts that have occurred in receive_message_or_smb and
> > > count those up.
> > >
> > > Given that the resolution of the dead time parameter is in
> > > minutes, this would seem to not get too far out of whack.
> > Sounds like a fair optimization to me - assuming that's what
> > it is for.
> Pardon my skepticism, but is there any evidence that calling gettimeofday
> from this location in the code is actually contributing in any material way
> to the performance of Samba? Any measurements? If there isn't, then you
> are just optimizing by guess and by golly, and could be (almost
> certainly will be) introducing a maintenance headache, or an unwitting
> platform dependency, by trying to second-guess them. You could also make
> operating Samba from within a debugger (or during profiling) rather touchy,
> since your "accumulating" probably won't work in the face of breakpoints.
> If this code were mine, I'd insist that you prove to me that this change
> would result in at least a 3% gain in performance. I seriously doubt
> whether you could reach this bar. Why? Well, in all probability the TCP
> stack does multiple time calls all on its own for every packet, and the cost
> of sending the data probably far outweighs the cost of reading the clock, so
> I think this time call is lost in the noise.
OK, I think you are wise to be skeptical :-)
I would only point out that while I have not profiled Samba in that area,
and it would not be hard to do (perhaps a project for today as I will be
adding trace points for our internal tracing tool), there are two salient
points to be added:
1. The time calls in the stack are in the kernel and, while I haven't
checked, on FreeBSD possibly read the internal processor cycle count
and scale it, which is a much cheaper operation.
2. We converted a lot of tracing code to using cycle counts recently
rather than calling gettimeofday, because, guess what, all those system
calls were having a big impact.
3. (Nobody expects a Spanish Inquisition) When I was profiling the sendfile
patches recently, the stat call in userland (instead of being in sendfile)
made a measurable difference, despite the fact that the vnode/inode should
already have been cached! I would claim that gettimeofday would introduce
a cost comparable to stat'ing a file that was recently opened and thus had
its vnode/inode cached.
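Point 2 above, converting tracing from gettimeofday() to raw cycle counts, might look something like this sketch. The x86 rdtsc path is illustrative of the technique; the fallback is a plain monotonic clock read. None of this is the tracing code actually being described.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* Illustrative x86 sketch: read the time-stamp counter directly.
 * One inline instruction per sample instead of a kernel crossing. */
static inline uint64_t read_cycle_counter(void)
{
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
}
#else
#include <time.h>
/* Fallback for other architectures: a monotonic clock read in
 * nanoseconds.  Still avoids gettimeofday()'s timeval conversion. */
static inline uint64_t read_cycle_counter(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}
#endif
```

The trade-off is the one the thread is arguing about: raw counters are cheap but need later scaling to wall-clock units, and on some hardware the TSC rate varies, so this suits relative timing in trace points more than absolute timestamps.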
However, I agree with your comments below on how to go about it:
> (1) Operating system engineers know that their time routines are going to
> get heavily called and generally try to optimize them. We certainly try on
> our OS. (2) This routine has fairly high resolution. If you don't need
> this level of resolution, you might use time(), which is probably cheaper,
> and certainly no more expensive (and POSIX-compliant, whereas gettimeofday()
> is not in POSIX-96). (3) Optimize based on measurements not reading code.
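Point (2) in the quoted list is easy to illustrate: for a dead-time limit expressed in minutes, one-second resolution is ample, so time() suffices where gettimeofday() would be overkill. A minimal sketch, with hypothetical names rather than anything from Samba:

```c
#include <stdbool.h>
#include <time.h>

/* Sketch of the time()-instead-of-gettimeofday() suggestion: track
 * the last request at one-second resolution, which is plenty for a
 * dead-time parameter measured in minutes. */
static time_t last_activity;

static void mark_activity(void)
{
        last_activity = time(NULL);       /* seconds since the Epoch */
}

static bool idle_longer_than(unsigned int deadtime_minutes)
{
        return time(NULL) - last_activity >= (time_t)(deadtime_minutes * 60);
}
```

On many systems time() also has a cheaper fast path than gettimeofday(), which is exactly the "no more expensive" claim made above.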
> Let me just add that I've wasted all too many working days in my career by
> trying to optimize code by inspection. When I actually take the time to run
> a benchmark and then optimize the hot spots, I get much better results in
> much less (human) time.
> Paul Green, Senior Technical Consultant, Stratus Computer, Inc.
> Voice: +1 978-461-7557; FAX: +1 978-461-3610; Video on request.
Richard Sharpe, rsharpe at ns.aus.com, rsharpe at samba.org,
sharpe at ethereal.com