management of Samba4

Sun Jun 5 23:42:08 GMT 2005

On Fri, Jun 03, 2005 at 09:35:38AM -0400, David Collier-Brown wrote:
> James Peach wrote:
> >Hmm .. I'm thinking twice about this now. Say we have 1000 smbds
> >running. We have a monitoring app that is sampling performance
> >metrics every second. If one of the values the app is calculating is the
> >total number of SMB packets, would this require 1000 round trips per
> >second? If so, what happens when the app needs more metrics?
> >
> >In this proposal, each time the app requests a metric, an smbd has
> >to stop serving clients (briefly) to service the request. A couple of
> >these now and then wouldn't matter, but regular sampling of large
> >numbers of metrics or large numbers of smbds might be a problem.  The
> >cost of sampling the metrics with this scheme is proportional to the
> >sampling rate and the number of metrics being sampled (O(n^2)??).
> 
> 	Yes, although it's more the rate times the number of smbds sampled,
> 	more like c*n^2, where c is the constant cost of collecting a
> 	vector of metrics to deliver.  It indeed scales rather poorly.
> 
> 	You really need to accumulate the data in a low-cost
> 	way, and then deliver it cheaply, at a frequency about
> 	twice that of the sample period desired.  This implies
> 	rolling up the data, which is computationally expensive,
> 	and is best left in the monitoring program, not the app.
> >
> >If performance metrics are enabled, each smbd will have to gather the
> >data irrespective of whether any sampling was occurring. This is a
> >fixed cost, but for simple counters, it's probably not very high. In a
> >scheme like a shared memory segment or shared mmap, samba does not pay
> >the cost of sampling the metrics, only the cost of gathering them
> >(which is a fixed cost however you arrange your metrics).
> 	
> 	The traditional approach is to have a bunch of free-running
> 	counters, and have the monitor sample the values and
> 	subtract to get the <whatever>/second values. This is order n.
> 	
> 	Alas, this doesn't work well with units involving time or
> 	percentages: you have to do a rolling average in
> 	the program being observed to get useful values for
> 	the monitor.

All these issues are the kind of stuff PCP (Performance Co-Pilot) deals
with. In the PCP model, the agent exports raw counters with defined
semantics. Semantics might be something like bytes per second for
throughputs or seconds per second for % CPU time. The client libraries
and apps do all the nasty
grovelling like rate conversion, interpolation across the different
sampling rates, etc. As you observe, the monitoring app is the right
place for this sort of complexity. The monitoring app should also be the
one bearing the cost of sampling the data.
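
To make the division of labour concrete, here is a minimal sketch of the
monitor-side rate conversion. It is not the PCP API; the names Sample and
rate, the 64-bit wrap width, and the microsecond CPU counter are all
assumptions made up for illustration. The agent only exports a
free-running counter; the monitor takes two timestamped samples and
subtracts.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of a free-running counter (hypothetical type)."""
    timestamp: float  # seconds, from the monitor's clock
    value: int        # raw counter value exported by the agent


def rate(prev: Sample, curr: Sample, wrap: int = 2**64) -> float:
    """Convert two raw counter samples into a per-second rate.

    Assumes the counter only ever increases and wraps at `wrap`
    (64 bits here, which is an assumption, not something the
    original post specifies).
    """
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # Modular subtraction gives the right delta across a single wrap.
    delta = (curr.value - prev.value) % wrap
    return delta / elapsed


# Throughput: a byte counter sampled two seconds apart.
bytes_per_sec = rate(Sample(0.0, 100), Sample(2.0, 300))

# CPU time: a counter of CPU microseconds consumed. Dividing the rate
# by 1e6 gives seconds per second, i.e. the fraction of one CPU used.
cpu_fraction = rate(Sample(0.0, 0), Sample(10.0, 2_500_000)) / 1e6
```

The same counter serves monitors sampling at different intervals, since
each one divides by its own elapsed time, and the program being observed
never has to compute an average.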

-- 
James Peach | jpeach at sgi.com | SGI Australian Software Group
I don't speak for SGI.