[Samba] Troubleshooting poor (small) random read performance -- serverid.tdb?

Ray Van Dolson rvandolson at esri.com
Thu Jun 19 12:52:12 MDT 2014


On Thu, Jun 19, 2014 at 11:47:34AM +0200, Volker Lendecke wrote:
> On Wed, Jun 18, 2014 at 09:56:25AM -0700, Ray Van Dolson wrote:
> > No apparent change in behavior and strace on the smbd processes still
> > showed it spending most of its time dealing with the serverid.tdb file
> > -- high CPU time for those processes but very little disk I/O
> > resulting.  I'm guessing even if I moved the TDB files to an SSD things
> > wouldn't improve drastically.
> 
> Right. This is not about disk access.
> 
> > At this point we're moving to RHEL 7.0 as it includes Samba 4.1 (would
> > like to avoid a source-install if I can and RHEL6 only includes Samba
> > 4.0.x).  If things work better, perhaps we could push Red Hat to
> > backport the fixes you mentioned in Samba-master to their package?
> 
> Well, that's a bit of a stretch I guess. I doubt RedHat will
> port back those significant changes.
> 
> www.enterprisesamba.com have binaries for older RHEL
> versions. If you have a spare test machine, this could work
> for you even without RHEL 7.
> 
> Volker

Shoot, wish I'd done some Googling and stumbled across that prior to
bumping up to RHEL7!

With that said, here are a couple of updates:

All configs are using "default" options plus these:

    aio read size = 1
    aio write size = 1
    strict locking = no
    use sendfile = yes

- RHEL 7.0 with the stock Samba 4.1.1 - no significant improvement
  observed on the application side.  smbd processes still showed very
  high CPU utilization and system load went up to 10, but with minimal
  disk activity.  strace on the smbd processes looked significantly
  different from what I saw on 3.6.x.  My strace-fu wasn't strong
  enough to discern where most of the time was being spent.

- RHEL 7.0 with Samba 4.1.8 stolen from Fedora 20 - definite improvement
  observed.  System load stayed around 1.0 or lower, and on the
  application side we saw processing occurring at a pretty good clip,
  though not as good as Windows.  There are parent processing nodes and
  child processing nodes.  Some of the child nodes didn't get fully
  "spun up" and were complaining about being unable to access some of
  the data they needed, which led us to the next configuration.

- RHEL 7.0 with Samba 4.1.8 and "locking = no" set on the share from
  which data is being read.  Best results by far -- basically matched
  Windows performance.  All parent and child nodes had their CPUs
  fully maxed out, indicating they were able to get at the data they
  needed quickly.

I'm still short on having a complete grasp of what's happening on the
application side (the workload here is essentially creating GIS/imagery
"cache" in a distributed fashion), but it's clear that the combination
of Samba 4.1.8 and locking = no helped tremendously.
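
For anyone wanting to try the same combination, the resulting smb.conf
would look roughly like the sketch below.  The [global] options are the
ones listed above; the share name "imagery", its path, and
"read only = yes" are placeholders made up for illustration, not our
actual settings:

    [global]
        aio read size = 1
        aio write size = 1
        strict locking = no
        use sendfile = yes

    # Hypothetical share name and path -- substitute your own.
    [imagery]
        path = /srv/imagery
        # Assumption: clients only read from this share.
        read only = yes
        # Disables lock handling for this share only.
        locking = no

Note that the smb.conf man page cautions against disabling locking on
data that clients write to, since it can lead to corruption, so this is
probably only reasonable for a share that is effectively read-only.
Running "testparm -s" afterwards should confirm the file parses and show
the settings Samba actually picked up.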

We'll probably continue to run in this configuration until
enterprisesamba.com releases packages for RHEL7.

Interested in your feedback on our observations.  Would like to feed
our findings to our teams here so our customers wanting to use Samba
rather than Windows can benefit.

Thanks,
Ray

