[Samba] Troubleshooting poor (small) random read performance -- serverid.tdb?

Ray Van Dolson rvandolson at esri.com
Wed Jun 18 10:56:25 MDT 2014


On Wed, Jun 18, 2014 at 09:29:12AM +0200, Volker Lendecke wrote:
> On Tue, Jun 17, 2014 at 10:27:38PM -0700, Ray Van Dolson wrote:
> > Hi everyone;
> > 
> > Have a Windows 2012 based workload that generates many (20K+ PPS) small
> > reads (Wireshark tells me the individual packets are typically less
> > than 200 bytes and are Read AndX Request) from eight or so hosts to a
> > Samba server running on RHEL 6.5.  Samba version is the latest Red Hat
> > provided 3.6.9 version.
> > 
> > Things are going quite a bit slower than expected (as compared to the
> > same workload pointed at a Windows 2012 server).
> > 
> > iostat shows not much disk IO activity or iowait going on, but smbd
> > processes are all maxed out CPU wise.  An strace on them shows *lots*
> > of activity (almost exclusively) to the serverid.tdb file:
> > 
> > fcntl(16, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=636, len=1}) = 0
> > fcntl(16, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=636, len=1}) = 0
> > 
> > File handle 16 corresponds with the serverid.tdb file for the PID in
> > question.
> 
> A lot of accesses to serverid.tdb have been eliminated in
> Samba 4.1. In Samba master, we eliminated the requirement to
> do fcntls altogether by introducing mutex support to tdb.
> 
> > 
> > Have been searching around trying to find out what exactly it is that
> > the serverid.tdb file is used for but haven't found a great
> > explanation.  Am wondering if it has something to do with my use of
> > security ads and winbind?
> 
> No, that's not the cause.
> 
> > I'm hoping that if I can find a way to eliminate or optimize all of
> > this activity I can get better performance out of Samba and avoid
> > needing to shift to Windows.
> 
> If your workload does not do byte range locks and does not
> depend on those, you might give "strict locking = no" a
> quick try.
> 
> > Possibly Samba 4.x would work better?  Haven't yet tried.
> 
> Hopefully 4.1 would be better. Also, depending on the
> workload, enabling async I/O (aio read size = 1, aio write
> size = 1) might also improve things significantly.
> 
> > Also am unsure if this random read workload with very small
> > transactions will work well out of the box with Samba.
> > 
> > My config:
> > 
> > [global]
> >     workgroup = WORKGROUP
> >     password server = server, *
> >     realm = realm.com
> >     security = ads
> >     idmap uid = 10000-19999
> >     idmap gid = 10000-19999
> >     idmap config WORKGROUP:backend = rid
> >     idmap config WORKGROUP:range = 10000000-19999999
> >     template shell = /bin/bash
> >     winbind enum users = no
> >     winbind enum groups = no
> >     winbind separator = +
> >     winbind use default domain = yes
> >     winbind normalize names = yes
> >     template homedir = /home/%D/%U
> >     template shell = /bin/bash
> >     server string = Samba Server Version %v
> >     log file = /var/log/samba/log.%m
> >     log level = 1
> >     max log size = 50
> >     passdb backend = tdbsam
> >     load printers = no
> >     cups options = raw
> > 
> >     socket options = TCP_NODELAY SO_KEEPALIVE SO_RCVBUF=131072 SO_SNDBUF=131072 IPTOS_LOWDELAY
> 
> Please remove these settings if you are not running ancient
> AIX or so. Modern kernels are good at figuring out those
> settings themselves. In particular the SNDBUF/RCVBUF
> settings can cause harm.
> 
> If you are in a position to try 4.1, I would suggest doing
> so. 4.1 also has proper support for SMB2, which might also
> have a huge effect on overall performance.
> 
> Please keep us posted about your results. If nothing helps,
> don't give up but provide us with more input please :-)
> 
> Volker

Thanks for the reply.  Have tried enabling async I/O as well as
disabling strict locking (all of this on 3.6.9).  Also removed the
socket options per your suggestion. 

No apparent change in behavior: strace on the smbd processes still
shows them spending most of their time on the serverid.tdb file,
with high CPU time for those processes but very little disk I/O.
I'm guessing that even if I moved the TDB files to an SSD, things
wouldn't improve drastically.
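
For anyone who wants to quantify this from their own trace, here is a
minimal Python sketch that tallies fcntl byte-range lock calls per file
descriptor and lock type. It assumes strace output formatted like the
two fcntl lines quoted earlier in this thread; the function name
tally_locks and the regular expression are illustrative only (they are
not part of Samba or strace), and other strace versions may print the
flock fields differently.

```python
import re
from collections import Counter

# Assumed line format, taken from the strace output quoted in this thread:
# fcntl(16, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=636, len=1}) = 0
LOCK_RE = re.compile(
    r"fcntl\((?P<fd>\d+), F_SETLKW?, \{type=(?P<type>F_\w+), "
    r"whence=\w+, start=(?P<start>\d+), len=(?P<len>\d+)\}"
)


def tally_locks(lines):
    """Count fcntl lock calls, keyed by (file descriptor, lock type).

    Lines that do not match the assumed format are ignored.
    """
    counts = Counter()
    for line in lines:
        match = LOCK_RE.search(line)
        if match:
            counts[(int(match.group("fd")), match.group("type"))] += 1
    return counts
```

Feeding it the lines of a saved trace (for example, open("smbd.trace")
after running strace with -o smbd.trace) and printing
tally_locks(...).most_common() should show whether a single descriptor,
such as the one for serverid.tdb, accounts for most of the lock traffic.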

At this point we're moving to RHEL 7.0 as it includes Samba 4.1 (would
like to avoid a source-install if I can and RHEL6 only includes Samba
4.0.x).  If things work better, perhaps we could push Red Hat to
backport the fixes you mentioned in Samba master to their package?

Thanks,
Ray

