[Samba] Bulk smbcacls calls
info at whywouldwe.com
Wed Dec 11 03:34:28 MST 2013
That sounds like a great addition to smbcacls.
We played around with the source of smbcacls and found that a lot of the
time is spent converting the numeric user/group ids to their human
equivalents. E.g. if 1 file has 3 users and 5 groups, there are 8
requests to resolve the numeric user/group ids (1 request per conversion,
if I recall correctly). We realised that by repeatedly calling smbcacls
we were effectively looking up the same users and groups multiple times,
so we needed to cache the lookup results.
Then we found that we could get the acls with numeric output (similar
to smbcacls output but without the conversion to human form) using the
latest version of pysmbc from https://git.fedorahosted.org/cgit/pysmbc.git/
(the latest commits added the functionality we wanted, which wasn't in
the version from pypi). To get the human user/group representation we
run smbcacls, parse the output and store the result in a numeric -> human
map, so we make at most 1 request per new user/group encountered (a bit
hackish, but it works for us). It would be good to be able to make the
same lookup request that smbcacls makes to resolve a user/group id from
python; it would be a useful addition to pysmbc if the data is available
from libsmbclient. By doing it this way we've found that we can process
200-300 files per second in our setup (approx 13,000 files, not sure how
We scan to get all the individual file objects into our database, then
make 1 request per file to get the acls. Using a recursive version of
smbcacls and matching files in the output back to those in our db would
be awkward in our situation, especially if files are added or removed in
the period between the scan and the recursive smbcacls call.
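
For anyone wanting to try the same thing, the numeric -> human map
described above can be sketched roughly as below. This is only an
illustration, not our actual code: the class name SidNameCache, the
resolve callback and the sample ACL string are all made up for the
example, and the real lookup (running smbcacls and parsing its output)
is left to whatever function you pass in.

```python
import re

# Matches a Windows SID such as S-1-5-21-1-2-3-500 (assumed ACL id format).
SID_RE = re.compile(r"S-1-\d+(?:-\d+)+")


class SidNameCache:
    """Remember SID -> name results so each SID is resolved at most once."""

    def __init__(self, resolve):
        # `resolve` is a caller-supplied (hypothetical) function that takes
        # a SID string and returns its human-readable name, e.g. by running
        # smbcacls on a file and parsing the output.
        self._resolve = resolve
        self._names = {}
        self.lookups = 0  # number of real (uncached) lookups performed

    def name_for(self, sid):
        # Only call the expensive resolver the first time a SID is seen.
        if sid not in self._names:
            self._names[sid] = self._resolve(sid)
            self.lookups += 1
        return self._names[sid]

    def humanise(self, numeric_acl):
        """Replace every SID in a numeric ACL string with its cached name."""
        return SID_RE.sub(lambda m: self.name_for(m.group(0)), numeric_acl)


if __name__ == "__main__":
    # Stand-in resolver backed by a dict; a real one would query the server.
    known = {"S-1-5-21-1-2-3-500": "DOM\\admin",
             "S-1-5-21-1-2-3-513": "DOM\\users"}
    cache = SidNameCache(known.__getitem__)
    acl = "OWNER:S-1-5-21-1-2-3-500,GROUP:S-1-5-21-1-2-3-513"
    print(cache.humanise(acl))
    print(cache.humanise(acl))  # second call is served from the cache
    print("real lookups:", cache.lookups)
```

Passing the resolver in as a callback keeps the cache independent of how
the lookup is done, so it could be swapped for a direct libsmbclient
call if pysmbc ever exposes one.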
I welcome any comments regarding our approach.
I'll give your new version of smbc a go this afternoon if I get a chance.
On 11/12/2013 09:44, Noel Power wrote:
> On 29/11/13 10:05, Noel Power wrote:
>>>> Failing that, I assume that much of the time taken is spent on
>>>> authenticating the user/pw for each request. Would it be possible to
>>>> write something that keeps the connection so that multiple requests can
>>>> be made without reauthenticating (I'm not familiar with how
>>>> LDAP/AD/Samba works)? I have looked at the source of smbcacls but
>>>> nothing jumped out at me.
>>> Noel (cc'ed) just finished a bunch of changes adding inheritance
>>> propagation to smbcacls:
>>> It doesn't do recursion on --get, but the new code could certainly be
>>> leveraged to add this feature.
>> Yes unfortunately support for '--get' in a recursive fashion isn't
>> currently supported, it didn't quite fit with the inheritance
>> propagation feature, but... David is correct in that the new code could
>> definitely be leveraged to do that. However something like what you
>> require I believe needs another new cmdline switch e.g. something like
>> maybe "-r|-recursive". Also I wonder how the output of smbcacls
>> should look for such an operation as the existing output doesn't
>> actually mention the file/dir that is being processed.
>>  in general I do believe a pure recursive option could be useful ( at
>> least for --get, --chown, --chgrp )
> if you are up to building from source you could try my repo
> I've added there a '-r' switch.
> With that version built from source you can use '-r' with the new
> '--propagate-inheritance' switch e.g.
> smbcacls -r --propagate-inheritance --add|--modify|--set|--delete
> you can additionally use '-r' with the following operations --get,
> --chown & --chgrp
> note: you can't use -r with ' --add|--modify|--set|--delete' alone, if
> you want to use '-r' with (add/modify/delete...) you *must* additionally
> specify '--propagate-inheritance'
> I did some very very rough performance testing with a linux host running
> kvm with a winserver 2012 guest
> smbcacls -r --get -Uusername%password //guestIP/share /testdir > /dev/null
> returns in < 2 minutes ( testdir has 20,069 files in 2,842 directories )