[PATCH] ms_schema: fix python2.6 incompatibility

Douglas Bagnall douglas.bagnall at catalyst.net.nz
Thu Mar 15 22:54:12 UTC 2018


On 16/03/18 11:20, Bjoern Baumbach wrote:
> On 03/15/2018 09:23 PM, Douglas Bagnall wrote:
>>> -    entry = header + [x for x in entry if x[0].lower() not in {'dn', 'changetype', 'objectcategory'}]
>>> +    entry = header + [x for x in entry if x[0].lower() not in set(['dn', 'changetype', 'objectcategory'])]
>>>  
>> In this case with three members it would be more efficient and 
>> line-length-compliant to just use a list or tuple:
>>
>> +    entry = header + [x for x in entry if x[0].lower() not in ('dn', 'changetype', 'objectcategory')]
> 
> I don't know which way is right. I googled it and read that a set()
> might be faster, but I don't know the details.
> According to my tests {} creates a 'set', so I translated the changes
> to set(), as seen in many other places in the samba code.
> Without googling the other options, I would just replace the {} with
> []. But I think this is a challenge for a real python user, who knows
> what he is doing.
> 
> There is also a suggested patch attached to the bug report by Alexey
> Vekshin. Please let me know what the correct fix is (and maybe why).

Sorry. They are all correct. I meant it when I said RB+.
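
To make "all correct" concrete, here is a minimal sketch showing that the
tuple, list and set([]) variants select the same items. The header and
entry values are made up for illustration and are not taken from
ms_schema; only the filter expression comes from the patch. The {...}
form is left out because a set literal is a SyntaxError on Python 2.6.

```python
# Hypothetical stand-ins for the real values in ms_schema.
header = [("dn", "CN=Example"), ("changetype", "add")]
entry = [
    ("dn", "CN=Example"),
    ("changetype", "add"),
    ("objectCategory", "CN=Attribute-Schema"),
    ("cn", "Example"),
    ("lDAPDisplayName", "example"),
]

SKIP_TUPLE = ("dn", "changetype", "objectcategory")
SKIP_LIST = ["dn", "changetype", "objectcategory"]
SKIP_SET = set(["dn", "changetype", "objectcategory"])  # valid on Python 2.6


def filtered(skip):
    """Return header plus every entry item whose name is not in skip."""
    return header + [x for x in entry if x[0].lower() not in skip]


# One result per container type; all three should be identical.
results = [filtered(s) for s in (SKIP_TUPLE, SKIP_LIST, SKIP_SET)]
print(results[0])
```

The choice of container only affects speed, not which items survive.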

Now, having established that it doesn't matter at all, and since you sort
of asked, there is a simple way to check these things in Python:

Python 2.7, with very approximately the same problem:

$ python -m timeit -s 'c = list(range(10))' '[x for x in c if x in {2,5,7}]' 
1000000 loops, best of 3: 1.03 usec per loop
$ python -m timeit -s 'c = list(range(10))' '[x for x in c if x in [2,5,7]]' 
1000000 loops, best of 3: 0.456 usec per loop
$ python -m timeit -s 'c = list(range(10))' '[x for x in c if x in (2,5,7)]' 
1000000 loops, best of 3: 0.457 usec per loop
$ python -m timeit -s 'c = list(range(10))' '[x for x in c if x in set([2,5,7])]' 
100000 loops, best of 3: 2.05 usec per loop

The set([]) is slowest because each membership test has to make a list,
then a set, then throw away the list. [] is quick when you have three
things, because it has to do a maximum of three compares. If we had
more items we would see that sets are quicker in general.
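
To see that crossover for yourself, something like the following sketch
could be used. It drives the same timeit module from Python instead of
the shell. The function name time_membership and the sizes 3, 30 and 300
are arbitrary choices for illustration. The containers are built once in
the setup code, so only the lookups are timed, which is a slightly
different question from the one-liners above.

```python
import timeit


def time_membership(n_members, number=2000):
    """Time 'x in container' for a list, a tuple and a set of n_members ints.

    Returns a dict mapping container name to the best of three timings
    (in seconds) for `number` runs of the comprehension.
    """
    setup = (
        "members = list(range({n}))\n"
        "as_list = list(members)\n"
        "as_tuple = tuple(members)\n"
        "as_set = set(members)\n"
        "probes = list(range(-{n}, {n}))\n"  # half hits, half misses
    ).format(n=n_members)
    timings = {}
    for name in ("as_list", "as_tuple", "as_set"):
        stmt = "[x for x in probes if x in {0}]".format(name)
        timings[name] = min(
            timeit.repeat(stmt, setup=setup, repeat=3, number=number)
        )
    return timings


if __name__ == "__main__":
    for n in (3, 30, 300):
        print(n, time_membership(n, number=20))
```

The absolute numbers will depend on the machine and the Python version,
so treat them as a rough comparison only.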

Python 3.6:

$ python3 -m timeit -s 'c = list(range(10))' '[x for x in c if x in {2,5,7}]' 
1000000 loops, best of 3: 0.424 usec per loop
$ python3 -m timeit -s 'c = list(range(10))' '[x for x in c if x in [2,5,7]]' 
1000000 loops, best of 3: 0.603 usec per loop
$ python3 -m timeit -s 'c = list(range(10))' '[x for x in c if x in (2,5,7)]' 
1000000 loops, best of 3: 0.606 usec per loop
$ python3 -m timeit -s 'c = list(range(10))' '[x for x in c if x in set([2,5,7])]' 
100000 loops, best of 3: 2.51 usec per loop

The interesting thing here is that {} is quickest, which I didn't expect.
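
A likely explanation, offered as a sketch rather than something measured
in this thread: since Python 3.2, CPython recognises a membership test
against a set literal of constants and stores a pre-built frozenset in
the compiled code, so nothing is rebuilt per iteration. A list of
constants in the same position is stored as a tuple. This is a CPython
implementation detail, and the helper name constant_containers is made
up for this example.

```python
def constant_containers(source):
    """Return the tuple/frozenset constants compiled into an expression."""
    code = compile(source, "<example>", "eval")
    return [c for c in code.co_consts if isinstance(c, (tuple, frozenset))]


for src in (
    "x in {2, 5, 7}",
    "x in [2, 5, 7]",
    "x in (2, 5, 7)",
    "x in set([2, 5, 7])",
):
    # What gets printed differs between CPython versions.
    print(src, "->", constant_containers(src))
```

On CPython 3 the first expression should show a frozenset constant, which
would account for {} matching or beating the tuple. The set([]) call can
never be folded this way, because set is an ordinary name looked up at
run time.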

Douglas


