npower-python3-gpo-fix review

Fri Oct 26 08:17:40 UTC 2018

On 26/10/2018 02:48, Douglas Bagnall wrote:
> We are talking about this:
> https://gitlab.com/samba-team/samba/merge_requests/69.patch
>
>>  From 574193df283e73311f470c2c7ec4414048a4321f Mon Sep 17 00:00:00 2001
>> From: Noel Power <noel.power at suse.com>
>> Date: Wed, 5 Sep 2018 12:36:00 +0100
>> Subject: [PATCH 01/11] python/samba/gp_parse: PY2/PY3 compat changes for
>>   __init__.py
>>
>> Fixes.
>>
>> 1) sorting of xml.etree.ElementTree.Element, in PY2 sort
>>     seems to sort lists of these. In PY3 this no longer works.
>>     Choosing tag as the sort key for py3 so at least in python3
>>     there is a consistent sort (probably won't match how it is
>>     sorted in PY2 but nothing seems to depend on that)
>> 2) md5 requires bytes
>> 3) tostring returns bytes in PY3, adjust code for that
>>
>> Signed-off-by: Noel Power <noel.power at suse.com>
>> ---
>
>> @@ -171,7 +172,7 @@ class GPParser(object):
>>           output_xml = tostring(root)
>>   
>>           for ent in entities:
>> -            output_xml = output_xml.replace(ent[0].replace('&', '&'), ent[0])
>> +            output_xml = output_xml.replace(get_bytes(ent[0]).replace(b'&', b'&'), get_bytes(ent[0]))
>>   
> Here I would prefer something like
>
>           for ent in entities:
> -            output_xml = output_xml.replace(get_bytes(ent[0]).replace(b'&', b'&'), get_bytes(ent[0]))
> +            entb = get_bytes(ent[0])
> +            output_xml = output_xml.replace(entb.replace(b'&', b'&'), entb)
>
> (line length, simplicity).
yeah that's fine with me
>
>
>> Subject: [PATCH 03/11] python/samba/gp_parse: PY3 open file non-binary mode
>>   for write_binary
>>
>> Although this is unintuitive it's because we are writing unicode
>> not bytes (both in PY2 & PY3). using the 'b' mode causes an error in
>> PY3
>> Signed-off-by: Noel Power <noel.power at suse.com>
>> ---
>>   python/samba/gp_parse/gp_csv.py | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/python/samba/gp_parse/gp_csv.py b/python/samba/gp_parse/gp_csv.py
>> index cd30ef2bc00..e9f5a14a4d1 100644
>> --- a/python/samba/gp_parse/gp_csv.py
>> +++ b/python/samba/gp_parse/gp_csv.py
>> @@ -93,7 +93,7 @@ class GPAuditCsvParser(GPParser):
>>                   self.lines.append(line)
>>   
>>       def write_binary(self, filename):
>> -        with open(filename, 'wb') as f:
>> +        with open(filename, 'w') as f:
>>               # This should be using a unicode writer, but it seems to be in the
>>               # right encoding at least by default.
>>               #
> OK, yes, it is unintuitive. It might also be wrong for some people.
>
> https://docs.python.org/3/library/functions.html#open says:
>
> "In text mode, if encoding is not specified the encoding used is platform dependent:
> locale.getpreferredencoding(False) is called to get the current locale encoding"
>
> In Python 3 we can do this:
>
>       def write_binary(self, filename):
> -        with open(filename, 'wb') as f:
> -            # This should be using a unicode writer, but it seems to be in the
> -            # right encoding at least by default.
> -            #
> -            # writer = UnicodeWriter(f, quoting=csv.QUOTE_MINIMAL)
> +        with open(filename, 'w', encoding=self.output_encoding) as f:
> +            # In this case "binary" means "utf-8", so we let Python do that.
>               writer = csv.writer(f, quoting=csv.QUOTE_MINIMAL)
>               writer.writerow(self.header)
>               for line in self.lines:
>
> but not in 2 (no encoding keyword).
hmm maybe using io.open then
>
> I've pushed a hacky version of this without reviewed-by, and I hope we can
> come up with something better that still allows that lovely
> "remove unused code" patch. Maybe codecs.open() would work here.
>
> (I bet we have no tests for anything for people with non-utf-8 locales).
most likely :-)
>
>
>> Subject: [PATCH 07/11] python/samba/netcmd: misc PY2/PY2 compat changes for
>                                                     ^^^^^^^
> That's an easier target, but let's stick with PY2/PY3.
hehe indeed
>
>> @@ -399,9 +399,9 @@ class cmd_listall(Command):
>>           msg = get_gpo_info(self.samdb, None)
>>   
>>           for m in msg:
>> -            self.outf.write("GPO          : %s\n" % m['name'][0])
>> -            self.outf.write("display name : %s\n" % m['displayName'][0])
>> -            self.outf.write("path         : %s\n" % m['gPCFileSysPath'][0])
>> +            self.outf.write("GPO          : %s\n" % str(m['name'][0]))
>> +            self.outf.write("display name : %s\n" % str(m['displayName'][0]))
>> +            self.outf.write("path         : %s\n" % str(m['gPCFileSysPath'][0]))
>>               self.outf.write("dn           : %s\n" % m.dn)
>
> I know I should know this by now, but why is this necessary?
> Doesn't '"%s" % x' call 'x.__str__' just the same as 'str(x)'?
I confess I don't know whether I had a reason for doing this  or whether 
it was habit, you would *think* str(m['name'][0]) etc, would get 
automagically called, I'll have a look later
>
> I have pushed a slightly modified, rebased, mostly RB+ed version to
> https://gitlab.com/samba-team/devel/samba/commits/dbagnall-npower-python3-gpo-fix

thanks for looking, be good to get something in here as hopefully we 
close in on a pure python3 build

Noel