npower-python3-gpo-fix review

Fri Oct 26 01:48:13 UTC 2018

We are talking about this:
https://gitlab.com/samba-team/samba/merge_requests/69.patch

> From 574193df283e73311f470c2c7ec4414048a4321f Mon Sep 17 00:00:00 2001
> From: Noel Power <noel.power at suse.com>
> Date: Wed, 5 Sep 2018 12:36:00 +0100
> Subject: [PATCH 01/11] python/samba/gp_parse: PY2/PY3 compat changes for
>  __init__.py
> 
> Fixes.
> 
> 1) sorting of xml.etree.ElementTree.Element, in PY2 sort
>    seems to sort lists of these. In PY3 this no longer works.
>    Choosing tag as the sort key for py3 so at least in python3
>    there is a consistent sort (probably won't match how it is
>    sorted in PY2 but nothing seems to depend on that)
> 2) md5 requires bytes
> 3) tostring returns bytes in PY3, adjust code for that
> 
> Signed-off-by: Noel Power <noel.power at suse.com>
> ---

> @@ -171,7 +172,7 @@ class GPParser(object):
>          output_xml = tostring(root)
>  
>          for ent in entities:
> -            output_xml = output_xml.replace(ent[0].replace('&', '&'), ent[0])
> +            output_xml = output_xml.replace(get_bytes(ent[0]).replace(b'&', b'&'), get_bytes(ent[0]))
>  

Here I would prefer something like

         for ent in entities:
-            output_xml = output_xml.replace(get_bytes(ent[0]).replace(b'&', b'&'), get_bytes(ent[0]))
+            entb = get_bytes(ent[0])
+            output_xml = output_xml.replace(entb.replace(b'&', b'&'), entb)

(line length, simplicity).

> Subject: [PATCH 03/11] python/samba/gp_parse: PY3 open file non-binary mode
>  for write_binary
> 
> Although this is unintuitive it's because we are writing unicode
> not bytes (both in PY2 & PY3). using the 'b' mode causes an error in
> PY3
> Signed-off-by: Noel Power <noel.power at suse.com>
> ---
>  python/samba/gp_parse/gp_csv.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/python/samba/gp_parse/gp_csv.py b/python/samba/gp_parse/gp_csv.py
> index cd30ef2bc00..e9f5a14a4d1 100644
> --- a/python/samba/gp_parse/gp_csv.py
> +++ b/python/samba/gp_parse/gp_csv.py
> @@ -93,7 +93,7 @@ class GPAuditCsvParser(GPParser):
>                  self.lines.append(line)
>  
>      def write_binary(self, filename):
> -        with open(filename, 'wb') as f:
> +        with open(filename, 'w') as f:
>              # This should be using a unicode writer, but it seems to be in the
>              # right encoding at least by default.
>              #

OK, yes, it is unintuitive. It might also be wrong for some people.

https://docs.python.org/3/library/functions.html#open says:

"In text mode, if encoding is not specified the encoding used is platform dependent:
locale.getpreferredencoding(False) is called to get the current locale encoding"

In Python 3 we can do this:

     def write_binary(self, filename):
-        with open(filename, 'wb') as f:
-            # This should be using a unicode writer, but it seems to be in the
-            # right encoding at least by default.
-            #
-            # writer = UnicodeWriter(f, quoting=csv.QUOTE_MINIMAL)
+        with open(filename, 'w', encoding=self.output_encoding) as f:
+            # In this case "binary" means "utf-8", so we let Python do that.
             writer = csv.writer(f, quoting=csv.QUOTE_MINIMAL)
             writer.writerow(self.header)
             for line in self.lines:

but not in 2 (no encoding keyword).

I've pushed a hacky version of this without reviewed-by, and I hope we can
come up with something better that still allows that lovely
"remove unused code" patch. Maybe codecs.open() would work here.

(I bet we have no tests for anything for people with non-utf-8 locales).

> Subject: [PATCH 07/11] python/samba/netcmd: misc PY2/PY2 compat changes for
                                                   ^^^^^^^
That's an easier target, but let's stick with PY2/PY3.

> @@ -399,9 +399,9 @@ class cmd_listall(Command):
>          msg = get_gpo_info(self.samdb, None)
>  
>          for m in msg:
> -            self.outf.write("GPO          : %s\n" % m['name'][0])
> -            self.outf.write("display name : %s\n" % m['displayName'][0])
> -            self.outf.write("path         : %s\n" % m['gPCFileSysPath'][0])
> +            self.outf.write("GPO          : %s\n" % str(m['name'][0]))
> +            self.outf.write("display name : %s\n" % str(m['displayName'][0]))
> +            self.outf.write("path         : %s\n" % str(m['gPCFileSysPath'][0]))
>              self.outf.write("dn           : %s\n" % m.dn)

I know I should know this by now, but why is this necessary?
Doesn't '"%s" % x' call 'x.__str__' just the same as 'str(x)'?

I have pushed a slightly modified, rebased, mostly RB+ed version to
https://gitlab.com/samba-team/devel/samba/commits/dbagnall-npower-python3-gpo-fix

cheers,
Douglas