One hundred python patches

Douglas Bagnall douglas.bagnall at catalyst.net.nz
Fri Oct 12 00:46:37 UTC 2018


I have about 100 patches to python files here:

https://gitlab.com/samba-team/devel/samba/commits/pyflake3-fixes

They are WIP. Most of them are probably correct, though collectively
they don't like to pass CI.

Many are aimed at Python 3, though many are fixing plain coding
errors. For example there are a few of this kind of thing

-                sel.fail("No exception should get %s" % msg_exp)
+                self.fail("No exception should get %s" % msg_exp)

in code paths we can assume are rarely travelled, and there are quite
a few examples of this:

     def test_search_non_mapped(self):
         """Looking up by non-mapped attribute"""
         msg = self.ldb.search(expression="(cn=Administrator)")
         self.assertEquals(len(msg), 1)
         self.assertEquals(msg[0]["cn"], "Administrator")

-    def test_search_non_mapped(self):
+    def test_search_mapped(self):
         """Looking up by mapped attribute"""
         msg = self.ldb.search(expression="(name=Backup Operators)")
         self.assertEquals(len(msg), 1)
         self.assertEquals(str(msg[0]["name"]), "Backup Operators")

what we have had there since 2007 is two test methods called
"test_search_non_mapped", meaning only the second one has ever run.
Unshadowing these secret tests is a great way to increase coverage,
though unfortunately not all of them pass. Sometimes both tests are
identical, so we can easily get rid of one, and sometimes they seem
contradictory, like this in lib/ldb/tests/python/api.py:

    def test_get_casefold(self):
        x = ldb.Dn(self.ldb, "dc=foo14,bar=bloe")
        self.assertEqual(x.get_casefold(), "DC=FOO14,BAR=bloe")

[...]

    def test_get_casefold(self):
        dn = ldb.Dn(self.ldb, "cn=foo,dc=base")
        self.assertEqual(dn.get_casefold(), 'CN=FOO,DC=BASE')


There are also a heap of boring patches reducing unused imports,
removing unused variables, and updating the print syntax.

I found these problems like so:

git ls-files | xargs file -i | grep text/x-python |cut -d: -f 1 > all-python-files
grep -v third_party all-python-files > non-3rd-party-python-files
grep -v 'heimdal'  non-3rd-party-python-files > non-heimdal-python-files
grep -v 'wscript'  non-heimdal-python-files > non-wscript-python-files
wc non-wscript-python-files # 400

xargs pyflakes3 < non-wscript-python-files

Then I started winnowing down the error classes as I got sick of each
one:

xargs pyflakes3 < non-wscript-python-files | grep -v imported
xargs pyflakes3 < non-wscript-python-files | grep -v imported | grep -v 'but never used'
xargs pyflakes3 < non-wscript-python-files | grep -v imported | grep -v 'but never used' | grep -v 'star import'
[etc]

Pyflakes is a very basic static analyser, not a style nit-picker, and
pyflakes3 uses Python 3 syntax.

There are many effective false positives for local variables that are
assigned but not used, from patterns like this:

  def test_x(self):
      try:
          res = ldb.search(...)
      except:
          fail...

where we strictly don't need "res = " but it isn't hurting. I *would*
like to reduce the number of unnecessary imports but it is not a small
job.

Douglas



More information about the samba-technical mailing list