[PATCH] ldb: The test api.py should not rely on order of entries in

Tue Jan 22 14:00:00 UTC 2019

On (22/01/19 14:20), Andrew Bartlett via samba-technical wrote:
>On Mon, 2019-01-21 at 22:33 +0100, Lukas Slebodnik wrote:
>> On (22/01/19 09:19), Andrew Bartlett via samba-technical wrote:
>> > 
>> > On Mon, 2019-01-21 at 19:50 +0100, Lukas Slebodnik via samba-
>> > technical
>> > wrote:
>> > > 
>> > >  
>> > > I spent more time debugging it on s390x and I can say that there
>> > > is
>> > > not any
>> > > issue with passing env PYTHONHASHSEED to the test.
>> > > 
>> > > The problem is with the value of env PYTHONHASHSEED.
>> > > 
>> > > It deterministically fails with the value 1.
>> > > 
>> > > So is there any reason why you insist on the value 1.
>> > > IMHO value 0 for PYTHONHASHSEED is perfectly valid and works on
>> > > all
>> > > architectures in fedora
>> > > https://koji.fedoraproject.org/koji/buildinfo?buildID=1180702
>> > > 
>> > > I can send a patch to change all values from 1 -> 0 if there are
>> > > not
>> > > any
>> > > objection.
>> > I would really prefer to understand where the issue really comes
>> > from
>> > if at all possible, this is just getting even more strange and
>> > changing
>> > magic numbers.  Even specifying PYTHONHASHSEED was meant to be a
>> > very
>> > short term hack, not a long-term solution, but so far a proper
>> > solution
>> > has been elusive...
>> > 
>> I have no idea why PYTHONHASHSEED was added in commit 19a4d3ca692e
>
>'make' would rebuild everything each time as the dependency tracker
>failed to agree the dependencies were unchanged. 
>
>> But following test failed for me when packaging libldb
>> 
>>     def test_repr(self):
>>         self.msg.dn = ldb.Dn(ldb.Ldb(), "dc=foo29")
>>         self.msg["dc"] = b"foo"
>>         if PY3:
>>             self.assertIn(repr(self.msg), [
>>                 "Message({'dn': Dn('dc=foo29'), 'dc':
>> MessageElement([b'foo'])})",
>>                 "Message({'dc': MessageElement([b'foo']), 'dn':
>> Dn('dc=foo29')})",
>>             ])
>>             self.assertIn(repr(self.msg.text), [
>>                 "Message({'dn': Dn('dc=foo29'), 'dc':
>> MessageElement([b'foo'])}).text",
>>                 "Message({'dc': MessageElement([b'foo']), 'dn':
>> Dn('dc=foo29')}).text",
>>             ])
>>         else:
>>             self.assertEqual(
>>                 repr(self.msg),
>>                 "Message({'dn': Dn('dc=foo29'), 'dc':
>> MessageElement(['foo'])})")
>>             self.assertEqual(
>>                 repr(self.msg.text),
>>                 "Message({'dn': Dn('dc=foo29'), 'dc':
>> MessageElement(['foo'])}).text")
>> 
>
>That should just be changed to be a deterministic test at the api.py
>level.  That is just a bug. 
>
>>  
>> =====================================================================
>> =
>>   FAIL: test_repr (__main__.LdbMsgTests)
>>   -----------------------------------------------------------------
>> -----
>>   Traceback (most recent call last):
>>     File "tests/python/api.py", line 2322, in test_repr
>>       "Message({'dn': Dn('dc=foo29'), 'dc':
>> MessageElement(['foo'])})")
>>   AssertionError: "Message({'dc': MessageElement(['foo']), 'dn':
>> Dn('dc=foo29')})" != "Message({'dn': Dn('dc=foo29'), 'dc':
>> MessageElement(['foo'])})"
>> 
>> So the issue is that implementation of py_ldb_msg_repr is not
>> deterministic
>> and depends on the ordering of values in dictionary.
>
>Sure, can either come up with a looser test or a stricter repr.  Is
>repr() typically deterministic?
>
>> static PyObject *py_ldb_msg_repr(PyLdbMessageObject *self)
>> {
>>     PyObject *dict = PyDict_New(), *ret, *repr;
>>     if (PyDict_Update(dict, (PyObject *)self) != 0)
>>         return NULL;
>>     repr = PyObject_Repr(dict);
>>     if (repr == NULL) {
>>         Py_DECREF(dict);
>>         return NULL;
>>     }
>>     ret = PyStr_FromFormat("Message(%s)", PyStr_AsUTF8(repr));
>>     Py_DECREF(repr);
>>     Py_DECREF(dict);
>>     return ret;
>> }
>> 
>> And order of values in dictionary depends on hash randomization which
>> can be
>> influenced by env PYTHONHASHSEED.
>
>I agree. 
>
>> I briefly checked C-api for dictionary and I was not able to find
>> simple solution to have a deterministic order for dictionary.
>> https://docs.python.org/2.7/c-api/dict.html
>> 
>> But you might know better why the env variable was added with such
>> value.
>
>For unrelated reasons (waf...), which is why I would like to get rid of
>it, and this only makes me want to get rid of it sooner.  
>
>I hope this clarifies things,
>
Thank you very much for explanation.

I've just realized that unreliable order of entries is already solved
in the test for python3

    def test_repr(self):
        self.msg.dn = ldb.Dn(ldb.Ldb(), "dc=foo29")
        self.msg["dc"] = b"foo"
        if PY3:
            self.assertIn(repr(self.msg), [
                "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement([b'foo'])})",
                "Message({'dc': MessageElement([b'foo']), 'dn': Dn('dc=foo29')})",
            ])
            self.assertIn(repr(self.msg.text), [
                "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement([b'foo'])}).text",
                "Message({'dc': MessageElement([b'foo']), 'dn': Dn('dc=foo29')}).text",
            ])

Therefore we can use the same approach for python2. There will be just
difference between bytes and strings.

New patch atached.

LS
-------------- next part --------------
From bf8dca7ba72ffcdf8a49ad33c72ebeed337a8531 Mon Sep 17 00:00:00 2001
From: Lukas Slebodnik <lslebodn at fedoraproject.org>
Date: Tue, 22 Jan 2019 10:04:02 +0100
Subject: [PATCH] ldb: The test api.py should not rely on order of entries in
 dict

Test failed on s390x but there is a simple reproducer for any
architecture.

The built-in function repr returns the canonical string representation
of the object. We needn't care about order attributes in string
representation. Therefore test should pass for any order.

    for i in {1..30}; do
        PYTHONHASHSEED=random \
        python2 -c 'import ldb; msg = ldb.Message(); msg.dn = ldb.Dn(ldb.Ldb(), "dc=foo29"); msg["dc"] = b"foo"; print(repr(msg)) '
    done

  ======================================================================
  FAIL: test_repr (__main__.LdbMsgTests)
  ----------------------------------------------------------------------
  Traceback (most recent call last):
    File "tests/python/api.py", line 2322, in test_repr
      "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement(['foo'])})")
  AssertionError: "Message({'dc': MessageElement(['foo']), 'dn': Dn('dc=foo29')})" != "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement(['foo'])})"
  ----------------------------------------------------------------------
  Ran 1025 tests in 29.146s
  FAILED (failures=1)

Signed-off-by: Lukas Slebodnik <lslebodn at fedoraproject.org>
---
 lib/ldb/tests/python/api.py | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/lib/ldb/tests/python/api.py b/lib/ldb/tests/python/api.py
index 1d9f33f8f73f3ed0e15722a47147f00ee86c3631..e8826b5af3b30cf761b55e36f119385b648c5664 100755
--- a/lib/ldb/tests/python/api.py
+++ b/lib/ldb/tests/python/api.py
@@ -2317,12 +2317,14 @@ class LdbMsgTests(TestCase):
                 "Message({'dc': MessageElement([b'foo']), 'dn': Dn('dc=foo29')}).text",
             ])
         else:
-            self.assertEqual(
-                repr(self.msg),
-                "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement(['foo'])})")
-            self.assertEqual(
-                repr(self.msg.text),
-                "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement(['foo'])}).text")
+            self.assertIn(repr(self.msg), [
+                "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement(['foo'])})",
+                "Message({'dc': MessageElement(['foo']), 'dn': Dn('dc=foo29')})",
+            ])
+            self.assertIn(repr(self.msg.text), [
+                "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement(['foo'])}).text",
+                "Message({'dc': MessageElement(['foo']), 'dn': Dn('dc=foo29')}).text",
+            ])
 
     def test_len(self):
         self.assertEqual(0, len(self.msg))
-- 
2.20.1