[PATCH] ldb: Use deterministic order of dict for tests

Mon Jan 21 21:33:37 UTC 2019

On (22/01/19 09:19), Andrew Bartlett via samba-technical wrote:
>On Mon, 2019-01-21 at 19:50 +0100, Lukas Slebodnik via samba-technical
>wrote:
>> 
>> I spent more time debugging it on s390x and I can say that there is
>> not any
>> issue with passing env PYTHONHASHSEED to the test.
>> 
>> The problem is with the value of env PYTHONHASHSEED.
>> 
>> It deterministically fails with the value 1.
>> 
>> So is there any reason why you insist on the value 1.
>> IMHO value 0 for PYTHONHASHSEED is perfectly valid and works on all
>> architectures in fedora
>> https://koji.fedoraproject.org/koji/buildinfo?buildID=1180702
>> 
>> I can send a patch to change all values from 1 -> 0 if there are not
>> any
>> objection.
>
>I would really prefer to understand where the issue really comes from
>if at all possible, this is just getting even more strange and changing
>magic numbers.  Even specifying PYTHONHASHSEED was meant to be a very
>short term hack, not a long-term solution, but so far a proper solution
>has been elusive...
>

I have no idea why PYTHONHASHSEED was added in commit 19a4d3ca692e

But following test failed for me when packaging libldb

    def test_repr(self):
        self.msg.dn = ldb.Dn(ldb.Ldb(), "dc=foo29")
        self.msg["dc"] = b"foo"
        if PY3:
            self.assertIn(repr(self.msg), [
                "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement([b'foo'])})",
                "Message({'dc': MessageElement([b'foo']), 'dn': Dn('dc=foo29')})",
            ])
            self.assertIn(repr(self.msg.text), [
                "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement([b'foo'])}).text",
                "Message({'dc': MessageElement([b'foo']), 'dn': Dn('dc=foo29')}).text",
            ])
        else:
            self.assertEqual(
                repr(self.msg),
                "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement(['foo'])})")
            self.assertEqual(
                repr(self.msg.text),
                "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement(['foo'])}).text")

  ======================================================================
  FAIL: test_repr (__main__.LdbMsgTests)
  ----------------------------------------------------------------------
  Traceback (most recent call last):
    File "tests/python/api.py", line 2322, in test_repr
      "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement(['foo'])})")
  AssertionError: "Message({'dc': MessageElement(['foo']), 'dn': Dn('dc=foo29')})" != "Message({'dn': Dn('dc=foo29'), 'dc': MessageElement(['foo'])})"

So the issue is that implementation of py_ldb_msg_repr is not deterministic
and depends on the ordering of values in dictionary.

static PyObject *py_ldb_msg_repr(PyLdbMessageObject *self)
{
    PyObject *dict = PyDict_New(), *ret, *repr;
    if (PyDict_Update(dict, (PyObject *)self) != 0)
        return NULL;
    repr = PyObject_Repr(dict);
    if (repr == NULL) {
        Py_DECREF(dict);
        return NULL;
    }
    ret = PyStr_FromFormat("Message(%s)", PyStr_AsUTF8(repr));
    Py_DECREF(repr);
    Py_DECREF(dict);
    return ret;
}

And order of values in dictionary depends on hash randomization which can be
influenced by env PYTHONHASHSEED.

I briefly checked C-api for dictionary and I was not able to find
simple solution to have a deterministic order for dictionary.
https://docs.python.org/2.7/c-api/dict.html

But you might know better why the env variable was added with such value.

LS