[PATCH] ldb: Use deterministic order of dict for tests

Andrew Bartlett abartlet at samba.org
Tue Jan 22 01:20:57 UTC 2019


On Mon, 2019-01-21 at 22:33 +0100, Lukas Slebodnik wrote:
> On (22/01/19 09:19), Andrew Bartlett via samba-technical wrote:
> > 
> > On Mon, 2019-01-21 at 19:50 +0100, Lukas Slebodnik via samba-
> > technical
> > wrote:
> > > 
> > >  
> > > I spent more time debugging it on s390x and I can say that there
> > > is
> > > not any
> > > issue with passing env PYTHONHASHSEED to the test.
> > > 
> > > The problem is with the value of env PYTHONHASHSEED.
> > > 
> > > It deterministically fails with the value 1.
> > > 
> > > So is there any reason why you insist on the value 1.
> > > IMHO value 0 for PYTHONHASHSEED is perfectly valid and works on
> > > all
> > > architectures in fedora
> > > https://koji.fedoraproject.org/koji/buildinfo?buildID=1180702
> > > 
> > > I can send a patch to change all values from 1 -> 0 if there are
> > > not
> > > any
> > > objection.
> > I would really prefer to understand where the issue really comes
> > from
> > if at all possible, this is just getting even more strange and
> > changing
> > magic numbers.  Even specifying PYTHONHASHSEED was meant to be a
> > very
> > short term hack, not a long-term solution, but so far a proper
> > solution
> > has been elusive...
> > 
> I have no idea why PYTHONHASHSEED was added in commit 19a4d3ca692e

'make' would rebuild everything each time as the dependency tracker
failed to agree the dependencies were unchanged. 

> But following test failed for me when packaging libldb
> 
>     def test_repr(self):
>         self.msg.dn = ldb.Dn(ldb.Ldb(), "dc=foo29")
>         self.msg["dc"] = b"foo"
>         if PY3:
>             self.assertIn(repr(self.msg), [
>                 "Message({'dn': Dn('dc=foo29'), 'dc':
> MessageElement([b'foo'])})",
>                 "Message({'dc': MessageElement([b'foo']), 'dn':
> Dn('dc=foo29')})",
>             ])
>             self.assertIn(repr(self.msg.text), [
>                 "Message({'dn': Dn('dc=foo29'), 'dc':
> MessageElement([b'foo'])}).text",
>                 "Message({'dc': MessageElement([b'foo']), 'dn':
> Dn('dc=foo29')}).text",
>             ])
>         else:
>             self.assertEqual(
>                 repr(self.msg),
>                 "Message({'dn': Dn('dc=foo29'), 'dc':
> MessageElement(['foo'])})")
>             self.assertEqual(
>                 repr(self.msg.text),
>                 "Message({'dn': Dn('dc=foo29'), 'dc':
> MessageElement(['foo'])}).text")
> 

That should just be changed to be a deterministic test at the api.py
level.  That is just a bug. 

>  
> =====================================================================
> =
>   FAIL: test_repr (__main__.LdbMsgTests)
>   -----------------------------------------------------------------
> -----
>   Traceback (most recent call last):
>     File "tests/python/api.py", line 2322, in test_repr
>       "Message({'dn': Dn('dc=foo29'), 'dc':
> MessageElement(['foo'])})")
>   AssertionError: "Message({'dc': MessageElement(['foo']), 'dn':
> Dn('dc=foo29')})" != "Message({'dn': Dn('dc=foo29'), 'dc':
> MessageElement(['foo'])})"
> 
> So the issue is that implementation of py_ldb_msg_repr is not
> deterministic
> and depends on the ordering of values in dictionary.

Sure, can either come up with a looser test or a stricter repr.  Is
repr() typically deterministic?

> static PyObject *py_ldb_msg_repr(PyLdbMessageObject *self)
> {
>     PyObject *dict = PyDict_New(), *ret, *repr;
>     if (PyDict_Update(dict, (PyObject *)self) != 0)
>         return NULL;
>     repr = PyObject_Repr(dict);
>     if (repr == NULL) {
>         Py_DECREF(dict);
>         return NULL;
>     }
>     ret = PyStr_FromFormat("Message(%s)", PyStr_AsUTF8(repr));
>     Py_DECREF(repr);
>     Py_DECREF(dict);
>     return ret;
> }
> 
> And order of values in dictionary depends on hash randomization which
> can be
> influenced by env PYTHONHASHSEED.

I agree. 

> I briefly checked C-api for dictionary and I was not able to find
> simple solution to have a deterministic order for dictionary.
> https://docs.python.org/2.7/c-api/dict.html
> 
> But you might know better why the env variable was added with such
> value.

For unrelated reasons (waf...), which is why I would like to get rid of
it, and this only makes me want to get rid of it sooner.  

I hope this clarifies things,

Andrew Bartlett

-- 
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba






More information about the samba-technical mailing list