[Samba] Error running "samba-tool dbcheck" after going from 4.8.6 to 4.9.2
Andrew Bartlett
abartlet at samba.org
Tue Nov 13 21:19:54 UTC 2018
On Mon, 2018-11-12 at 20:36 +0000, Noel Power wrote:
> Hi
>
> On 09/11/2018 10:21, Andrew Bartlett wrote:
> > On Fri, 2018-11-09 at 11:46 +0300, Taner Tas via samba wrote:
> > > > I just tested Samba 4.9.2 on one of my DCs, previously running
> > > > version 4.8.6. Immediately after install, I ran "samba-tool dbcheck"
> > > > and got the following:
> > > >
> > > > Checking 511 objects
> > > > ERROR(<type 'exceptions.UnicodeDecodeError'>): uncaught exception -
> > > > 'ascii' codec can't decode byte 0xc3 in position 25: ordinal not in
> > > > range(128)
> > > > File
> > > > "/usr/local/samba/lib64/python2.7/site-packages/samba/netcmd/__init__.py",
> > > > line 177, in _run
> > > > return self.run(*args, **kwargs)
> > > > File
> > > > "/usr/local/samba/lib64/python2.7/site-packages/samba/netcmd/dbcheck.py",
> > > > line 157, in run
> > > > controls=controls, attrs=attrs)
> > > > File
> > > > "/usr/local/samba/lib64/python2.7/site-packages/samba/dbchecker.py",
> > > > line 222, in check_database
> > > > error_count += self.check_object(object.dn, attrs=attrs)
> > > > File
> > > > "/usr/local/samba/lib64/python2.7/site-packages/samba/dbchecker.py",
> > > > line 2315, in check_object
> > > > expected_dn = ldb.Dn(self.samdb, "RDN=RDN,%s" % (parent_dn))
> > > >
> > > >
> > > > The error disappears after going back to Samba 4.8.6. The underlying
> > > > OS is CentOS 7.4.
> > > >
> > > > Although the names my users use at logon don't have accented
> > > > characters, several of them have them in their complete names.
> > > > Several of the group names also have accented characters.
> > > >
> > > > This is likely the same error reported under Bug 13616, which
> > > > meanwhile has been declared as closed.
> > > > https://bugzilla.samba.org/show_bug.cgi?id=13616
> > > >
> > >
> > > The bugfix seems applied to ldb version 1.4.3 so samba should recompile
> > > against ldb-1.4.3 regardless any samba-4.9 version.
> >
> > Thankfully we enforce the reverse (that samba 4.9.2 requires ldb 1.4.3)
> > pretty strongly.
> >
> > What is being seen here is another manifestation of the same issue,
> > that is as we migrated parts of Samba to building and running with
> > python3, the distinction between string and unicode has become more
> > important.
>
> yep, so this is quite horrible, I scratched my head with this for a
> while, mixing (python2)strings and unicode can be very tricky. Changing
> the ParseTuple format to 'es' seemed the correct thing to do. However
> there is a side-affect of doing this that wasn't immediately obvious to
> me from reading the documentation
>
> with previous format 's' behaviour is as follows:
>
> python2
>
> (string) basically is passed through unencoded (this is probably bad but
> it is the default way python2 handles things)
>
> (unicode) unicode is encoded to the default encoding ('ascii') (bad
> because we don't expect or want that)
>
> python3
>
> (unicode/string) is encoded to the default encoding ("utf8")
>
> with new format we added as a 'es' (and specified encoding as 'utf8')
>
> python2
>
> (string) is re-encoded to 'utf8' (this is both unexpected and bad since
> effectively this operation will be
> string.decode('ascii').encode('utf8')) This first decode is the root
> cause of the current problem.
>
> (unicode) is encoded to 'utf8' (this is good)
>
> python3
>
> (string/unicode) is encoded with the specified encoding ('utf8')
>
> (bytes) will not be accepted
>
> What can we do ?
>
> a) we could ensure every call to Ldb.Dn passes unicode
>
> This would be alot of changes, and always the risk we miss some and
> also going forward everyone needs to remember when writing any code that
> they *need* to pass unicode. Seems unworkable
>
> b) we could use the format 'et' which as far as I understand behaves as
> follows:
>
> python2
>
> (string) basically is passed through unencoded (same behaviour as old
> python2 code)
>
> (unicode) encodes to specified format 'utf8' (this is good)
>
> (bytes) will not be accepted
>
> python3
>
> (string/uncode) encodes to specified format ('utf8') in our case
>
> (bytes) will be accepted (with the assumption the bytes contains bytes
> encoded in the specified encoding) This is might be problematic in that
> previously we would have gotten an error attempting to pass bytes (and
> also the bytes might not be encoded with the correct encoding)
>
> Obviously b seems the saner option, but... there is some risk with
> python3 (accepting the bytes)
I think this could be sane. It is quite likely/reasonable that a ldb
value could be passed, as bytes, to ldb.Dn(), as DNs are put in
attribute values often.
> I think the best option is a hybrid approach to use the 'et' format with
> python2 and the 'es' format with python3.
>
> This is yet another attempt to fix the same problem again so I really
> would like another pair (or more) of eyes and/or opinions on this. I've
> attached master version of patch and also 4.9 version (without the ldb
> version related bits, I'll add those if/when we are happy) And if there
> is consensus this is the correct approach we should change the other
> instances of use of 'es' with ParseTuple in the code too
>
>
> > Noel has been handling this and I'm sure we can sort it out.
>
> If you could give the 4.9 (mostly complete, just missing ldb versioning
> related bits) patch a spin that would be really helpful
Additionally, we can avoid some of the ldb.Dn() work with soemthing
like this (untested):
diff --git a/python/samba/dbchecker.py b/python/samba/dbchecker.py
index 9af11162ce5..ac8fc563b48 100644
--- a/python/samba/dbchecker.py
+++ b/python/samba/dbchecker.py
@@ -2328,7 +2328,9 @@ newSuperior: %s""" % (str(from_dn), str(to_rdn), str(to_base)))
parent_dn = deleted_objects_dn
if parent_dn is None:
parent_dn = obj.dn.parent()
- expected_dn = ldb.Dn(self.samdb, "RDN=RDN,%s" % (parent_dn))
+ expected_dn = parent_dn
+ if expected_dn.add_child("RDN=RDN") == False:
+ raise CommandError("Failed to add RDN to %s" % expected_dn)
expected_dn.set_component(0, obj.dn.get_rdn_name(), name_val)
if obj.dn == deleted_objects_dn:
(However it might also blow up in our face due to ldb_dn_copy() not
filling in the ldb pointer on the DN, a different long-standing issue)
This is all rather difficult, I'll see if I can get some other views on
this around the office.
Andrew Bartlett
--
Andrew Bartlett
https://samba.org/~abartlet/
Authentication Developer, Samba Team https://samba.org
Samba Development and Support, Catalyst IT
https://catalyst.net.nz/services/samba
More information about the samba
mailing list