[Samba] Conversion error: Illegal multibyte sequence

Jeremy Allison jra at samba.org
Tue Sep 10 10:48:57 MDT 2013


On Tue, Sep 10, 2013 at 11:43:40AM +0200, Volker Lendecke wrote:
> Hi, Jeremy!
> 
> On Mon, Sep 09, 2013 at 03:40:06PM -0700, Jeremy Allison wrote:
> > Ok, here is a fix for 3.6.x. Can you test this and see
> > if it fixes the problem ? If so, I'll get this fixed
> > in master and back-ported to all releases.
> > 
> > Thanks !
> > 
> > Jeremy.
> 
> > diff --git a/source3/smbd/mangle_hash2.c b/source3/smbd/mangle_hash2.c
> > index 5aafe2f..e1aedf1 100644
> > --- a/source3/smbd/mangle_hash2.c
> > +++ b/source3/smbd/mangle_hash2.c
> > @@ -626,7 +626,8 @@ static bool is_legal_name(const char *name)
> >  	while (*name) {
> >  		if (((unsigned int)name[0]) > 128 && (name[1] != 0)) {
> >  			/* Possible start of mb character. */
> > -			char mbc[2];
> > +			size_t size = 0;
> > +			(void)next_codepoint(name, &size);
> >  			/*
> >  			 * Note that if CH_UNIX is utf8 a string may be 3
> >  			 * bytes, but this is ok as mb utf8 characters don't
> > @@ -634,7 +635,7 @@ static bool is_legal_name(const char *name)
> >  			 * for mb UNIX asian characters like Japanese (SJIS) here.
> >  			 * JRA.
> >  			 */
> > -			if (convert_string(CH_UNIX, CH_UTF16LE, name, 2, mbc, 2, False) == 2) {
> > +			if (size == 2) {
> >  				/* Was a good mb string. */
> >  				name += 2;
> >  				continue;
> 
> Can you explain what this check is supposed to do at all? I
> don't get it ... :-)

It's an old, old check from back when SJIS and EUC were
common multi-byte character sets.

SJIS especially has the property that the second byte
of a 2-byte character can have a value in the ASCII
range (< 0x80). So if CH_UNIX is set to a character set
with that property we can't walk the name byte by byte.
Instead we must see if the pair [0] (> 0x80), [1] (any
value) can be converted into a valid multi-byte character,
in which case we skip over both bytes (otherwise we might
look at a second byte with the value of ':' or something
and consider the name invalid).
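
To make that concrete, here is a minimal standalone sketch (not the
Samba code). It assumes CH_UNIX is Shift_JIS and uses the katakana
character encoded as 0x83 0x5C, whose second byte is the same value
as ASCII '\'. The helper names (sjis_char_len, naive_has_illegal,
mb_aware_has_illegal) and the illegal character list are made up for
the illustration, and the lead/trail byte ranges are the usual
Shift_JIS ones rather than anything taken from mangle_hash2.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Characters treated as illegal in a name (illustrative list only). */
static const char ILLEGAL[] = "\\/:*?\"<>|";

/*
 * Hypothetical helper: byte length of the character at s, assuming
 * Shift_JIS. Returns 2 for a lead byte followed by a valid trail
 * byte, otherwise 1.
 */
static size_t sjis_char_len(const unsigned char *s)
{
	unsigned char lead = s[0];
	bool is_lead = (lead >= 0x81 && lead <= 0x9F) ||
		       (lead >= 0xE0 && lead <= 0xFC);

	if (is_lead && s[1] != 0) {
		unsigned char trail = s[1];
		if ((trail >= 0x40 && trail <= 0x7E) ||
		    (trail >= 0x80 && trail <= 0xFC)) {
			return 2;
		}
	}
	return 1;
}

/* Byte-wise walk: wrongly flags a trail byte that looks like ASCII. */
static bool naive_has_illegal(const char *name)
{
	return strpbrk(name, ILLEGAL) != NULL;
}

/* Character-wise walk: skips both bytes of a valid 2-byte character. */
static bool mb_aware_has_illegal(const char *name)
{
	const unsigned char *p = (const unsigned char *)name;

	while (*p) {
		size_t len = sjis_char_len(p);
		if (len == 1 && strchr(ILLEGAL, *p) != NULL) {
			return true;
		}
		p += len;
	}
	return false;
}
```

With the name "\x83\x5C.txt" the byte-wise check reports an illegal
character because it sees 0x5C, while the character-wise check skips
the pair and accepts the name. A real ':' or '\' outside a 2-byte
character is still caught by both.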

I thought about removing this and re-writing it, but
it made my brain hurt (and might break some very old
systems :-). So moving to next_codepoint(), which reports
the length of the next character without triggering the
conversion error messages, seemed the simplest fix :-).

Jeremy.

