[Samba] Conversion error: Illegal multibyte sequence
Jeremy Allison
jra at samba.org
Tue Sep 10 10:48:57 MDT 2013
On Tue, Sep 10, 2013 at 11:43:40AM +0200, Volker Lendecke wrote:
> Hi, Jeremy!
>
> On Mon, Sep 09, 2013 at 03:40:06PM -0700, Jeremy Allison wrote:
> > Ok, here is a fix for 3.6.x. Can you test this and see
> > if it fixes the problem ? If so, I'll get this fixed
> > in master and back-ported to all releases.
> >
> > Thanks !
> >
> > Jeremy.
>
> > diff --git a/source3/smbd/mangle_hash2.c b/source3/smbd/mangle_hash2.c
> > index 5aafe2f..e1aedf1 100644
> > --- a/source3/smbd/mangle_hash2.c
> > +++ b/source3/smbd/mangle_hash2.c
> > @@ -626,7 +626,8 @@ static bool is_legal_name(const char *name)
> > while (*name) {
> > if (((unsigned int)name[0]) > 128 && (name[1] != 0)) {
> > /* Possible start of mb character. */
> > - char mbc[2];
> > + size_t size = 0;
> > + (void)next_codepoint(name, &size);
> > /*
> > * Note that if CH_UNIX is utf8 a string may be 3
> > * bytes, but this is ok as mb utf8 characters don't
> > @@ -634,7 +635,7 @@ static bool is_legal_name(const char *name)
> > * for mb UNIX asian characters like Japanese (SJIS) here.
> > * JRA.
> > */
> > - if (convert_string(CH_UNIX, CH_UTF16LE, name, 2, mbc, 2, False) == 2) {
> > + if (size == 2) {
> > /* Was a good mb string. */
> > name += 2;
> > continue;
>
> Can you explain what this check is supposed to do at all? I
> don't get it ... :-)

It's an old, old check back from when SJIS and EUC were
common multi-byte systems.

SJIS especially has the property that the second byte
of a 2-byte character can hold a value < 127, i.e. one
in the ASCII range. So if CH_UNIX is set to a char set
with such a property we can't walk it as bytes, but must
see if a pair of values [0] (> 0x80) [1] (any value) can
be converted into a valid multi-byte char, in which case
we skip over both bytes (otherwise we might look at a
second byte value of ':' or something and consider it
invalid).

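To make the trail-byte problem concrete, here is a minimal standalone sketch. It is not the Samba code: sjis_char_len(), legal_bytewise(), legal_mb_aware() and the illegal_chars list are all hypothetical names made up for illustration, and the lead/trail ranges are the usual Shift_JIS ones, hard-coded here instead of going through next_codepoint(). The sample character is 0x83 0x5C, whose second byte is the ASCII backslash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: characters this sketch treats as illegal in a name. */
static const char illegal_chars[] = "\\/:*?\"<>|";

/*
 * Hypothetical helper: byte length of the Shift_JIS character at s.
 * Returns 2 for a valid lead byte followed by a valid trail byte,
 * otherwise 1. s[0] must not be the terminating NUL.
 */
static size_t sjis_char_len(const unsigned char *s)
{
    bool lead = (s[0] >= 0x81 && s[0] <= 0x9F) ||
                (s[0] >= 0xE0 && s[0] <= 0xFC);
    bool trail = (s[1] >= 0x40 && s[1] <= 0x7E) ||
                 (s[1] >= 0x80 && s[1] <= 0xFC);

    return (lead && trail) ? 2 : 1;
}

/* Naive walk: tests every byte, so trail bytes get tested too. */
static bool legal_bytewise(const char *name)
{
    assert(name != NULL);
    for (; *name != '\0'; name++) {
        if (strchr(illegal_chars, *name) != NULL) {
            return false;
        }
    }
    return true;
}

/* Multi-byte aware walk: skips both bytes of a valid 2-byte char. */
static bool legal_mb_aware(const char *name)
{
    assert(name != NULL);
    while (*name != '\0') {
        if (sjis_char_len((const unsigned char *)name) == 2) {
            /* Was a good mb char, do not inspect its trail byte. */
            name += 2;
            continue;
        }
        if (strchr(illegal_chars, *name) != NULL) {
            return false;
        }
        name++;
    }
    return true;
}
```

With the name "\x83\x5C.txt", the byte-wise walk trips over the 0x5C trail byte and rejects the name, while the multi-byte aware walk steps over the pair and accepts it. A real backslash outside a 2-byte character is still rejected by both.
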
I thought about removing this and re-writing it, but
it made my brain hurt (and might break some very old
systems :-). So moving to next_codepoint(), which checks
the length of the next char without causing the
conversion error messages, seemed the simplest fix :-).

Jeremy.