convert_string + librpc/ndr (via openchange) == fail
Jeremy Allison
jra at samba.org
Mon May 16 17:41:10 MDT 2011
On Mon, May 16, 2011 at 01:26:25PM +0200, sean finney wrote:
> Hi all,
>
> Trying latest openchange trunk with a fairly recent samba4 master, I've run
> into problems regarding character conversion in:
>
> http://tracker.openchange.org/issues/359
>
> A little more digging and I found:
>
> commit 15e84a9a09c5a86416e964a3258ee35718fbf45a
> Author: Andrew Tridgell <tridge at samba.org>
> Date: Thu Mar 24 10:59:41 2011 +1100
>
> charcnv: removed the allow_badcharcnv and allow_bad_conv options to convert_
>
> we shouldn't accept bad multi-byte strings, it just hides problems
>
> Problem is, in some cases (NDR, in this case) a lot of these strings
> are neither ascii nor unicode (i.e. cp1252), but the code seems to assume
> it's one or the other (librpc/ndr/ndr_string.c:34), and will unconditionally
> pass it to a convert_string_ family function, which will then fail if
> the string contains characters with the high bit set. AFAICT, previously
> "LIBNDR_FLAG_STR_ASCII" was being meant to imply/support "non-UTF8 8-bit"
> and not actually "ascii 7-bit", and the above commit reversed that.
>
> This NDR code is currently being used by openchange for processing MAPI
> property values, which quite often have such latin1/iso-8859-1 characters
> (names, subjects, locations, mail message bodies, etc). In the above
> linked bug, I've pondered just skipping ndr_pull_string and doing some
> kind of strdup+manual offset incrementing in the case of non-unicode
> strings, which works in the immediate area of code in that report.
> BUT (big BUT), we just run into problems later, as there's a big mass
> of code generated via IDL interfaces with samba's pidl, which also
> contain ndr based string parsing with the same problem.
>
> Would it be out of the question to have ndr_pull_string do something a
> bit smarter/more lenient in the case of LIBNDR_FLAG_STR_ASCII?
You're mistaking the meaning of LIBNDR_FLAG_STR_ASCII. It doesn't
actually mean ascii (as in 7-bit clean characters) - it actually
means what we call in Samba "dos charset". Look in librpc/ndr/ndr_string.c:
47 if (flags & LIBNDR_FLAG_STR_ASCII) {
48 chset = CH_DOS;
49 byte_mul = 1;
50 flags &= ~LIBNDR_FLAG_STR_ASCII;
51 }
Note that it sets chset to CH_DOS - which in Samba is set
by default to CP850 (the Windows equivalent to iso8859-1).
My guess is that OpenChange isn't setting this up in the
same way as Samba on configure, so you're getting the fallback
default of "ascii" here.
See the code in source3/configure.in that selects CP850 IBM850
as the default names for this conversion.
Jeremy.
More information about the samba-technical
mailing list