[linux-cifs-client] [PATCH] cifs: Fix insufficient memory allocation for nativeFileSystem field

Steve French smfrench at gmail.com
Mon Apr 6 16:41:13 GMT 2009


I don't think that we should be using these size assumptions
(multiples of UCS stringlen).    A new UCS helper function should be
created that calculates how much memory would be needed for a
converted string - and we need to use this before we do the malloc and
string conversion.  In effect a strlen and strnlen function that takes
a target code page argument.  For strings that will never be more than
a hundred bytes this may not be needed, and we can use the length
assumption, but since mallocs in kernel can be so expensive I would
rather calculate the actual string length needed for the target.

On Mon, Apr 6, 2009 at 9:30 AM, simo <idra at samba.org> wrote:
> On Mon, 2009-04-06 at 10:08 -0400, Jeff Layton wrote:
>> On Mon, 06 Apr 2009 14:02:25 +0000
>> simo <idra at samba.org> wrote:
>>
>> > On Mon, 2009-04-06 at 09:22 -0400, Jeff Layton wrote:
>> > > > True. Seems I was influenced by a comment in fs/cifs/sess.c
>> > > >
>> > > > 313         /* UTF-8 string will not grow more than four times as
>> > > big as
>> > > > UCS-16 */
>> > > >
>> > >
>> > > That looks like that's wrong (or at least potentially so). AFAICT,
>> > > UTF-8
>> > > allows up to 6 bytes per character. I suppose that it's possible that
>> > > none of the characters allowed in UCS-16 will ever translate to a
>> > > character that's more than 4 bytes, but I'd like to see that confirmed
>> > > before we depend on it.
>> >
>> > I wonder whats UCS-16 tho, UCS-16 does not exist :)
>> >
>> > It may be either UTF16 or UCS2.
>> > Both these charsets have a base length of 2 bytes per character. UCS2 is
>> > limited to 65535 values, while UTF-16 is a multi-word charset.
>> >
>> > If the comment above is to be read as "UTF-8 string will not grow more
>> > than four times as big as UCS-2/UTF-16" then what it is saying is that
>> > at maximum an UTF-8 chars can be 4 words (or 8 bytes long).
>> > IIRC UTF-8 chars length is maximum 6 bytes, so an 8 byte per char max
>> > estimate seem correct.
>> >
>> > If "length" in the code was the length in bytes, and not the number of
>> > characters, of an UCS-2/UTF-16 string than 4*length should, indeed be
>> > long enough.
>> >
>>
>> Ahh, you're correct. I guess I'm accustomed to thinking about lengths in
>> bytes. I guess though this means that 4*length is allocating too
>> much...wouldn't 3*length then be right (assuming NULL termination is
>> also accounted for) ?
>>
>> Regardless of the math, I'd like to see all of this moved into some
>> nice, well commented helper functions instead of being open-coded all
>> over the place. It's just too easy to get this stuff wrong. Let's solve
>> this in a way that makes it easier in the future.
>
> Yes, this is the way to go.
>
> Simo.
>
> --
> Simo Sorce
> Samba Team GPL Compliance Officer <simo at samba.org>
> Principal Software Engineer at Red Hat, Inc. <simo at redhat.com>
>
>



-- 
Thanks,

Steve


More information about the linux-cifs-client mailing list