[linux-cifs-client] [PATCH] cifs: Fix insufficient memory allocation for nativeFileSystem field

Tue Apr 7 14:59:46 GMT 2009

On Tue, Apr 7, 2009 at 8:15 AM, Suresh Jayaraman <sjayaraman at suse.de> wrote:
> Jeff Layton wrote:
>> On Mon, 06 Apr 2009 22:33:09 +0530
>> Suresh Jayaraman <sjayaraman at suse.de> wrote:
>>
>>> Steve French wrote:
>>>> I don't think that we should be using these size assumptions
>>>> (multiples of UCS stringlen).    A new UCS helper function should be
>>>> created that calculates how much memory would be needed for a
>>>> converted string - and we need to use this before we do the malloc and
>>>> string conversion.  In effect a strlen and strnlen function that takes
>>>> a target code page argument.  For strings that will never be more than
>>>> a hundred bytes this may not be needed, and we can use the length
>>>> assumption, but since mallocs in kernel can be so expensive I would
>>>> rather calculate the actual string length needed for the target.
>>> Ah, ok. I thought of writing a little function based on
>>> cifs_strncpy_to_host() and adding a comment like below:
>>>
>>> /* UniStrnlen() returns length in 16 bit Unicode  characters
>>>  * (UCS-2) with base length of 2 bytes per character. An UTF-8
>>>  * character can be up to 8 bytes maximum, so we need to
>>>  * allocate (len/2) * 4 bytes (or) (4 * len) bytes for the
>>>  * UTF-8 string */
>>>
>>
>> I think you'll have to basically do the conversion twice. Walk the
>> string once and convert each character determine its length and then
>> discard it. Get the total and allocate that many bytes (plus the null
>
> Thanks for explaining. It seems adding a new UCS helper that computes
> length in bytes like the below would be good enough and make use of it
> to compute length for memory allocation.
>
>> termination), and do the conversion again into the buffer.
>
> Do we still need this conversion again?
>
>
> diff --git a/fs/cifs/cifs_unicode.h b/fs/cifs/cifs_unicode.h
> index 14eb9a2..0396bdc 100644
> --- a/fs/cifs/cifs_unicode.h
> +++ b/fs/cifs/cifs_unicode.h
> @@ -159,6 +159,23 @@ UniStrnlen(const wchar_t *ucs1, int maxlen)
>  }
>
>  /*
> + * UniStrnlenBytes: Return the length in bytes of a UTF-8 string
> + */
> +static inline size_t
> +UniStrnlenBytes(const unsigned char *str, int maxlen)
> +{
> +       size_t nbytes = 0;
> +       wchar_t *uni;
> +
> +       while (*str++) {
> +               /* convert each char, find its length and add to nbytes */
> +               if (char2uni(str, maxlen, uni) > 0)
> +                       nbytes += strnlen(uni, NLS_MAX_CHARSET_SIZE);
> +       }
> +       return nbytes;
> +}
> +
> +/*
>
> We would still be needing the version (UniStrnlen) that returns length
> in characters also.
>
>>
>> I'm not truly convinced this is really necessary though. You have to
>> figure that kmalloc is a power-of-two allocator. If you kmalloc 17
>> bytes, you get 32 anyway. You'll probably end up using roughly the same
>> amount of memory that you would have had you just estimated the size.

Shaggy made the comment that the string length calculation (exact size
vs. estimate) probably won't matter for most cases in cifs, since
small allocations off the slab are fairly fast and it doesn't hurt to
overallocate by this amount. Although in the typical case a Unicode
string will shrink when converted to UTF-8, we obviously have to allow
for the maximum size conversion.
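
To make the estimate side concrete, here is a rough userspace sketch
(not kernel code). The function names are made up for illustration.
It assumes each 16-bit UCS-2 unit needs at most 3 bytes in UTF-8, and
it models the allocator as rounding up to a power of two, which is a
simplification of what kmalloc really does.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: worst-case UTF-8 buffer size for a UCS-2 string
 * of nchars 16-bit units. Assumes at most 3 bytes per unit (no
 * surrogate pairs), plus one byte for the null terminator.
 */
static size_t ucs2_worst_case_utf8_bytes(size_t nchars)
{
	return 3 * nchars + 1;
}

/*
 * Simplified model of a power-of-two allocator: round a request up to
 * the next power of two. Real kmalloc size classes are an assumption
 * here and may differ.
 */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	while (p < n && p <= SIZE_MAX / 2)
		p <<= 1;
	return p;
}
```

Under this model a 17-byte request occupies 32 bytes either way. The
worst-case estimate is never more than three times the exact size, so
for short strings the rounded allocations differ by a small constant
factor at most.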

Except for long-lived strings, there is probably no point in
calculating the exact length for temporary conversions that start with
a Unicode string of 256 wchars or fewer, since the slab allocation for
the worst-case target is fast enough. For path lengths, though, it can
make a huge difference
(\\server\share\directory1\directory2\directory3\ etc.) and we ought
to calculate the exact length.
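
For the exact-length case, the two-pass approach could look roughly
like the userspace sketch below. It is not the cifs code and does not
use the kernel NLS tables: it hard-codes UTF-8 as the target, treats
every 16-bit unit as a standalone UCS-2 character (no surrogate
pairs), and uses malloc in place of kmalloc. All function names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Bytes needed to encode one 16-bit UCS-2 unit as UTF-8. */
static size_t ucs2_unit_utf8_len(uint16_t c)
{
	if (c < 0x80)
		return 1;
	if (c < 0x800)
		return 2;
	return 3;
}

/*
 * Pass 1: sum the per-character byte lengths, stopping at the null
 * terminator or after maxlen units. The result excludes the
 * terminating null byte.
 */
static size_t ucs2_utf8_strnlen(const uint16_t *src, size_t maxlen)
{
	size_t nbytes = 0;
	size_t i;

	for (i = 0; i < maxlen && src[i] != 0; i++)
		nbytes += ucs2_unit_utf8_len(src[i]);
	return nbytes;
}

/*
 * Pass 2: allocate exactly nbytes + 1 and convert into the buffer.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static char *ucs2_to_utf8_dup(const uint16_t *src, size_t maxlen)
{
	size_t nbytes = ucs2_utf8_strnlen(src, maxlen);
	char *dst = malloc(nbytes + 1);
	char *p = dst;
	size_t i;

	if (!dst)
		return NULL;

	for (i = 0; i < maxlen && src[i] != 0; i++) {
		uint16_t c = src[i];

		if (c < 0x80) {
			*p++ = (char)c;
		} else if (c < 0x800) {
			*p++ = (char)(0xC0 | (c >> 6));
			*p++ = (char)(0x80 | (c & 0x3F));
		} else {
			*p++ = (char)(0xE0 | (c >> 12));
			*p++ = (char)(0x80 | ((c >> 6) & 0x3F));
			*p++ = (char)(0x80 | (c & 0x3F));
		}
	}
	*p = '\0';
	return dst;
}
```

The string is walked twice, but the allocation is exact, which is the
trade-off that matters for long paths.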

-- 
Thanks,

Steve