[linux-cifs-client] [PATCH] cifs: Fix insufficient memory allocation for nativeFileSystem field

Tue Apr 7 13:15:46 GMT 2009

Jeff Layton wrote:
> On Mon, 06 Apr 2009 22:33:09 +0530
> Suresh Jayaraman <sjayaraman at suse.de> wrote:
> 
>> Steve French wrote:
>>> I don't think that we should be using these size assumptions
>>> (multiples of UCS stringlen).    A new UCS helper function should be
>>> created that calculates how much memory would be needed for a
>>> converted string - and we need to use this before we do the malloc and
>>> string conversion.  In effect a strlen and strnlen function that takes
>>> a target code page argument.  For strings that will never be more than
>>> a hundred bytes this may not be needed, and we can use the length
>>> assumption, but since mallocs in kernel can be so expensive I would
>>> rather calculate the actual string length needed for the target.
>> Ah, ok. I thought of writing a little function based on
>> cifs_strncpy_to_host() and adding a comment like below:
>>
>> /* UniStrnlen() returns length in 16 bit Unicode  characters
>>  * (UCS-2) with base length of 2 bytes per character. An UTF-8
>>  * character can be up to 8 bytes maximum, so we need to
>>  * allocate (len/2) * 4 bytes (or) (4 * len) bytes for the
>>  * UTF-8 string */
>>
> 
> I think you'll have to basically do the conversion twice. Walk the
> string once and convert each character determine its length and then
> discard it. Get the total and allocate that many bytes (plus the null

Thanks for explaining. It seems adding a new UCS helper that computes
length in bytes like the below would be good enough and make use of it
to compute length for memory allocation.

> termination), and do the conversion again into the buffer.

Do we still need this conversion again?

diff --git a/fs/cifs/cifs_unicode.h b/fs/cifs/cifs_unicode.h
index 14eb9a2..0396bdc 100644
--- a/fs/cifs/cifs_unicode.h
+++ b/fs/cifs/cifs_unicode.h
@@ -159,6 +159,23 @@ UniStrnlen(const wchar_t *ucs1, int maxlen)
 }

 /*
+ * UniStrnlenBytes: Return the length in bytes of a UTF-8 string
+ */
+static inline size_t
+UniStrnlenBytes(const unsigned char *str, int maxlen)
+{
+       size_t nbytes = 0;
+       wchar_t *uni;
+
+       while (*str++) {
+               /* convert each char, find its length and add to nbytes */
+               if (char2uni(str, maxlen, uni) > 0)
+                       nbytes += strnlen(uni, NLS_MAX_CHARSET_SIZE);
+       }
+       return nbytes;
+}
+
+/*

We would still be needing the version (UniStrnlen) that returns length
in characters also.

> 
> I'm not truly convinced this is really necessary though. You have to
> figure that kmalloc is a power-of-two allocator. If you kmalloc 17
> bytes, you get 32 anyway. You'll probably end up using roughly the same
> amount of memory that you would have had you just estimated the size.
> 

Yes, it seems for most cases we would end up with roughly the same
amount of memory.

Both the approaches sounds plausible to me. If we have a concensus on
one of these two, I can convert the code in audited places and resend
the patch.

Thanks,

-- 
Suresh Jayaraman