[linux-cifs-client] [PATCH 1/3] cifs: Introduce helper to compute length of nls string in bytes

Jeff Layton jlayton at redhat.com
Fri Apr 24 21:27:15 GMT 2009


On Fri, 24 Apr 2009 11:59:54 -0500
Shirish Pargaonkar <shirishpargaonkar at gmail.com> wrote:

> On Fri, Apr 24, 2009 at 11:57 AM, Shirish Pargaonkar
> <shirishpargaonkar at gmail.com> wrote:
> > On Thu, Apr 23, 2009 at 12:56 AM, Jeff Layton <jlayton at redhat.com> wrote:
> >> On Thu, 23 Apr 2009 02:49:21 +0200
> >> Günter Kukkukk <linux at kukkukk.com> wrote:
> >>
> >>> Just some further notes.
> >>> By "it's heavily used" I didn't mean the number of callers using this
> >>> function (only 1, in readdir.c). I meant the number of times
> >>> cifs_convertUCSpath() is called in daily usage (readdir results).
> >>>
> >>> The current focus was mostly on cifs_strfromUCS_le() - but the _same_ applies
> >>> to cifs_convertUCSpath()!
> >>>
> >>> See the following code snippet:
> >>>
> >>> readdir.c --> static int cifs_get_name_from_search_buf()
> >>> ....
> >>>
> >>>       if (unicode) {
> >>>               /* BB fixme - test with long names */
> >>>               /* Note converted filename can be longer than in unicode */
> >>>               if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR)
> >>>                       pqst->len = cifs_convertUCSpath((char *)pqst->name,
> >>>                                       (__le16 *)filename, len/2, nlt);
> >>>               else
> >>>                       pqst->len = cifs_strfromUCS_le((char *)pqst->name,
> >>>                                       (__le16 *)filename, len/2, nlt);
> >>>
> >>> ....
> >>
> >> I see what you mean. Good catch. That function also has broken buffer
> >> length checking logic.
> >>
> >> This patch is only compile-tested, but it should fix those problems. In
> >> the long run, we probably need to make all of these functions take an
> >> argument with the length of the destination buffer.
> >>
> >> Let's plan that overhaul after Suresh's latest set goes in though.
> >>
> >> --
> >> Jeff Layton <jlayton at redhat.com>
> >
> > A general question: functions such as cifs_strtoUCS call uni2char
> > which assumes UTF-8 translation format.
> > If one of the characters being encoded happens to be 6 bytes long,
> > will an SMB/CIFS server be able to handle that? That is, if it is
> > expecting UCS-2LE encoding, and thus a two-byte encoded value, how
> > (if at all) would it handle a 6-byte encoded value?
> >
> 
> Sorry, I meant to say
>  'char2uni which assumes UTF-8 translation format'
> and not
>  'uni2char which assumes UTF-8 translation format'

My understanding is that UTF-8 as originally defined allows a character
to encode to a multibyte sequence of up to 6 bytes. According to Suresh's
earlier email though, the Unicode standard specifies no characters
above 0x10ffff. So Unicode characters can only be up to four bytes long
in UTF-8 (and only 3 bytes for anything that fits in a single 16-bit
UCS-2 unit, unless I'm missing something).
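
To make the byte counts above concrete, here is a minimal userspace
sketch of the UTF-8 length rule. It is not taken from the cifs or nls
code; utf8_len() is a hypothetical name used only for illustration, and
it assumes the 0x10ffff ceiling mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): number of bytes needed to
 * encode one code point in UTF-8, assuming nothing above 0x10FFFF is valid.
 * Returns 0 for values that are not encodable characters.
 */
static size_t utf8_len(uint32_t cp)
{
	if (cp < 0x80)
		return 1;	/* 7 bits of payload */
	if (cp < 0x800)
		return 2;	/* 11 bits of payload */
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;	/* surrogate halves are not characters */
	if (cp < 0x10000)
		return 3;	/* 16 bits: covers every single UCS-2 unit */
	if (cp <= 0x10FFFF)
		return 4;	/* top of the Unicode range */
	return 0;		/* beyond Unicode */
}
```

With this rule, a value that fits in one 16-bit unit never needs more
than 3 bytes, and only code points above 0xffff need 4.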

The question of course is, what if the client is using some other
non-UTF8 multibyte charset? Could we end up with chars that are 5 or 6
bytes in that case?
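
Until that is answered, one conservative option is to size destination
buffers for the worst case. The sketch below is illustrative only:
MAX_BYTES_PER_CHAR and worst_case_dst_len() are made-up names, and the
value 6 is an assumption chosen to cover the 5 or 6 byte case raised
above, not a figure taken from any particular charset.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: no charset emits more than 6 bytes per converted character. */
#define MAX_BYTES_PER_CHAR 6

/*
 * Hypothetical helper: worst-case size in bytes of a destination buffer
 * holding nchars converted characters plus a trailing NUL.
 * Returns 0 if the computation would overflow size_t.
 */
static size_t worst_case_dst_len(size_t nchars)
{
	if (nchars > (SIZE_MAX - 1) / MAX_BYTES_PER_CHAR)
		return 0;
	return nchars * MAX_BYTES_PER_CHAR + 1;
}
```

This over-allocates for typical names, so it is a stopgap compared with
computing the exact converted length first.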

-- 
Jeff Layton <jlayton at redhat.com>

