[PATCH v2] Convert properly UTF-8 to UTF-16

Suresh Jayaraman sjayaraman at suse.com
Mon Oct 8 22:43:38 MDT 2012


On 10/08/2012 01:48 PM, Frediano Ziglio wrote:
> On Wed, 2012-10-03 at 14:49 -0500, Steve French wrote:
>> Merged - but doesn't the reverse also have to be added in cifs_from_utf16?  ie
>>
>>           utf16s_to_utf8s(uni, ... );
>>
> 
> Not strictly necessary, at least to be able to mount shares.
> 
>> I am glad that someone added these multiword handling routines into
>> the kernel for FAT - this has been something we have wanted for a long
>> time in cifs (and smb2/smb3).  Note the comment in
>> fs/cifs/cifs_unicode.c
>>
>> /* Note that some windows versions actually send multiword UTF-16 characters
>>  * instead of straight UTF16-2. The linux nls routines however aren't able to
>>  * deal with those characters properly. In the event that we get some of
>>  * those characters, they won't be translated properly.
>>  */
>> int
>> cifs_from_utf16(char *to, const __le16 *from, int tolen, int fromlen,
>>                  const struct nls_table *codepage, bool mapchar)
>>
> 
> Shouldn't that be UCS-2 instead of UTF16-2?
> 
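On the UCS-2 versus UTF-16 point, a minimal sketch of the difference may
help. It is not taken from the kernel code; the helper name
utf16_code_units is made up for illustration. A character inside the
Basic Multilingual Plane fits in one 16-bit unit (all that UCS-2 can
express), while a character above U+FFFF needs a surrogate pair, which
is the "multiword" case the comment refers to.

```python
def utf16_code_units(text):
    """Return the 16-bit code units of text when encoded as UTF-16LE.

    Hypothetical helper for illustration only; not part of cifs.
    """
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


# U+20AC (euro sign) is in the BMP: one code unit, representable in UCS-2.
print([hex(u) for u in utf16_code_units("\u20ac")])

# U+1F600 is outside the BMP: two code units (high and low surrogate),
# which a UCS-2-only converter cannot translate as a single character.
print([hex(u) for u in utf16_code_units("\U0001F600")])
```

A converter that treats each 16-bit unit independently would see the two
surrogates as two unmappable characters.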
>>
>> We could really use some nls test cases for cifs/smb2/smb3/nfs4 which
>> basically did various file, directory, symlink create/rename/delete
>> operations with various hard to map characters so we can test copying
>> to and from the server and ensure that we get the name mappings right
>> for these (and don't ever regress).   Fortunately smb2/smb3 is only
>> unicode so we don't have to deal with mappings to other codepages from
>> utf8
>>
> 
> Do you have some framework/hook to put these tests in?
> 

I recently wrote cifstests, primarily to provide a basic infrastructure
for adding regression tests for cifs. It's written in Python, and the
plan is to be able to use Python or C bindings for Python. You might
consider adding tests to it.

   https://github.com/sureshjayaram/cifstests
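
As a rough idea of the kind of name-mapping test discussed above, here
is a minimal sketch in plain Python. It does not use the cifstests API;
the function name roundtrip_names and the sample file names are
assumptions chosen for illustration. It creates, renames and deletes
files with hard-to-map names under a given directory and reports any
name that does not come back unchanged from a directory listing.

```python
import os
import tempfile

# Illustrative sample names (assumed, not from any existing test suite):
# a BMP symbol, CJK text, and a character outside the BMP that needs a
# UTF-16 surrogate pair on the wire.
HARD_NAMES = [
    "euro-\u20ac.txt",
    "\u65e5\u672c\u8a9e.txt",
    "emoji-\U0001F600.txt",
]


def roundtrip_names(base_dir, names=HARD_NAMES):
    """Create, rename and delete each name; return the names that failed."""
    failures = []
    for name in names:
        path = os.path.join(base_dir, name)
        new_name = "renamed-" + name
        new_path = os.path.join(base_dir, new_name)

        with open(path, "w", encoding="utf-8") as f:
            f.write("x")
        # The listing should return exactly the name we created.
        if name not in os.listdir(base_dir):
            failures.append(name)

        os.rename(path, new_path)
        if new_name not in os.listdir(base_dir):
            failures.append(new_name)

        os.remove(new_path)
    return failures


if __name__ == "__main__":
    # A temporary local directory stands in for the mounted share here;
    # pass the mount point of a cifs share to exercise the real mapping.
    with tempfile.TemporaryDirectory() as d:
        print(roundtrip_names(d))
```

Pointing base_dir at a mounted share instead of a temporary directory
would exercise the client's UTF-8 to UTF-16 conversion in both
directions.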


Thanks
Suresh


More information about the samba-technical mailing list