utf8 and one character code pages in SAMBA_2_2

Michael Adda michael at disksites.com
Tue Jul 24 10:26:38 GMT 2001


Hi all,

I had some problems using Samba 2.2.x with the utf8 coding system and code
page 862. It seems that when utf8 is converted into the code page string, the
code assumes that every code page character consists of two bytes. In the 862
(Hebrew) code page this is not true: each character is a single byte, so the
high byte of the converted value is zero and a null character is written in
front of the real character. This obviously ends the null-terminated string.
I've changed the code to write the high byte only when it is non-zero, and the
diff is below.
After applying this fix, I've been able to use utf8 as the underlying string
format.
I was not able to verify that this does not break things with other code pages,
as I do not have any clients using them, so any feedback is welcome.
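
To show the problem in isolation, here is a minimal standalone sketch. It is
not the Samba code: emit_two_bytes, emit_selective and build_string are
hypothetical helpers made up for this illustration, and the input values
0x0080 and 0x0081 are the code page 862 bytes for alef and bet, standing in
for what ucs2doscp() would return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: always writes both bytes of w (the old behaviour). */
static size_t emit_two_bytes(char *dst, unsigned short w)
{
    dst[0] = (char)((w >> 8) & 0xff);   /* zero for a one-byte code page */
    dst[1] = (char)(w & 0xff);
    return 2;
}

/* Hypothetical helper: writes the high byte only when it is non-zero. */
static size_t emit_selective(char *dst, unsigned short w)
{
    size_t n = 0;

    if ((w >> 8) & 0xff)
        dst[n++] = (char)((w >> 8) & 0xff);
    dst[n++] = (char)(w & 0xff);
    return n;
}

/*
 * Hypothetical driver: writes `count` code page values into dst with the
 * given emitter, null-terminates the result and returns the bytes written
 * (not counting the terminator).
 */
static size_t build_string(char *dst, size_t dstlen,
                           const unsigned short *cp, size_t count,
                           size_t (*emit)(char *, unsigned short))
{
    size_t used = 0;
    size_t i;

    /* Worst case is two bytes per value plus the terminator. */
    assert(dst != NULL && cp != NULL && dstlen >= 2 * count + 1);

    for (i = 0; i < count; i++)
        used += emit(dst + used, cp[i]);
    dst[used] = '\0';
    return used;
}
```

With the two values above, build_string() with emit_two_bytes writes four
bytes, but the first one is a null, so strlen() on the result is 0. With
emit_selective it writes two bytes and strlen() is 2. A value with a non-zero
high byte, as in a double-byte code page, is written identically by both.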

Michael Adda,
Disksites R&D <http://www.disksites.com>

Index: kanji.c
===================================================================
RCS file: /cvsroot/samba/source/lib/Attic/kanji.c,v
retrieving revision 1.23.4.4
diff -c -r1.23.4.4 kanji.c
*** kanji.c     26 Mar 2001 23:12:35 -0000      1.23.4.4
--- kanji.c     24 Jul 2001 09:37:39 -0000
***************
*** 1413,1426 ****
      } else if ((0xc0 <= val) && (val <= 0xdf) 
               && (0x80 <= *src) && (*src <= 0xbf)) {
        w = ucs2doscp( ((val & 31) << 6)  | ((*src++) & 63 ));
!       *dst++ = (char)((w >> 8) & 0xff);
        *dst++ = (char)(w & 0xff);
      } else {
        val  = (val & 0x0f) << 12;
        val |= ((*src++ & 0x3f) << 6);
        val |= (*src++ & 0x3f);
        w = ucs2doscp(val);
!       *dst++ = (char)((w >> 8) & 0xff);
        *dst++ = (char)(w & 0xff);
      }
    }
--- 1413,1428 ----
      } else if ((0xc0 <= val) && (val <= 0xdf) 
               && (0x80 <= *src) && (*src <= 0xbf)) {
        w = ucs2doscp( ((val & 31) << 6)  | ((*src++) & 63 ));
!       if ((w >> 8) & 0xff)
!         *dst++ = (char)((w >> 8) & 0xff);
        *dst++ = (char)(w & 0xff);
      } else {
        val  = (val & 0x0f) << 12;
        val |= ((*src++ & 0x3f) << 6);
        val |= (*src++ & 0x3f);
        w = ucs2doscp(val);
!       if ((w >> 8) & 0xff)
!         *dst++ = (char)((w >> 8) & 0xff);
        *dst++ = (char)(w & 0xff);
      }
    }
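
For reference, the two branches in the hunk above decode 2-byte and 3-byte
utf8 sequences into a UCS2 value before it is passed to ucs2doscp(). The
sketch below repeats that arithmetic as a standalone function so it can be
checked by hand. The name decode_utf8_multibyte is hypothetical, and the
sketch assumes the caller has already handled plain ASCII bytes and passes
well-formed input.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: decodes one 2- or 3-byte utf8 sequence starting at
 * *srcp, advances *srcp past it and returns the UCS2 value.
 * Assumes the lead byte is not ASCII and the sequence is well formed.
 */
static unsigned int decode_utf8_multibyte(const unsigned char **srcp)
{
    const unsigned char *src;
    unsigned int val;

    assert(srcp != NULL && *srcp != NULL);
    src = *srcp;
    val = *src++;

    if ((0xc0 <= val) && (val <= 0xdf)
        && (0x80 <= *src) && (*src <= 0xbf)) {
        /* 2-byte form: 110xxxxx 10yyyyyy */
        val = ((val & 31) << 6) | ((*src++) & 63);
    } else {
        /* 3-byte form: 1110xxxx 10yyyyyy 10zzzzzz */
        val  = (val & 0x0f) << 12;
        val |= (unsigned int)(*src++ & 0x3f) << 6;
        val |= (unsigned int)(*src++ & 0x3f);
    }

    *srcp = src;
    return val;
}
```

For example, Hebrew alef is the utf8 sequence 0xD7 0x90: (0xD7 & 31) << 6 is
0x5C0 and 0x90 & 63 is 0x10, giving 0x05D0. Code page 862 stores alef as the
single byte 0x80, which is the case where the high byte of w is zero.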




