[SCM] Samba Shared Repository - branch master updated

Andrew Tridgell tridge at samba.org
Tue Apr 12 23:32:01 MDT 2011


The branch, master has been updated
       via  887fdb7 s4-test: added a test for E_deshash()
       via  4158e9a s3-charcnv: Move convert_string() et al to lib/util/charset
       via  bf431fb libcli/auth Use convert_string_error to check LM hash calculation.
       via  d335b63 lib/util/charset Add many more charset tests
       via  748c31d lib/util/charset Add convert_string_error()
       via  8db1648 lib/util/charset Make ASCII conversion validate it's input
       via  1d4fb07 s3-selftest Add workaround for RAP test failure
       via  a2c691a lib/util/charset Rename convert_string test to allow a 'non_handle' test
       via  7bbd701 lib/util/charset Add more tests for convert_string_error_handle()
       via  9346382 lib/util/charset Preserve 'pull' errors even when converting via UTF16
       via  1efa600 lib/util/charset Add tests for convert_string_error_handle
       via  b21129a lib/util/charset Add expected values for upper/lower case tests
       via  cd63c92 lib/util/charset Fix and add public interface for convert_string_error_handle
       via  87d2722 s4/torture Fix calls to charcnv functions to always supply converted_size
       via  17ccff9 lib/util: Make string_replace from s3 common
       via  b2e37d9 lib/util ucs2_align is identical, put it in common
       via  2eea919 lib/util Move simple string routines into common code.
       via  9941dfe lib/util/charset Move source3/lib/util_unistr.c to the common code.
       via  ce2f217 s3-lib Move strdup_w to it's only user in mangle_hash.c
       via  e3138f2 s3-lib Move isvalid83_w to mangle_hash.c
       via  d458f6b s3-lib make static and remove more _w functions
       via  a82fba3 s3-lib Remove unused #define
       via  ba2b7f7 s3-lib Remove unused skip_unibuf()
       via  8fcda44 s3-lib: Remove unused _w functions.
       via  5cfb0bd s3-lib Correct comment in strlen_w()
       via  43deb97 s3-lib Remove more unused fstring.c functions
       via  c8a5fa3 s3-charcnv: make pull_ucs2 static
       via  b6a8418 s3-lib: Remove unused pull_ucs2_fstring()
      from  380bd49 build: use readelf as a replacement for ldd

http://gitweb.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 887fdb7ba126f280682699d19bcc2931e9c3602d
Author: Andrew Tridgell <tridge at samba.org>
Date:   Wed Apr 13 14:40:27 2011 +1000

    s4-test: added a test for E_deshash()
    
    this particularly checks the boundary conditions near passwords of
    length 14 characters
    
    Pair-Programmed-With: Andrew Bartlett <abartlet at samba.org>
    
    Autobuild-User: Andrew Tridgell <tridge at samba.org>
    Autobuild-Date: Wed Apr 13 07:31:55 CEST 2011 on sn-devel-104

commit 4158e9a7e59c489c90097ac10d44640ccdd4470d
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 14:01:41 2011 +1000

    s3-charcnv: Move convert_string() et al to lib/util/charset
    
    This is the first step to this being the common convert_string
    implementation.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit bf431fbedb8119b392b071f903b63e0f9671ee49
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 11:29:59 2011 +1000

    libcli/auth Use convert_string_error to check LM hash calculation.
    
    This allows us to know if the LM hash was built correctly or not.
    
    NOTE: talloc_tos() is not available in the common code at this time.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit d335b635c2a5ebd8ac5478a4293798072ac18d47
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 10:37:24 2011 +1000

    lib/util/charset Add many more charset tests
    
    This confirms the behaviour of the convert_string() API (with the
    process-wide iconv handle).
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 748c31dc5de762942770b4cced8c1ea827d8e040
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 10:36:37 2011 +1000

    lib/util/charset Add convert_string_error()
    
    This adds an interface that matches the source3/ convert string code.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 8db1648f6644acca05ca41fd3803468bba98993d
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 10:35:43 2011 +1000

    lib/util/charset Make ASCII conversion validate it's input
    
    We should not just strip the high bits off Unicode strings being
    converted to ASCII; we need to actually fail the conversion.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 1d4fb073ecd77a8289b064d4eb6bb148ba49c11b
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Wed Apr 13 11:37:33 2011 +1000

    s3-selftest Add workaround for RAP test failure
    
    The rap.sam test reads 0xFFFFFFFF as a string in the level 2
    r->HomeDir attribute, which fails once we start validating ASCII
    strings.  This restores an unchecked dos charset for this test only,
    until it is determined if the client or server RAP code is at fault.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit a2c691ab9a21946e0c6a86d11278a1a220615dc8
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Mon Apr 11 21:37:01 2011 +1000

    lib/util/charset Rename convert_string test to allow a 'non_handle' test
    
    A future commit will test (with a subset of tests) the variant of this
    function without _handle.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 7bbd701a1397924e946cd709306b96576a9f797d
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Mon Apr 11 21:36:13 2011 +1000

    lib/util/charset Add more tests for convert_string_error_handle()
    
    This helps define the semantics of this function very clearly,
    particularly for partial and invalid inputs.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 93463829afaa4768183b62f20146ef903da8cf8b
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Mon Apr 11 21:34:21 2011 +1000

    lib/util/charset Preserve 'pull' errors even when converting via UTF16
    
    When we do not have a direct iconv handle between any two charsets, we
    must go via UTF16.  However, we should still return the same buffer
    and error code as if we were able to go direct - including the partial
    conversion and the error code.
    
    This is important for locating the invalid multibyte character in the
    stream, for example.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 1efa6001441413c29e4b85c1222a84aff7e00ae8
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Mon Apr 11 20:47:13 2011 +1000

    lib/util/charset Add tests for convert_string_error_handle
    
    These confirm that the errno is set correctly and that we stop on a
    partial multibyte character
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit b21129ae20b9d79b9481d26352130588b6ba8e1b
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Mon Apr 11 20:46:42 2011 +1000

    lib/util/charset Add expected values for upper/lower case tests
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit cd63c9205e79df250c4d4fefb35c917701fd7db6
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Mon Apr 11 20:15:53 2011 +1000

    lib/util/charset Fix and add public interface for convert_string_error_handle
    
    It makes much more sense for this to match the source3/ interface and
    return a bool.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 87d2722b84992b27b199d362ac4b9034b4697942
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 18:12:20 2011 +1000

    s4/torture Fix calls to charcnv functions to always supply converted_size
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 17ccff973a4136c6bffa2fce68eeb1be53add447
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Fri Apr 8 13:04:26 2011 +1000

    lib/util: Make string_replace from s3 common
    
    The s4 implementation didn't do multibyte strings, so was only good
    for '/' which is known to be safe in all multibyte charsets.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit b2e37d9ce12627883ff18ab22ed9d3b6233f6baf
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Fri Apr 8 12:55:28 2011 +1000

    lib/util ucs2_align is identical, put it in common
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 2eea91957c90d6a5960b5350d2c4664812260a7b
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Fri Apr 8 12:02:40 2011 +1000

    lib/util Move simple string routines into common code.
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 9941dfe9f6532ecbc317685046d74e6f90c41695
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 16:31:08 2011 +1000

    lib/util/charset Move source3/lib/util_unistr.c to the common code.
    
    This file (largely) contains functions to deal with UTF16 strings.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit ce2f217bd2402ada76c13bf3c170c8f55752fb11
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 16:55:16 2011 +1000

    s3-lib Move strdup_w to it's only user in mangle_hash.c
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit e3138f2ffef32ee33778e0c068c6009a58536419
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 16:23:17 2011 +1000

    s3-lib Move isvalid83_w to mangle_hash.c
    
    This means that there is no need for the 'valid.dat' table to be
    loaded by anything other than smbd, so the unloader is also removed.
    
    The concept of a 'valid dos character' has been replaced by the hash2
    mangle method.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit d458f6b3bd7043bb78953bdc48ee6e2dcb034042
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 16:16:22 2011 +1000

    s3-lib make static and remove more _w functions
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit a82fba349989376397dbbb07ca3212713424c411
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 16:11:13 2011 +1000

    s3-lib Remove unused #define
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit ba2b7f72c0459123c6bf88ee1c272e94dbfdcf9b
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 16:10:57 2011 +1000

    s3-lib Remove unused skip_unibuf()
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 8fcda44a1f70f0d6d0076620a672b99a2798a2f4
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 15:59:51 2011 +1000

    s3-lib: Remove unused _w functions.
    
    In general we don't manipulate UTF16 strings internally, particularly
    as they are also multibyte, so are no easier to work with than UTF8.
    
    Andrew Bartlett
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 5cfb0bdfd845b761d66a25815307b2b58293bfb8
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 15:48:43 2011 +1000

    s3-lib Correct comment in strlen_w()
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit 43deb9745b3175d070ce5c62ec6104b31e567249
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 15:39:18 2011 +1000

    s3-lib Remove more unused fstring.c functions
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit c8a5fa3fa938e635327b1d65964ba599a92f233f
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 15:25:04 2011 +1000

    s3-charcnv: make pull_ucs2 static
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

commit b6a8418ff6918e6c01d603f69e28167fbcd91dee
Author: Andrew Bartlett <abartlet at samba.org>
Date:   Tue Apr 12 15:22:12 2011 +1000

    s3-lib: Remove unused pull_ucs2_fstring()
    
    Signed-off-by: Andrew Tridgell <tridge at samba.org>

-----------------------------------------------------------------------
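
Illustration for 8db1648 and cd63c92 (not part of the push): the log above
describes two behaviours, namely that a conversion to ASCII must fail rather
than strip high bits, and that convert_string_error_handle() now returns a
bool with the reason left in errno. The sketch below shows that contract in
isolation. The function name utf16le_to_ascii_strict() is hypothetical and
the choice of EILSEQ for a non-ASCII code unit is an assumption; the EINVAL
and E2BIG meanings follow the messages used by convert_string() in this
changeset.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in, not Samba code: convert UTF-16LE to 7-bit ASCII.
 * Returns true on success; returns false and sets errno on failure.
 * *converted_size always reports the bytes written so far, so a caller
 * can locate the offending character after a partial conversion.
 */
bool utf16le_to_ascii_strict(const unsigned char *src, size_t srclen,
			     char *dest, size_t destlen,
			     size_t *converted_size)
{
	size_t in = 0, out = 0;

	assert(converted_size != NULL);

	while (srclen - in >= 2) {
		unsigned int unit = src[in] | ((unsigned int)src[in + 1] << 8);

		if (unit > 0x7f) {
			/* Fail instead of silently dropping the high bits. */
			*converted_size = out;
			errno = EILSEQ;
			return false;
		}
		if (out == destlen) {
			/* No more room in the destination. */
			*converted_size = out;
			errno = E2BIG;
			return false;
		}
		dest[out++] = (char)unit;
		in += 2;
	}

	*converted_size = out;
	if (in != srclen) {
		/* A trailing odd byte is an incomplete code unit. */
		errno = EINVAL;
		return false;
	}
	return true;
}
```

A caller checks the bool first and only then inspects errno, which is the
same pattern convert_string() uses when it picks a debug message.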

Summary of changes:
 lib/util/charset/charcnv.c              |   15 +-
 lib/util/charset/charset.h              |   30 ++
 lib/util/charset/convert_string.c       |  467 +++++++++++++++++++++
 lib/util/charset/iconv.c                |   84 ++++-
 lib/util/charset/tests/convert_string.c |  648 ++++++++++++++++++++++++++++-
 lib/util/charset/util_unistr.c          |   21 +
 lib/util/charset/util_unistr_w.c        |  324 +++++++++++++++
 lib/util/charset/wscript_build          |    2 +-
 lib/util/util_str.c                     |   71 ----
 lib/util/util_str_common.c              |  104 +++++
 lib/util/util_strlist.c                 |   26 ++
 lib/util/wscript_build                  |    3 +-
 libcli/auth/smbencrypt.c                |   33 ++-
 source3/Makefile.in                     |    7 +-
 source3/include/proto.h                 |   49 +--
 source3/include/smb_macros.h            |    1 -
 source3/lib/charcnv.c                   |  494 ++---------------------
 source3/lib/fstring.c                   |   50 ---
 source3/lib/netapi/netapi.c             |    1 -
 source3/lib/util.c                      |    1 -
 source3/lib/util_str.c                  |   92 -----
 source3/lib/util_unistr.c               |  679 -------------------------------
 source3/selftest/tests.py               |    2 +
 source3/smbd/mangle_hash.c              |   53 +++
 source3/wscript_build                   |    2 +-
 source4/torture/auth/smbencrypt.c       |   70 ++++
 source4/torture/local/local.c           |    3 +
 source4/torture/rpc/samba3rpc.c         |    3 +-
 source4/torture/rpc/samlogon.c          |    2 +-
 source4/torture/wscript_build           |    2 +-
 30 files changed, 1896 insertions(+), 1443 deletions(-)
 create mode 100644 lib/util/charset/convert_string.c
 create mode 100644 lib/util/charset/util_unistr_w.c
 create mode 100644 lib/util/util_str_common.c
 delete mode 100644 source3/lib/util_unistr.c
 create mode 100644 source4/torture/auth/smbencrypt.c
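
Illustration for 887fdb7 (not part of the push): the new torture test
targets passwords near 14 characters because the LM hash is computed from at
most 14 bytes of the upper-cased password. The sketch below shows only that
preparation step, not the DES hashing. The name lm_prepare_password() is
hypothetical, the input is assumed to be plain ASCII, and returning false
for a truncated password is an assumption about how a caller might learn
that the hash does not cover the whole password.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LM_PASSWORD_BYTES 14

/*
 * Hypothetical helper, not Samba code: upper-case an ASCII password into a
 * zero-padded 14-byte buffer. Returns false when the password was longer
 * than 14 bytes and therefore had to be truncated.
 */
bool lm_prepare_password(const char *ascii_pw,
			 unsigned char out[LM_PASSWORD_BYTES])
{
	size_t len, i;

	assert(ascii_pw != NULL && out != NULL);

	len = strlen(ascii_pw);
	memset(out, 0, LM_PASSWORD_BYTES);
	for (i = 0; i < len && i < LM_PASSWORD_BYTES; i++) {
		out[i] = (unsigned char)toupper((unsigned char)ascii_pw[i]);
	}
	return len <= LM_PASSWORD_BYTES;
}
```

The interesting inputs are lengths 13, 14 and 15: the first leaves one
padding byte, the second fills the buffer exactly, and the third is cut.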


Changeset truncated at 500 lines:

diff --git a/lib/util/charset/charcnv.c b/lib/util/charset/charcnv.c
index cefc788..998bb08 100644
--- a/lib/util/charset/charcnv.c
+++ b/lib/util/charset/charcnv.c
@@ -124,10 +124,11 @@ convert:
  * @returns the number of bytes occupied in the destination
  * on error, returns -1, and sets errno
  **/
-_PUBLIC_ ssize_t convert_string_error_handle(struct smb_iconv_handle *ic,
-					     charset_t from, charset_t to,
-					     void const *src, size_t srclen,
-					     void *dest, size_t destlen, size_t *converted_size)
+_PUBLIC_ bool convert_string_error_handle(struct smb_iconv_handle *ic,
+					  charset_t from, charset_t to,
+					  void const *src, size_t srclen,
+					  void *dest, size_t destlen,
+					  size_t *converted_size)
 {
 	size_t i_len, o_len;
 	ssize_t retval;
@@ -154,7 +155,7 @@ _PUBLIC_ ssize_t convert_string_error_handle(struct smb_iconv_handle *ic,
 
 	if (converted_size != NULL)
 		*converted_size = destlen-o_len;
-	return retval;
+	return (retval != (ssize_t)-1);
 }
 
 
@@ -172,10 +173,10 @@ _PUBLIC_ bool convert_string_handle(struct smb_iconv_handle *ic,
 					 void const *src, size_t srclen,
 					 void *dest, size_t destlen, size_t *converted_size)
 {
-	ssize_t retval;
+	bool retval;
 
 	retval = convert_string_error_handle(ic, from, to, src, srclen, dest, destlen, converted_size);
-	if(retval==(size_t)-1) {
+	if(retval==false) {
 	    	const char *reason;
 		switch(errno) {
 		case EINVAL:
diff --git a/lib/util/charset/charset.h b/lib/util/charset/charset.h
index 16bb9c6..1078035 100644
--- a/lib/util/charset/charset.h
+++ b/lib/util/charset/charset.h
@@ -174,6 +174,10 @@ bool convert_string(charset_t from, charset_t to,
 		      void const *src, size_t srclen, 
 		      void *dest, size_t destlen,
 		      size_t *converted_size);
+bool convert_string_error(charset_t from, charset_t to,
+			  void const *src, size_t srclen,
+			  void *dest, size_t destlen,
+			  size_t *converted_size);
 
 ssize_t iconv_talloc(TALLOC_CTX *mem_ctx, 
 				       smb_iconv_t cd,
@@ -222,6 +226,12 @@ bool convert_string_handle(struct smb_iconv_handle *ic,
 				charset_t from, charset_t to,
 				void const *src, size_t srclen, 
 				void *dest, size_t destlen, size_t *converted_size);
+bool convert_string_error_handle(struct smb_iconv_handle *ic,
+				 charset_t from, charset_t to,
+				 void const *src, size_t srclen,
+				 void *dest, size_t destlen,
+				 size_t *converted_size);
+
 bool convert_string_talloc_handle(TALLOC_CTX *ctx,
 				       struct smb_iconv_handle *ic,
 				       charset_t from, charset_t to, 
@@ -240,6 +250,26 @@ void load_case_tables(void);
 void load_case_tables_library(void);
 bool smb_register_charset(const struct charset_functions *funcs_in);
 
+/* The following definitions come from util_unistr_w.c  */
+
+size_t strlen_w(const smb_ucs2_t *src);
+size_t strnlen_w(const smb_ucs2_t *src, size_t max);
+smb_ucs2_t *strchr_w(const smb_ucs2_t *s, smb_ucs2_t c);
+smb_ucs2_t *strchr_wa(const smb_ucs2_t *s, char c);
+smb_ucs2_t *strrchr_w(const smb_ucs2_t *s, smb_ucs2_t c);
+smb_ucs2_t *strnrchr_w(const smb_ucs2_t *s, smb_ucs2_t c, unsigned int n);
+smb_ucs2_t *strstr_w(const smb_ucs2_t *s, const smb_ucs2_t *ins);
+bool strlower_w(smb_ucs2_t *s);
+bool strupper_w(smb_ucs2_t *s);
+int strcmp_w(const smb_ucs2_t *a, const smb_ucs2_t *b);
+int strcasecmp_w(const smb_ucs2_t *a, const smb_ucs2_t *b);
+int strncasecmp_w(const smb_ucs2_t *a, const smb_ucs2_t *b, size_t len);
+int strcmp_wa(const smb_ucs2_t *a, const char *b);
+int toupper_ascii(int c);
+int tolower_ascii(int c);
+int isupper_ascii(int c);
+int islower_ascii(int c);
+
 /*
  *   Define stub for charset module which implements 8-bit encoding with gaps.
  *   Encoding tables for such module should be produced from glibc's CHARMAPs
diff --git a/lib/util/charset/convert_string.c b/lib/util/charset/convert_string.c
new file mode 100644
index 0000000..86bb625
--- /dev/null
+++ b/lib/util/charset/convert_string.c
@@ -0,0 +1,467 @@
+/*
+   Unix SMB/CIFS implementation.
+   Character set conversion Extensions
+   Copyright (C) Igor Vergeichik <iverg at mail.ru> 2001
+   Copyright (C) Andrew Tridgell 2001
+   Copyright (C) Simo Sorce 2001
+   Copyright (C) Martin Pool 2003
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+*/
+#include "includes.h"
+
+/**
+ * @file
+ *
+ * @brief Character-set conversion routines built on our iconv.
+ *
+ * @note Samba's internal character set (at least in the 3.0 series)
+ * is always the same as the one for the Unix filesystem.  It is
+ * <b>not</b> necessarily UTF-8 and may be different on machines that
+ * need i18n filenames to be compatible with Unix software.  It does
+ * have to be a superset of ASCII.  All multibyte sequences must start
+ * with a byte with the high bit set.
+ *
+ * @sa lib/iconv.c
+ */
+
+
+/**
+ * Convert string from one encoding to another, making error checking etc
+ * Slow path version - uses (slow) iconv.
+ *
+ * @param src pointer to source string (multibyte or singlebyte)
+ * @param srclen length of the source string in bytes
+ * @param dest pointer to destination string (multibyte or singlebyte)
+ * @param destlen maximal length allowed for string
+ * @param converted size is the number of bytes occupied in the destination
+ *
+ * @returns false and sets errno on fail, true on success.
+ *
+ * Ensure the srclen contains the terminating zero.
+ *
+ **/
+
+static bool convert_string_internal(charset_t from, charset_t to,
+		      void const *src, size_t srclen,
+		      void *dest, size_t destlen, size_t *converted_size)
+{
+	size_t i_len, o_len;
+	size_t retval;
+	const char* inbuf = (const char*)src;
+	char* outbuf = (char*)dest;
+	smb_iconv_t descriptor;
+	struct smb_iconv_handle *ic;
+
+	lazy_initialize_conv();
+	ic = get_iconv_handle();
+	descriptor = get_conv_handle(ic, from, to);
+
+	if (srclen == (size_t)-1) {
+		if (from == CH_UTF16LE || from == CH_UTF16BE) {
+			srclen = (strlen_w((const smb_ucs2_t *)src)+1) * 2;
+		} else {
+			srclen = strlen((const char *)src)+1;
+		}
+	}
+
+
+	if (descriptor == (smb_iconv_t)-1 || descriptor == (smb_iconv_t)0) {
+		errno = EINVAL;
+		return false;
+	}
+
+	i_len=srclen;
+	o_len=destlen;
+
+	retval = smb_iconv(descriptor, &inbuf, &i_len, &outbuf, &o_len);
+	if (retval == (size_t)-1) {
+		return false;
+	}
+	*converted_size = destlen-o_len;
+	return true;
+}
+
+/**
+ * Convert string from one encoding to another, making error checking etc
+ * Fast path version - handles ASCII first.
+ *
+ * @param src pointer to source string (multibyte or singlebyte)
+ * @param srclen length of the source string in bytes, or -1 for nul terminated.
+ * @param dest pointer to destination string (multibyte or singlebyte)
+ * @param destlen maximal length allowed for string - *NEVER* -1.
+ * @param converted size is the number of bytes occupied in the destination
+ *
+ * @returns false and sets errno on fail, true on success.
+ *
+ * Ensure the srclen contains the terminating zero.
+ *
+ * This function has been hand-tuned to provide a fast path.
+ * Don't change unless you really know what you are doing. JRA.
+ **/
+
+bool convert_string_error(charset_t from, charset_t to,
+			    void const *src, size_t srclen,
+			    void *dest, size_t destlen,
+			    size_t *converted_size)
+{
+	/*
+	 * NB. We deliberately don't do a strlen here if srclen == -1.
+	 * This is very expensive over millions of calls and is taken
+	 * care of in the slow path in convert_string_internal. JRA.
+	 */
+
+#ifdef DEVELOPER
+	SMB_ASSERT(destlen != (size_t)-1);
+#endif
+
+	if (srclen == 0) {
+		*converted_size = 0;
+		return true;
+	}
+
+	if (from != CH_UTF16LE && from != CH_UTF16BE && to != CH_UTF16LE && to != CH_UTF16BE) {
+		const unsigned char *p = (const unsigned char *)src;
+		unsigned char *q = (unsigned char *)dest;
+		size_t slen = srclen;
+		size_t dlen = destlen;
+		unsigned char lastp = '\0';
+		size_t retval = 0;
+
+		/* If all characters are ascii, fast path here. */
+		while (slen && dlen) {
+			if ((lastp = *p) <= 0x7f) {
+				*q++ = *p++;
+				if (slen != (size_t)-1) {
+					slen--;
+				}
+				dlen--;
+				retval++;
+				if (!lastp)
+					break;
+			} else {
+#ifdef BROKEN_UNICODE_COMPOSE_CHARACTERS
+				goto general_case;
+#else
+				bool ret = convert_string_internal(from, to, p, slen, q, dlen, converted_size);
+				*converted_size += retval;
+				return ret;
+#endif
+			}
+		}
+
+		*converted_size = retval;
+
+		if (!dlen) {
+			/* Even if we fast path we should note if we ran out of room. */
+			if (((slen != (size_t)-1) && slen) ||
+					((slen == (size_t)-1) && lastp)) {
+				errno = E2BIG;
+				return false;
+			}
+		}
+		return true;
+	} else if (from == CH_UTF16LE && to != CH_UTF16LE) {
+		const unsigned char *p = (const unsigned char *)src;
+		unsigned char *q = (unsigned char *)dest;
+		size_t retval = 0;
+		size_t slen = srclen;
+		size_t dlen = destlen;
+		unsigned char lastp = '\0';
+
+		/* If all characters are ascii, fast path here. */
+		while (((slen == (size_t)-1) || (slen >= 2)) && dlen) {
+			if (((lastp = *p) <= 0x7f) && (p[1] == 0)) {
+				*q++ = *p;
+				if (slen != (size_t)-1) {
+					slen -= 2;
+				}
+				p += 2;
+				dlen--;
+				retval++;
+				if (!lastp)
+					break;
+			} else {
+#ifdef BROKEN_UNICODE_COMPOSE_CHARACTERS
+				goto general_case;
+#else
+				bool ret = convert_string_internal(from, to, p, slen, q, dlen, converted_size);
+				*converted_size += retval;
+				return ret;
+#endif
+			}
+		}
+
+		*converted_size = retval;
+
+		if (!dlen) {
+			/* Even if we fast path we should note if we ran out of room. */
+			if (((slen != (size_t)-1) && slen) ||
+					((slen == (size_t)-1) && lastp)) {
+				errno = E2BIG;
+				return false;
+			}
+		}
+		return true;
+	} else if (from != CH_UTF16LE && from != CH_UTF16BE && to == CH_UTF16LE) {
+		const unsigned char *p = (const unsigned char *)src;
+		unsigned char *q = (unsigned char *)dest;
+		size_t retval = 0;
+		size_t slen = srclen;
+		size_t dlen = destlen;
+		unsigned char lastp = '\0';
+
+		/* If all characters are ascii, fast path here. */
+		while (slen && (dlen >= 2)) {
+			if ((lastp = *p) <= 0x7F) {
+				*q++ = *p++;
+				*q++ = '\0';
+				if (slen != (size_t)-1) {
+					slen--;
+				}
+				dlen -= 2;
+				retval += 2;
+				if (!lastp)
+					break;
+			} else {
+#ifdef BROKEN_UNICODE_COMPOSE_CHARACTERS
+				goto general_case;
+#else
+				bool ret = convert_string_internal(from, to, p, slen, q, dlen, converted_size);
+				*converted_size += retval;
+				return ret;
+#endif
+			}
+		}
+
+		*converted_size = retval;
+
+		if (!dlen) {
+			/* Even if we fast path we should note if we ran out of room. */
+			if (((slen != (size_t)-1) && slen) ||
+					((slen == (size_t)-1) && lastp)) {
+				errno = E2BIG;
+				return false;
+			}
+		}
+		return true;
+	}
+
+#ifdef BROKEN_UNICODE_COMPOSE_CHARACTERS
+  general_case:
+#endif
+	return convert_string_internal(from, to, src, srclen, dest, destlen, converted_size);
+}
+
+bool convert_string(charset_t from, charset_t to,
+		      void const *src, size_t srclen,
+		      void *dest, size_t destlen,
+		      size_t *converted_size)
+{
+	bool ret = convert_string_error(from, to, src, srclen, dest, destlen, converted_size);
+
+	if(ret==false) {
+		const char *reason="unknown error";
+		switch(errno) {
+			case EINVAL:
+				reason="Incomplete multibyte sequence";
+				DEBUG(3,("convert_string_internal: Conversion error: %s(%s)\n",
+					 reason, (const char *)src));
+				break;
+			case E2BIG:
+			{
+				struct smb_iconv_handle *ic;
+				lazy_initialize_conv();
+				ic = get_iconv_handle();
+
+				reason="No more room";
+				if (from == CH_UNIX) {
+					DEBUG(3,("E2BIG: convert_string(%s,%s): srclen=%u destlen=%u - '%s'\n",
+						 charset_name(ic, from), charset_name(ic, to),
+						 (unsigned int)srclen, (unsigned int)destlen, (const char *)src));
+				} else {
+					DEBUG(3,("E2BIG: convert_string(%s,%s): srclen=%u destlen=%u\n",
+						 charset_name(ic, from), charset_name(ic, to),
+						 (unsigned int)srclen, (unsigned int)destlen));
+				}
+				break;
+			}
+			case EILSEQ:
+				reason="Illegal multibyte sequence";
+				DEBUG(3,("convert_string_internal: Conversion error: %s(%s)\n",
+					 reason, (const char *)src));
+				break;
+			default:
+				DEBUG(0,("convert_string_internal: Conversion error: %s(%s)\n",
+					 reason, (const char *)src));
+				break;
+		}
+		/* smb_panic(reason); */
+	}
+	return ret;
+}
+
+
+/**
+ * Convert between character sets, allocating a new buffer using talloc for the result.
+ *
+ * @param srclen length of source buffer.
+ * @param dest always set at least to NULL
+ * @parm converted_size set to the number of bytes occupied by the string in
+ * the destination on success.
+ * @note -1 is not accepted for srclen.
+ *
+ * @return true if new buffer was correctly allocated, and string was
+ * converted.
+ *
+ * Ensure the srclen contains the terminating zero.
+ *
+ * I hate the goto's in this function. It's embarressing.....
+ * There has to be a cleaner way to do this. JRA.
+ */
+bool convert_string_talloc(TALLOC_CTX *ctx, charset_t from, charset_t to,
+			   void const *src, size_t srclen, void *dst,
+			   size_t *converted_size)
+
+{
+	size_t i_len, o_len, destlen = (srclen * 3) / 2;
+	size_t retval;
+	const char *inbuf = (const char *)src;
+	char *outbuf = NULL, *ob = NULL;
+	smb_iconv_t descriptor;
+	void **dest = (void **)dst;
+	struct smb_iconv_handle *ic;
+
+	*dest = NULL;
+
+	if (src == NULL || srclen == (size_t)-1) {
+		errno = EINVAL;
+		return false;
+	}
+
+	if (srclen == 0) {
+		/* We really should treat this as an error, but
+		   there are too many callers that need this to
+		   return a NULL terminated string in the correct
+		   character set. */
+		if (to == CH_UTF16LE|| to == CH_UTF16BE || to == CH_UTF16MUNGED) {
+			destlen = 2;
+		} else {
+			destlen = 1;
+		}
+		ob = talloc_zero_array(ctx, char, destlen);
+		if (ob == NULL) {
+			errno = ENOMEM;
+			return false;
+		}
+		*converted_size = destlen;
+		*dest = ob;
+		return true;
+	}
+
+	lazy_initialize_conv();
+	ic = get_iconv_handle();
+	descriptor = get_conv_handle(ic, from, to);
+
+	if (descriptor == (smb_iconv_t)-1 || descriptor == (smb_iconv_t)0) {
+		DEBUG(0,("convert_string_talloc: Conversion not supported.\n"));
+		errno = EOPNOTSUPP;
+		return false;
+	}
+
+  convert:
+
+	/* +2 is for ucs2 null termination. */
+	if ((destlen*2)+2 < destlen) {
+		/* wrapped ! abort. */
+		DEBUG(0, ("convert_string_talloc: destlen wrapped !\n"));
+		TALLOC_FREE(outbuf);
+		errno = EOPNOTSUPP;
+		return false;
+	} else {
+		destlen = destlen * 2;
+	}
+
+	/* +2 is for ucs2 null termination. */
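
Illustration (not part of the changeset): the convert_string_error() hunk
above copies ASCII bytes directly and hands the remainder to the iconv-backed
slow path as soon as it meets a byte above 0x7f. The sketch below reduces
that idea to the single-byte to single-byte case with explicit lengths only;
it does not handle the srclen == -1 mode or the UTF-16 branches. The names
copy_ascii_fast(), slow_path_fn and stub_slow_path() are hypothetical, and
the stub merely rejects its input so the example stays self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type standing in for the iconv-backed slow path. */
typedef bool (*slow_path_fn)(const unsigned char *src, size_t srclen,
			     unsigned char *dest, size_t destlen,
			     size_t *converted_size);

/* Stub slow path for the example: converts nothing and reports EILSEQ. */
bool stub_slow_path(const unsigned char *src, size_t srclen,
		    unsigned char *dest, size_t destlen,
		    size_t *converted_size)
{
	(void)src; (void)srclen; (void)dest; (void)destlen;
	*converted_size = 0;
	errno = EILSEQ;
	return false;
}

/*
 * Copy leading ASCII bytes directly; on the first non-ASCII byte, delegate
 * the rest of the input to 'slow' and add its byte count to our own.
 */
bool copy_ascii_fast(const unsigned char *src, size_t srclen,
		     unsigned char *dest, size_t destlen,
		     size_t *converted_size, slow_path_fn slow)
{
	size_t done = 0;

	assert(converted_size != NULL && slow != NULL);

	while (done < srclen && done < destlen) {
		if (src[done] > 0x7f) {
			size_t tail = 0;
			bool ok = slow(src + done, srclen - done,
				       dest + done, destlen - done, &tail);
			*converted_size = done + tail;
			return ok;
		}
		dest[done] = src[done];
		done++;
	}

	*converted_size = done;
	if (done < srclen) {
		/* Even on the fast path, note that we ran out of room. */
		errno = E2BIG;
		return false;
	}
	return true;
}
```

Passing the slow path as a parameter is a choice made for this sketch so it
can be exercised without iconv; the patch calls convert_string_internal()
directly.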


-- 
Samba Shared Repository

