[PATCH] speed up check_dos_char

Martin Pool mbp at samba.org
Fri Apr 4 06:19:32 GMT 2003


check_dos_char is called a *lot* and is very slow in 3.0/head.

This patch changes it to the same design as toupper, etc.: calculate a
lookup table once, based on the configured parameters (in this patch,
lazily on the first call), and then consult that table on each call.
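
As a minimal standalone sketch of that design (not the Samba code itself):
the names slow_is_valid, bit_table, build_bit_table and fast_is_valid are
hypothetical, and the "slow" check is a stand-in that simply accepts
printable ASCII instead of doing a real charset round trip.

```c
#include <stdint.h>

/* Hypothetical stand-in for the expensive per-character check.
 * Here printable ASCII counts as "valid"; the real check would
 * convert the character to the DOS codepage and back. */
static int slow_is_valid(uint16_t c)
{
	return c >= 0x20 && c < 0x7f;
}

/* One bit per 16-bit code point: 65536 / 8 = 8192 bytes. */
static uint8_t bit_table[8192];

/* Run the slow check once per code point and pack the answers. */
static void build_bit_table(void)
{
	uint32_t c;

	for (c = 0; c <= 0xffff; c++) {
		uint8_t mask = (uint8_t)(1u << (c & 7));

		if (slow_is_valid((uint16_t)c))
			bit_table[c >> 3] |= mask;
		else
			bit_table[c >> 3] &= (uint8_t)~mask;
	}
}

/* Constant-time lookup: select the byte, then the bit within it. */
static int fast_is_valid(uint16_t c)
{
	return (bit_table[c >> 3] >> (c & 7)) & 1;
}
```

The table costs 8 KB and 65536 slow calls up front; every later query is
one shift, one load and one mask.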

I just realized that the table in fact needs to be recalculated when
Samba is reconfigured, but I thought I'd post the patch anyway and see
if there were any other comments.
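
One possible way to handle reconfiguration, sketched under assumptions:
keep a "ready" flag that a reload hook clears, so the next lookup rebuilds
the table. Everything here is hypothetical (cfg_max_valid stands in for
the configured codepage, on_reconfigure for whatever hook the reload path
would call); it is not how the patch below does it.

```c
#include <stdint.h>
#include <string.h>

static uint8_t cfg_table[8192];   /* one bit per 16-bit code point */
static int cfg_table_ready = 0;   /* cleared whenever config changes */

/* Hypothetical configuration value the slow check depends on. */
static uint16_t cfg_max_valid = 0x7e;

/* Stand-in for the expensive, configuration-dependent check. */
static int cfg_slow_is_valid(uint16_t c)
{
	return c >= 0x20 && c <= cfg_max_valid;
}

/* Rebuild the whole table from scratch for the current config. */
static void cfg_rebuild_table(void)
{
	uint32_t c;

	memset(cfg_table, 0, sizeof(cfg_table));
	for (c = 0; c <= 0xffff; c++) {
		if (cfg_slow_is_valid((uint16_t)c))
			cfg_table[c >> 3] |= (uint8_t)(1u << (c & 7));
	}
	cfg_table_ready = 1;
}

/* Hypothetical hook: call this from the config reload path. */
static void on_reconfigure(uint16_t new_max_valid)
{
	cfg_max_valid = new_max_valid;
	cfg_table_ready = 0;   /* force a rebuild on the next lookup */
}

/* Lookup that rebuilds lazily if the table is stale. */
static int cfg_is_valid(uint16_t c)
{
	if (!cfg_table_ready)
		cfg_rebuild_table();
	return (cfg_table[c >> 3] >> (c & 7)) & 1;
}
```

Invalidating rather than rebuilding inside the hook keeps a reload cheap
when no lookups follow it, at the cost of one extra branch per lookup.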

I checked with a simple test case that this does not change the
behaviour at least for the default codepage/charset.
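
A check of this kind can be made exhaustive, since there are only 65536
possible inputs. The following is a sketch of such a comparison, not the
test case actually used; count_mismatches and the three example
predicates are hypothetical names introduced for illustration.

```c
#include <stdint.h>

typedef int (*char_pred)(uint16_t);

/* Count the code points on which two predicates disagree.
 * Results are normalised to 0/1 so any non-zero value is "true". */
static uint32_t count_mismatches(char_pred a, char_pred b)
{
	uint32_t c, n = 0;

	for (c = 0; c <= 0xffff; c++) {
		if ((a((uint16_t)c) != 0) != (b((uint16_t)c) != 0))
			n++;
	}
	return n;
}

/* Reference predicate: printable ASCII. */
static int ref_pred(uint16_t c)
{
	return c >= 0x20 && c < 0x7f;
}

/* Same set, computed differently (single unsigned range test). */
static int alt_pred(uint16_t c)
{
	return (uint16_t)(c - 0x20) < 0x5f;
}

/* Deliberately wrong: also accepts 0x7f. */
static int off_by_one_pred(uint16_t c)
{
	return c >= 0x20 && c <= 0x7f;
}
```

In practice the two arguments would be the old and new implementations,
and the comparison would need repeating for each codepage of interest.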


--- util_unistr.c.~1.92.2.7.~	2003-04-04 13:29:44.000000000 +1000
+++ util_unistr.c	2003-04-04 15:42:18.000000000 +1000
@@ -31,6 +31,14 @@ static smb_ucs2_t *upcase_table;
 static smb_ucs2_t *lowcase_table;
 static uint8 *valid_table;
 
+/**
+ * This table says which Unicode characters are valid DOS
+ * characters.
+ *
+ * Each value is just a single bit.
+ **/
+static uint8 doschar_table[8192]; /* 65536 characters / 8 bits/byte */
+
 
 /*******************************************************************
 load the case handling tables
@@ -85,6 +93,21 @@ void load_case_tables(void)
 */
 int check_dos_char(smb_ucs2_t c)
 {
+	static int initialized = False;
+
+	if (!initialized) {
+		initialized = True;
+		init_doschar_table();
+	}
+	
+	/* Find the right byte, and right bit within the byte; return
+	 * 1 or 0 */
+	return (doschar_table[(c & 0xffff) / 8] & (1 << (c & 7))) != 0;
+}
+
+
+static int check_dos_char_slowly(smb_ucs2_t c)
+{
 	char buf[10];
 	smb_ucs2_t c2 = 0;
 	int len1, len2;
@@ -95,6 +118,31 @@ int check_dos_char(smb_ucs2_t c)
 	return (c == c2);
 }
 
+
+/**
+ * Fill out doschar table the hard way, by examining each character
+ **/
+void init_doschar_table(void)
+{
+	int i, j, byteval;
+
+	/* For each byte of packed table */
+	
+	for (i = 0; i <= 0xffff; i += 8) {
+		byteval = 0;
+		for (j = 0; j <= 7; j++) {
+			smb_ucs2_t c;
+
+			c = i + j;
+			
+			if (check_dos_char_slowly(c))
+				byteval |= 1 << j;
+		}
+		doschar_table[i/8] = byteval;
+	}
+}
+
+
 /**
  * Load the valid character map table from <tt>valid.dat</tt> or
  * create from the configured codepage.


-- 
Martin 



