Mac OS X - compilation experiences and issues

Benjamin Riefenstahl Benjamin.Riefenstahl at epost.de
Fri Sep 12 13:25:26 GMT 2003


Hi Jeremy,


Sorry to be persistent.  I did a quick code review, to see how much of
a problem I actually have here.  I thought at the least I should tell
you about the results.  I think I have also noted two bugs, if you are
interested.


Premises
--------

The problem exists in those places, were the fast-path code does a
part of the work and than, when it encounters 8-bit chars, it hands
over to the generic, 8-bit-clean code, trying to keep the work that it
has already done.  There is no problem, if the 8-bit code restarts the
whole operation, the problem only exists if part of the work is done
by the fast-path and a later part by the 8-bit code.

Investigation and proposed changes
----------------------------------

I have investigated all functions that call convert_string(),
push_ucs2() and/or pull_ucs2().  There seem to be only three functions
which fit the above description, all the others that I have seen
either don't have a fast-path part at all, or they do a complete
restart for 8-bit.  The "problematic" functions are convert_string()
(lib/charcnv.c), string_replace() (lib/util_str.c) and strchr_m()
(lib/util_str.c).

All these functions can be converted to use a complete restart for
8-bit.  That costs a few cycles more for 8-bit strings, but as in the
8-bit case a number of functions are called anyway to get the result,
I don't think that is much of a problem.

Note that the changes I propose do not significantly complicate the
code IMO.  It's mostly just a question of using copies of the input
parameters in the fast-path code instead of changing the inputs.

Problems remaining
------------------

One problem is in string_replace().  It changes its input in-place, so
it could in theory change the first byte of a character combination
before handing over to the 8-bit code.  Luckily it is always called
with literal characters as input, so I could easily check this.  Most
of its calls are for '/', '\\', '"', which are no problem, because
these characters don't occur in pre-combined characters.  There is one
case of converting a space.  Space of course does occur in
compositions.  But this call is for SMB service names, which we can
probably assume to use ASCII only.

Another potential problem: convert_string() could be used with input
and output pointing to the same memory, which would result in
destroying the input before the restart.  If that actually occurs in
the code, the fast-path code could be disabled for this case by adding
another check to the if/else chain.  Of course such usage can also
just be made forbidden, in that case we can just add an appropiate
assert().

Bugs
----

Probable bugs that I have seen:

- convert_string() doesn't fixup srclen in the generic part (below the
  fast-path).  Doing this as a first thing before the fast-path code
  also means you don't have to check for srclen==-1 at every step in
  that code.

- I think the check for 8-bit in the while condition in strchr_m()
  should be negated, i.e. you want 

      while (!at_end && !8_bit)
          fast_path


A patch is attached, in case you are interested.  I tested this
quickly and it works here with my test cases.


benny


-------------- next part --------------
A non-text attachment was scrubbed...
Name: samba-fast-path.patch
Type: text/x-patch
Size: 4495 bytes
Desc: not available
Url : http://lists.samba.org/archive/samba-technical/attachments/20030912/27d8822e/samba-fast-path.bin


More information about the samba-technical mailing list