skip_multibyte_char()

Jeremy Allison jallison at whistle.com
Tue Apr 7 22:47:34 GMT 1998


Christopher R. Hertel wrote:

> Almost done with mangle.c.  I have a question:  Can mangled names ever
> contain multi-byte characters?
> 

I think they can. One character that will never be part of a
multibyte character, though, is the dot '.' - the second bytes of
multibyte characters all start at 0x40 or above (dot is 0x2e).
So although there's nothing to stop the filename extension being
multibyte, you can unambiguously find the dot at the start of an
extension without having to do a multibyte string walk.

However, just treating the multibyte extension as 'n'
bytes also works - as the 8.3 extension limit is on *bytes*,
not characters.
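
As a rough illustration of the two points above, here is a minimal
sketch that finds the extension with a plain byte search and measures
it in bytes. The function names are made up for this example and are
not taken from mangle.c. It relies on the assumption stated above,
that '.' (0x2e) never occurs as the second byte of a multibyte
character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the extension (the bytes after the last '.'),
 * or NULL if the name contains no dot.  A plain byte search is used,
 * on the assumption that '.' (0x2e) is never the second byte of a
 * multibyte character. */
static const char *find_extension(const char *name)
{
    const char *dot;

    assert(name != NULL);
    dot = strrchr(name, '.');
    return dot ? dot + 1 : NULL;
}

/* Length of the extension in bytes (not characters); 0 if none. */
static size_t extension_bytes(const char *name)
{
    const char *ext = find_extension(name);

    return ext ? strlen(ext) : 0;
}

/* Non-zero if the name has an extension that fits the 3-byte limit
 * of an 8.3 name.  The limit is applied to bytes, so a multibyte
 * extension is simply treated as 'n' bytes. */
static int extension_fits_83(const char *name)
{
    const char *ext = find_extension(name);

    return ext != NULL && strlen(ext) <= 3;
}
```

With this sketch, an extension made of two 2-byte characters counts
as four bytes, so extension_fits_83() rejects it even though it is
only two characters long.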

> In the original, Andrew does a simple test:  If there is an extension
> on the *long* name and it is no more than three (lower case) characters
> long, then he strips off the extension.  In the reverse mapping, then,
> he can reverse map a base name and add the short (dos-approved)
> extension back on.  The result is that the cache contains a "group"
> reverse map, which can be used for a whole set of names (eg.
> 'mangled.*').
> 
>   p = strrchr( mangled_stack[0], '.' );
>   if( p && (!strhasupper(p+1)) && (strlen(p+1) < (size_t)4) )
>     *p = 0;
> 
> Anyway, I'm being a tiny bit more careful, and saying that the
> extension of the mangled name (converted to lower case) *must match*
> the extension of the long name.  This is correct, basically, as the
> original test (above) implies the same thing, but does not test it
> explicitly.
> 
> Anyway, I just want to avoid being bitten by multi-byte characters, if
> they even *are* an issue (don't think so, but I always like to check).
> 
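
The explicit check described in the quoted text might look roughly
like the sketch below. The function name extensions_match() is
hypothetical and this is not the code from mangle.c. It assumes
single-byte (ASCII) extensions: lower-casing byte by byte with
tolower() could alter the second byte of a multibyte character, which
is exactly the case being asked about. Treating two names that both
lack an extension as a match is also just a choice made for the
example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return 1 if the extension of `mangled`, lower-cased byte by byte,
 * equals the extension of `long_name`; otherwise return 0.
 * Assumption: extensions are single-byte (ASCII) text. */
static int extensions_match(const char *mangled, const char *long_name)
{
    const char *m, *l;

    assert(mangled != NULL && long_name != NULL);
    m = strrchr(mangled, '.');
    l = strrchr(long_name, '.');

    /* If either name has no dot, match only when both have none. */
    if (m == NULL || l == NULL)
        return m == NULL && l == NULL;

    /* Step past the dots and compare the extensions byte by byte. */
    for (m++, l++; *m != '\0' && *l != '\0'; m++, l++) {
        if (tolower((unsigned char)*m) != (unsigned char)*l)
            return 0;
    }

    /* Both extensions must end at the same point. */
    return *m == '\0' && *l == '\0';
}
```

Because only the mangled side is lower-cased, a long name whose
extension contains upper case never matches, which mirrors the
strhasupper() test in the original.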

Even if a long filename does contain them, all you need
to do is call skip_multibyte_char() when walking
a filename - this essentially returns either zero, if
the current codepage contains no multibyte characters,
or the length of the next character (in bytes) in the
currently selected codepage, if that codepage does
contain multibyte characters.
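
A walk of that kind could be sketched as follows. The function
sketch_skip_multibyte_char() is a hypothetical stand-in written for
this example, not Samba's skip_multibyte_char(), and its signature is
an assumption. It hard-codes a Shift-JIS-like codepage in which lead
bytes are 0x81-0x9f and 0xe0-0xfc and every multibyte character is
two bytes long. The real function depends on the codepage selected at
run time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for skip_multibyte_char(); NOT the real
 * implementation.  Assumes a Shift-JIS-like codepage: lead bytes are
 * 0x81-0x9f and 0xe0-0xfc, each followed by exactly one more byte.
 * Returns 2 for a lead byte and 0 for an ordinary single byte. */
static size_t sketch_skip_multibyte_char(unsigned char c)
{
    if ((c >= 0x81 && c <= 0x9f) || (c >= 0xe0 && c <= 0xfc))
        return 2;
    return 0;
}

/* Count the characters (not bytes) in `name`, stepping over each
 * multibyte character as a whole. */
static size_t count_characters(const char *name)
{
    size_t count = 0;

    assert(name != NULL);
    while (*name != '\0') {
        size_t skip = sketch_skip_multibyte_char((unsigned char)*name);

        /* A lead byte at the very end of the string is truncated;
         * step over it alone so we never read past the terminator. */
        if (skip > 1 && name[1] == '\0')
            skip = 1;

        name += skip ? skip : 1;   /* 0 means "ordinary single byte" */
        count++;
    }
    return count;
}
```

The same loop shape works for any per-character operation: do the
work on the current position, then advance by the returned length, or
by one byte when zero is returned.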

Jeremy.

-- 
--------------------------------------------------------
Buying an operating system without source is like buying
a self-assembly Space Shuttle with no instructions.
--------------------------------------------------------

