[PATCH] RFC: shell-safe version of string_sub()?

Steve Langasek vorlon at netexpress.net
Wed Oct 25 22:22:28 GMT 2000

On Wed, 25 Oct 2000, Peter Samuelson wrote:

> > I was thinking more about UTF-8. Unix syscalls all use char * for
> > string arguments; where they use unsigned char * UTF-8 can be used

> OK, well, I don't know the details of UTF-8 although I think I know the
> basic idea (variable-length encoding, like Huffman compression).
> However, unless it uses things like \0 (which you say it doesn't), I
> believe it should be safe for sh_string_sub().  Assuming your shell
> doesn't have a problem with 8-bit characters, which it might, I haven't
> checked.  (bash, as an example, has two input modes in interactive
> mode: either it can register 8-bit characters as themselves, or as key
> sequences involving 'meta' keys.  I assume that in non-interactive mode
> it always inputs them as regular 8-bit characters ... but I'm not
> sure.)

UTF-8 is explicitly designed to be backwards-compatible with 8-bit-clean
ASCII string-handling functions. No problems with \0, and most control
characters are also left out so they never represent anything other than
themselves.
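
To make the point concrete, here is a rough sketch (not the patch under
discussion, and the helper names is_ascii_byte() and count_shell_meta()
are made up for illustration). It relies on the fact that every byte of a
UTF-8 multi-byte sequence has its high bit set, so a byte-wise scan for
ASCII shell metacharacters cannot match in the middle of a multi-byte
character. Which characters count as "meta" is left to the caller here,
since that is an assumption on my part and not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Samba: returns 1 for an ASCII byte
 * (0x00-0x7F) and 0 for a byte belonging to a UTF-8 multi-byte sequence
 * (lead and continuation bytes are all >= 0x80). */
static int is_ascii_byte(unsigned char c)
{
    return c < 0x80;
}

/* Hypothetical helper: counts the bytes in s that match one of the ASCII
 * characters listed in meta. The scan is purely byte-wise; bytes with the
 * high bit set are skipped, so multi-byte characters are never matched. */
static size_t count_shell_meta(const char *s, const char *meta)
{
    size_t n = 0;

    assert(s != NULL && meta != NULL);
    for (; *s != '\0'; s++) {
        if (is_ascii_byte((unsigned char)*s) && strchr(meta, *s) != NULL)
            n++;
    }
    return n;
}
```

With a meta set of ";`$", the UTF-8 string "caf\xc3\xa9; rm" yields one
match (the semicolon); the two bytes encoding the accented letter are
passed over untouched.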

Steve Langasek
postmodern programmer

More information about the samba-technical mailing list