[PATCH] RFC: shell-safe version of string_sub()?

Peter Samuelson peter at cadcamlab.org
Wed Oct 25 21:43:25 GMT 2000


> Also, it's a very good idea to use "$@" in [Bourne, Korn, Bash] shell
> scripts when referencing the script's [or function's] arguments. Few
> people know the difference between "$@" and "$*" and $*...

Yup, that's the biggest offender I see: misuse of $*.
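
A quick sketch of the difference, assuming the default IFS (count_words
and demo are made-up names, used only for illustration):

```shell
#!/bin/sh
# count_words is a hypothetical helper: it prints how many arguments it got.
count_words() { echo "$#"; }

demo() {
    # Unquoted $*: each argument is split again on whitespace (and globbed).
    count_words $*
    # "$*": all arguments are joined into one single word.
    count_words "$*"
    # "$@": each argument is passed through unchanged.
    count_words "$@"
}

# Two arguments, the first containing a space.
demo "one two" three   # prints 3, then 1, then 2
```

Only the "$@" form reports the two arguments that were actually passed.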

> E.g., 'for i in $*' is _bad_, use 'for i in "$@"' instead...

Little-known fact: if you omit the "in" part, it defaults to "$@".  So
'for i; do ... done' is short for 'for i in "$@"; do ... done'.  I'm
not sure if this is true in early Bourne shells, but it works in any
modern one.
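
To illustrate, here is a minimal sketch (the function names implicit and
explicit are hypothetical); in the shells I would expect it to be run
with, such as bash or dash, both loops should behave identically:

```shell
#!/bin/sh
# implicit: no "in" list, so the loop walks the positional parameters.
implicit() {
    for i; do
        printf '<%s>\n' "$i"
    done
}

# explicit: the long form that the short one stands for.
explicit() {
    for i in "$@"; do
        printf '<%s>\n' "$i"
    done
}

implicit "one two" three
explicit "one two" three
```

Each call prints <one two> and <three> on separate lines, so the argument
containing a space survives intact in both forms.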

> I was thinking more about UTF-8. Unix syscalls all use char * for
> string arguments; where they use unsigned char * UTF-8 can be used

OK, well, I don't know the details of UTF-8, although I think I know the
basic idea (a variable-length encoding, loosely like Huffman coding in
that some characters take fewer bytes than others).  However, as long as
it never uses bytes like \0 (which you say it doesn't), I believe it
should be safe for sh_string_sub().  That assumes your shell doesn't have
a problem with 8-bit characters, which it might; I haven't
checked.  (bash, as an example, has two input modes in interactive
mode: either it can register 8-bit characters as themselves, or as key
sequences involving 'meta' keys.  I assume that in non-interactive mode
it always inputs them as regular 8-bit characters ... but I'm not
sure.)
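
One way to look at the bytes involved is sketched below. It assumes od,
tr and sed are available; utf8_bytes is a made-up helper name, and the
sample string is just "caf" followed by the two-byte UTF-8 encoding of
e-acute, written as octal escapes:

```shell
#!/bin/sh
# utf8_bytes (hypothetical helper): print each byte of $1 in hex, one per line.
utf8_bytes() {
    printf '%s' "$1" | od -An -v -tx1 | tr -s ' ' '\n' | sed '/^$/d'
}

# "caf" plus e-acute, whose UTF-8 encoding is the bytes 0xC3 0xA9.
sample=$(printf 'caf\303\251')

utf8_bytes "$sample"
```

The ASCII letters come out as 63 61 66 and the accented character as
c3 a9: both of its bytes have the high bit set, and no 00 byte appears.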

Peter




More information about the samba-technical mailing list