[linux-cifs-client] [PATCH 01/10] cifs: add function to get length of NULL termination in bytes

Wed Apr 29 14:12:58 GMT 2009

On Wed, 29 Apr 2009 08:58:22 -0500
Shirish Pargaonkar <shirishpargaonkar at gmail.com> wrote:

> On Wed, Apr 29, 2009 at 8:29 AM, Jeff Layton <jlayton at redhat.com> wrote:
> > The null terminator for a charset may be a single byte or several bytes
> > long. Add a function that tells us how long it should be.
> >
> > Signed-off-by: Jeff Layton <jlayton at redhat.com>
> > ---
> >  fs/cifs/cifs_unicode.h |   19 +++++++++++++++++++
> >  1 files changed, 19 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/cifs/cifs_unicode.h b/fs/cifs/cifs_unicode.h
> > index 14eb9a2..6bffab5 100644
> > --- a/fs/cifs/cifs_unicode.h
> > +++ b/fs/cifs/cifs_unicode.h
> > @@ -64,6 +64,25 @@ int cifs_strtoUCS(__le16 *, const char *, int, const struct nls_table *);
> >  #endif
> >
> >  /*
> > + * null_charlen - return length of null character for codepage
> > + * @codepage - codepage for which to return length of NULL terminator
> > + *
> > + * Since we can't guarantee that the null terminator will be a particular
> > + * length, we have to check against the codepage. If there's a problem
> > + * determining it, assume a single-byte NULL terminator.
> > + */
> > +static inline int
> > +null_charlen(const struct nls_table *codepage)
> > +{
> > +       int charlen;
> > +       char tmp[NLS_MAX_CHARSET_SIZE];
> > +
> > +       charlen = codepage->uni2char(0, tmp, NLS_MAX_CHARSET_SIZE);
> > +
> > +       return charlen > 0 ? charlen : 1;
> > +}
> > +
> > +/*
> >  * UniStrcat:  Concatenate the second string to the first
> >  *
> >  * Returns:
> > --
> > 1.6.0.6
> >
> 
> For some of the charsets I looked at under fs/nls, uni2char looks like it
> always returns 1, I think to indicate success rather than an error.
> Did you look at any charsets whose uni2char function returns a length
> greater than 1 byte for the null character?

No, I haven't seen any, but I didn't do an exhaustive search. Given the
number of problems we've had in this area, I'm leery of making any
assumptions about these lengths. It's also possible that at some point
in the future we could have an in-kernel version of UTF-16 or UTF-32.
In that event we'll need to deal with multibyte null termination.

So I think it makes sense to use a helper function for determining this
rather than sprinkling "+1" to lengths all over the code. The overhead
looks pretty minimal anyway.
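
To make that concrete, here is a rough userspace sketch of the idea, not
code from the patch series. The struct nls_table below is a cut-down
stand-in for the kernel's, and the three mock tables (including the
UTF-16LE one, which does not exist in the kernel today) are hypothetical.
terminated_size() is also an invented name, used only to show what a
caller would write instead of "len + 1".

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match the kernel's definition in include/linux/nls.h. */
#define NLS_MAX_CHARSET_SIZE 6

/* Cut-down stand-in for the kernel's struct nls_table (assumption). */
struct nls_table {
	int (*uni2char)(unsigned short uni, unsigned char *out, int boundlen);
};

/* Mock single-byte charset: each encodable character takes one byte. */
static int mock_sbcs_uni2char(unsigned short uni, unsigned char *out,
			      int boundlen)
{
	if (boundlen < 1 || uni > 0xff)
		return -1;	/* the kernel would return a negative errno */
	out[0] = (unsigned char)uni;
	return 1;
}

/* Hypothetical UTF-16LE charset: each BMP character takes two bytes. */
static int mock_utf16le_uni2char(unsigned short uni, unsigned char *out,
				 int boundlen)
{
	if (boundlen < 2)
		return -1;
	out[0] = (unsigned char)(uni & 0xff);
	out[1] = (unsigned char)(uni >> 8);
	return 2;
}

/* Mock charset whose conversion always fails, to exercise the fallback. */
static int mock_broken_uni2char(unsigned short uni, unsigned char *out,
				int boundlen)
{
	(void)uni; (void)out; (void)boundlen;
	return -1;
}

static const struct nls_table mock_sbcs = { mock_sbcs_uni2char };
static const struct nls_table mock_utf16le = { mock_utf16le_uni2char };
static const struct nls_table mock_broken = { mock_broken_uni2char };

/* Same logic as the helper in the patch: ask the codepage, else assume 1. */
static int null_charlen(const struct nls_table *codepage)
{
	unsigned char tmp[NLS_MAX_CHARSET_SIZE];
	int charlen = codepage->uni2char(0, tmp, NLS_MAX_CHARSET_SIZE);

	return charlen > 0 ? charlen : 1;
}

/* Hypothetical caller: bytes needed for converted text plus terminator. */
static size_t terminated_size(const struct nls_table *codepage,
			      size_t payload_len)
{
	assert(codepage != NULL);
	return payload_len + (size_t)null_charlen(codepage);
}
```

With tables like these, a caller sizing a buffer for 10 bytes of converted
text would ask for terminated_size(&mock_utf16le, 10) rather than
hardcoding 10 + 1, and the single-byte and failing cases both fall back to
a one-byte terminator.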

-- 
Jeff Layton <jlayton at redhat.com>