[linux-cifs-client] [PATCH 03/10] cifs: add replacement for cifs_strtoUCS_le called cifs_utf16le_to_host
Shirish Pargaonkar
shirishpargaonkar at gmail.com
Wed Apr 29 15:26:40 GMT 2009
On Wed, Apr 29, 2009 at 8:29 AM, Jeff Layton <jlayton at redhat.com> wrote:
> Add a replacement function for cifs_strtoUCS_le. cifs_utf16le_to_host
> takes args for the source and destination length so that we can ensure
> that the function is confined within the intended buffers.
>
> Signed-off-by: Jeff Layton <jlayton at redhat.com>
> ---
> fs/cifs/cifs_unicode.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++++
> fs/cifs/cifs_unicode.h | 2 +
> 2 files changed, 123 insertions(+), 0 deletions(-)
>
> diff --git a/fs/cifs/cifs_unicode.c b/fs/cifs/cifs_unicode.c
> index 7d75272..aafaf0d 100644
> --- a/fs/cifs/cifs_unicode.c
> +++ b/fs/cifs/cifs_unicode.c
> @@ -26,6 +26,127 @@
> #include "cifs_debug.h"
>
> /*
> + * cifs_mapchar - convert a little-endian char to proper char in codepage
> + * @target - where converted character should be copied
> + * @src_char - 2 byte little-endian source character
> + * @cp - codepage to which character should be converted
> + * @mapchar - should character be mapped according to mapchars mount option?
> + *
> + * This function handles the conversion of a single character. It is the
> + * responsibility of the caller to ensure that the target buffer is large
> + * enough to hold the result of the conversion (at least NLS_MAX_CHARSET_SIZE).
> + */
> +static int
> +cifs_mapchar(char *target, const __le16 src_char, const struct nls_table *cp,
> + bool mapchar)
> +{
> + int len = 1;
> +
> + if (!mapchar)
> + goto cp_convert;
> +
> + /*
> + * BB: Cannot handle remapping UNI_SLASH until all the calls to
> + * build_path_from_dentry are modified, as they use slash as
> + * separator.
> + */
> + switch (le16_to_cpu(src_char)) {
> + case UNI_COLON:
> + *target = ':';
> + break;
> + case UNI_ASTERIK:
> + *target = '*';
> + break;
> + case UNI_QUESTION:
> + *target = '?';
> + break;
> + case UNI_PIPE:
> + *target = '|';
> + break;
> + case UNI_GRTRTHAN:
> + *target = '>';
> + break;
> + case UNI_LESSTHAN:
> + *target = '<';
> + break;
> + default:
> + goto cp_convert;
> + }
> +
> +out:
> + return len;
> +
> +cp_convert:
> + len = cp->uni2char(le16_to_cpu(src_char), target,
> + NLS_MAX_CHARSET_SIZE);
> + if (len <= 0) {
> + *target = '?';
> + len = 1;
> + }
> + goto out;
> +}
> +
> +/*
> + * cifs_utf16le_to_host - convert utf16le string to local charset
> + * @to - destination buffer
> + * @from - source buffer
> + * @tolen - destination buffer size (in bytes)
> + * @fromlen - source buffer size (in bytes)
> + * @codepage - codepage to which characters should be converted
> + * @mapchar - should characters be remapped according to the mapchars option?
> + *
> + * Convert a little-endian utf16le string (as sent by the server) to a string
> + * in the provided codepage. The tolen and fromlen parameters are to ensure
> + * that the code doesn't walk off of the end of the buffer (which is always
> + * a danger if the alignment of the source buffer is off). The destination
> + * string is always properly null terminated and fits in the destination
> + * buffer. Returns the length of the destination string in bytes (including
> + * null terminator).
> + */
> +int
> +cifs_utf16le_to_host(char *to, const __le16 *from, int tolen, int fromlen,
> + const struct nls_table *codepage, bool mapchar)
> +{
> + int i, charlen, safelen;
> + int outlen = 0;
> + int nullsize = null_charlen(codepage);
> + int fromwords = fromlen / 2;
I think the assumption here is that every character is two bytes. That is
correct for UCS-2 encoding, but in UTF-16 a character is encoded as either
two or four bytes (a surrogate pair).
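To illustrate the point, here is a minimal user-space sketch (not kernel code; the helper name utf16_count_codepoints is made up for this example and does not appear in the patch). It assumes the code units have already been converted to host byte order. It counts characters and treats a valid surrogate pair as one, so the result can be smaller than fromlen / 2:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: count the code points in a
 * buffer of UTF-16 code units that are already in host byte order.
 */
static size_t utf16_count_codepoints(const uint16_t *units, size_t nunits)
{
	size_t i = 0, count = 0;

	assert(units != NULL || nunits == 0);

	while (i < nunits) {
		uint16_t u = units[i];

		/* high surrogate followed by low surrogate: one code point */
		if (u >= 0xD800 && u <= 0xDBFF && i + 1 < nunits &&
		    units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF)
			i += 2;
		else
			i += 1;	/* BMP character or unpaired surrogate */
		count++;
	}
	return count;
}
```

With this, a four-unit buffer containing one surrogate pair holds three characters, not four, so fromwords counts 16-bit units rather than characters.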
> + char tmp[NLS_MAX_CHARSET_SIZE];
> +
> + /*
> + * because the chars can be of varying widths, we need to take care
> + * not to overflow the destination buffer when we get close to the
> + * end of it. Until we get to this offset, we don't need to check
> + * for overflow however.
> + */
> + safelen = tolen - (NLS_MAX_CHARSET_SIZE + nullsize);
Can safelen become negative? Say, for a byte stream consisting of just
two two-byte code values?
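A small user-space sketch of the arithmetic, to make the question concrete. The names sketch_safelen and sketch_needs_check are invented for this example, and the value 6 for the maximum charset size is my assumption of what NLS_MAX_CHARSET_SIZE is defined to in include/linux/nls.h:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value; I believe this mirrors NLS_MAX_CHARSET_SIZE. */
#define SKETCH_MAX_CHARSET_SIZE 6

/* Same arithmetic as the patch uses to compute safelen. */
static int sketch_safelen(int tolen, int nullsize)
{
	return tolen - (SKETCH_MAX_CHARSET_SIZE + nullsize);
}

/* The comparison the loop makes before taking the bounds-checked path. */
static bool sketch_needs_check(int outlen, int tolen, int nullsize)
{
	assert(outlen >= 0);
	return outlen >= sketch_safelen(tolen, nullsize);
}
```

Under those assumptions, tolen = 4 and nullsize = 1 give safelen = -3. Since outlen starts at 0, the outlen >= safelen test is true from the first iteration, so as far as I can tell a negative safelen only means every character goes through the checked path. That still relies on tolen being at least nullsize, since the terminator is written unconditionally.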
> +
> + for (i = 0; i < fromwords && from[i]; i++) {
> + /*
> + * check to see if converting this character might make the
> + * conversion bleed into the null terminator
> + */
> + if (outlen >= safelen) {
> + charlen = cifs_mapchar(tmp, from[i], codepage, mapchar);
If mapchar is not set, cifs_mapchar is always going to return 1 when
there is no error (since uni2char always returns 1).
> + if (charlen <= 0)
> + charlen = 1;
> + if ((outlen + charlen) > (tolen - nullsize))
> + break;
> + }
> +
> + /* put converted char into 'to' buffer */
> + charlen = cifs_mapchar(&to[outlen], from[i], codepage, mapchar);
> + outlen += charlen;
> + }
> +
> + /* properly null-terminate string */
> + for (i = 0; i < nullsize; i++)
> + to[outlen++] = 0;
> +
> + return outlen;
> +}
> +
> +/*
> * NAME: cifs_strfromUCS()
> *
> * FUNCTION: Convert little-endian unicode string to character string
> diff --git a/fs/cifs/cifs_unicode.h b/fs/cifs/cifs_unicode.h
> index 2dfae68..e23ef08 100644
> --- a/fs/cifs/cifs_unicode.h
> +++ b/fs/cifs/cifs_unicode.h
> @@ -72,6 +72,8 @@ extern struct UniCaseRange UniLowerRange[];
> #endif /* UNIUPR_NOLOWER */
>
> #ifdef __KERNEL__
> +int cifs_utf16le_to_host(char *to, const __le16 *from, int tolen, int fromlen,
> + const struct nls_table *codepage, bool mapchar);
> int cifs_strfromUCS_le(char *, const __le16 *, int, const struct nls_table *);
> int cifs_strtoUCS(__le16 *, const char *, int, const struct nls_table *);
> #endif
> --
> 1.6.0.6