[Samba] Conversion error: Illegal multibyte sequence

Jeremy Allison jra at samba.org
Thu Sep 5 14:35:28 MDT 2013


On Thu, Sep 05, 2013 at 10:15:16PM +0200, Laurent Blume wrote:
> Hello list,
> 
> I've noticed this problem for a few years now, I think. I see it pop up
> now and then in discussions, but they always end before a solution
> is given.
> 
> So let's try one more time :-)
> 
> I have plenty of UTF-8 named files and directories. It's UTF-8 all
> round, I don't use anything else, so I have no doubt the byte sequences
> are correct in the filesystem (I happen to have accented Latin chars,
> Chinese, Japanese, and some non-letters chars, so it'd show up really
> quick if there was an issue there :-)
> 
> I see this kind of error in the logs, here about a directory named
> "♻_Corbeille".
> Please note: those lines are a direct copy of the log. So yes, the "♻"
> character is indeed correct in the first two entries (including the
> first conversion error line), but improperly logged as an invalid byte
> sequence in the last two entries, and those two have different lengths.
> 
> [2013/09/05 20:43:50.280597,  3] smbd/dir.c:1046(smbd_dirptr_get_entry)
>   smbd_dirptr_get_entry mask=[*] found ./♻_Corbeille fname=♻_Corbeille
> (♻_Corbeille)
> [2013/09/05 20:43:50.280641,  3] lib/charcnv.c:161(convert_string_internal)
>   convert_string_internal: Conversion error: Illegal multibyte
> sequence(♻_Corbeille)
> [2013/09/05 20:43:50.280679,  3] lib/charcnv.c:140(convert_string_internal)
>   convert_string_internal: Conversion error: Incomplete multibyte
> sequence(��_Corbeille)
> [2013/09/05 20:43:50.280715,  3] lib/charcnv.c:140(convert_string_internal)
>   convert_string_internal: Conversion error: Incomplete multibyte
> sequence(�_Corbeille)
> 
> 
> It does not prevent using the directory, and it displays properly on
> Windows clients. So the issue is merely an annoying flood of logs.
> 
> The system is Solaris 10, running Samba 3.6.18 linked against GNU
> libiconv 1.14.
> 
> The charsets are defined like this in the configuration:
> 
>   dos charset          = cp850
>   unix charset         = UTF8
>   display charset      = UTF8
> 
> 
> So, any definitive fix for that?

This is the call to smb_iconv() returning an errno of EINVAL.

Firstly, add some debug statements inside smb_iconv_open_ex()
to find out if we're using the sys_iconv() function (that
calls the system iconv) or the internal UTF8 converters.

If it's the system iconv then you'll have to look inside
that source code.

If it's the internal converters add some debug statements
inside utf8_pull() and utf8_push() to see where the EINVAL
is being returned.

This will help track it down for your individual case.
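
If it turns out to be the system iconv, it may be quicker to reproduce
the failure outside of smbd first. Below is a minimal sketch, not Samba
code: try_convert() is a hypothetical helper that calls the standard
iconv_open()/iconv()/iconv_close() functions and returns the errno it
saw. By the usual iconv() convention, EILSEQ means an invalid input
sequence and EINVAL means an incomplete one at the end of the input.
The cast on the input pointer assumes the glibc/GNU libiconv prototype;
some systems declare that argument as const char **.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Samba.
 * Converts the NUL-terminated string `in` from charset `from` to
 * charset `to` using the system iconv().
 * Returns 0 on success, -1 if the converter could not be opened,
 * or the errno value set by iconv() (e.g. EILSEQ, EINVAL, E2BIG).
 */
static int try_convert(const char *from, const char *to, const char *in)
{
    char out[256];
    char *inp = (char *)in;          /* iconv() wants a non-const pointer */
    char *outp = out;
    size_t inleft = strlen(in);
    size_t outleft = sizeof(out);
    int err = 0;

    /* Note the argument order: destination charset first. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1) {
        fprintf(stderr, "iconv_open(%s, %s): %s\n", to, from, strerror(errno));
        return -1;
    }

    if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1) {
        err = errno;
        /* inp now points at the first byte that was not converted. */
        fprintf(stderr, "%s -> %s failed at byte offset %ld: %s\n",
                from, to, (long)(inp - in), strerror(err));
    }

    iconv_close(cd);
    return err;
}
```

Calling try_convert("UTF-8", "UTF-16LE", name) and
try_convert("UTF-8", "CP850", name) with the directory name from the
log should show which of the two conversions is unhappy. CP850 has no
"♻" character, so I would expect the second call to fail unless the
iconv build transliterates. When building against GNU libiconv rather
than the libc iconv you will probably need to link with -liconv.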

Jeremy.

