[Bug 2790] Add support for converting filenames into different encodings

samba-bugs at samba.org samba-bugs at samba.org
Sun Jun 12 06:40:22 GMT 2005


wayned at samba.org changed:

           What    |Removed                     |Added
           Severity|major                       |enhancement
             Status|NEW                         |ASSIGNED
            Summary|mkstemp fails for paths     |Add support for converting
                   |which include extended (8-  |filenames into different
                   |bit) characters             |encodings

------- Additional Comments From wayned at samba.org  2005-06-11 23:40 -------
Rsync does not change the filenames it transfers in any way, so if the OS
refuses to store a certain sequence of characters, that is currently out of
rsync's hands.  (The backslashes you see are just rsync's way of outputting
high-bit characters in a visible manner.)  It would be nice if rsync offered
some kind of filename transformation so that conversion to and from UTF-8 (or
whatever) would be possible.

OS X is known to reject certain multi-byte high-bit characters that aren't
compatible with its own high-bit character encoding.  Your current choices are
to (1) change the character encoding on the source FS to match the encoding of
the destination FS, making the names compatible; (2) not use high-bit character
sequences that conflict between OSes; (3) pre-process the files to convert
high-bit characters into sequences that won't fail; (4) use the fname-conv.diff
patch in the patches dir to enhance rsync with some basic name-conversion
support; (5) help to create a better filename-conversion solution.
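
To illustrate choice (3), here is a minimal sketch in Python of pre-processing
a tree so that its names are re-encoded before the transfer.  It is not part of
rsync or of the fname-conv.diff patch; the function names reencode_name and
rename_tree are made up for this example, and it assumes that every name under
the root really is in the source encoding you pass in.

```python
import os


def reencode_name(raw: bytes, src: str, dst: str) -> bytes:
    """Re-encode one filename component from encoding src to encoding dst."""
    return raw.decode(src).encode(dst)


def rename_tree(root: bytes, src: str, dst: str, dry_run: bool = True):
    """Rename entries under root whose re-encoded name differs from the old one.

    Returns the list of (old_path, new_path) pairs that were renamed, or that
    would be renamed when dry_run is true.
    """
    changes = []
    # A bytes root makes os.walk() yield raw byte names.  Bottom-up order
    # handles the contents of a directory before the directory itself.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            try:
                new_name = reencode_name(name, src, dst)
            except UnicodeError:
                # Not valid in src, or not representable in dst: leave it.
                continue
            if new_name == name:
                continue
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new_name)
            if os.path.lexists(new_path):
                continue  # never overwrite an existing entry
            changes.append((old_path, new_path))
            if not dry_run:
                os.rename(old_path, new_path)
    return changes
```

Calling rename_tree(b"/some/dir", "iso-8859-1", "utf-8") only reports what
would change; pass dry_run=False to rename for real.  Note that a single-byte
encoding such as ISO-8859-1 accepts any byte sequence, so running this over
names that are already UTF-8 would double-encode them.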

I didn't really like the solution in the fname-conv.diff because it typically
results in a huge number of forked command calls, one for each filename
processed.  It is a very versatile solution, but is probably overkill for what
rsync really needs: the optional(!) ability to use iconv() on the filenames it
sends (transferring names in UTF-8 and converting the names via library calls to
the local encoding needed).  If someone would like to work on a solution for
this, please let me know.
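
For anyone thinking about working on it, the idea can be sketched roughly as
follows.  This is only an illustration in Python, with Python's codec machinery
standing in for iconv(); it is not rsync code, and the names WIRE_ENCODING,
to_wire and from_wire are invented for the example.  Passing None as the local
encoding models the "optional" part: names then pass through untouched, which
is what rsync does today.

```python
from typing import Optional

WIRE_ENCODING = "utf-8"  # common encoding used for names in transit


def to_wire(name: bytes, local_enc: Optional[str]) -> bytes:
    """Sender side: convert a local filename to the wire encoding."""
    if local_enc is None:
        return name  # conversion disabled: send the bytes unchanged
    return name.decode(local_enc).encode(WIRE_ENCODING)


def from_wire(name: bytes, local_enc: Optional[str]) -> bytes:
    """Receiver side: convert a wire filename to the local encoding.

    Raises UnicodeError if the name cannot be represented locally, so the
    caller can decide whether to skip the file or report an error.
    """
    if local_enc is None:
        return name  # conversion disabled: store the bytes unchanged
    return name.decode(WIRE_ENCODING).encode(local_enc)
```

A sender whose filesystem uses ISO-8859-1 would turn b"caf\xe9.txt" into
b"caf\xc3\xa9.txt" with to_wire(), and a receiver using the same local encoding
would get the original bytes back from from_wire().  What to do with a name
that has no representation on the receiving side is left open here; a real
solution would need a policy for that case.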

