Filename character translation
savin.gorup at asist-traffic.com
Sat Dec 7 10:55:59 EST 2002
I came across a problem with rsync-2.5.5 on Cygwin/Win2K while rsyncing
files whose names contain 'strange' (non-Latin-1) characters.
The problem is that filenames on the Windows system are encoded (in our case)
in codepage 852, while the server (a Linux system) encodes filenames according
to ISO-8859-2. These two are not fully compatible, causing rsync to simply skip
copying some files (and whole directories!) to the server.
Samba solves this kind of problem with its 'client code page' and 'character
set' options. I propose a somewhat simpler solution: a translation table
between the local and remote file systems.
I have developed a patch to address the problem, which basically does this:
- adds command line option --filename-translation (options.c)
- builds a two-way character translation lookup table in memory (512 bytes)
- translates filenames at appropriate places (sender.c, flist.c)
if --filename-translation is present
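
The table described above could look roughly like the following. This is
only a minimal sketch, not code from the patch: the names xlat_table,
xlat_pair, xlat_init and xlat_name are hypothetical, and the sample pairs
cover just a few CP852 / ISO-8859-2 letters, so they should be checked
against the official code page tables before any real use.

```c
#include <stddef.h>

/* Hypothetical names throughout; nothing here is taken from the patch. */

/* Two 256-byte lookup arrays, one per direction: 512 bytes in total. */
struct xlat_table {
    unsigned char to_remote[256];   /* local byte  -> remote byte */
    unsigned char to_local[256];    /* remote byte -> local byte  */
};

struct xlat_pair {
    unsigned char local;            /* byte value on the local side  */
    unsigned char remote;           /* byte value on the remote side */
};

/* Illustrative subset only (assumed CP852 -> ISO-8859-2 values). */
static const struct xlat_pair sample_pairs[] = {
    { 0x9F, 0xE8 },   /* c with caron */
    { 0xE7, 0xB9 },   /* s with caron */
    { 0xA7, 0xBE },   /* z with caron */
};

/* Start from the identity mapping, then overlay each pair both ways. */
static void xlat_init(struct xlat_table *t,
                      const struct xlat_pair *pairs, size_t n)
{
    size_t k;
    int i;

    for (i = 0; i < 256; i++) {
        t->to_remote[i] = (unsigned char)i;
        t->to_local[i] = (unsigned char)i;
    }
    for (k = 0; k < n; k++) {
        t->to_remote[pairs[k].local] = pairs[k].remote;
        t->to_local[pairs[k].remote] = pairs[k].local;
    }
}

/* Translate a NUL-terminated filename in place, one byte at a time. */
static void xlat_name(const unsigned char map[256], char *name)
{
    for (; *name != '\0'; name++)
        *name = (char)map[(unsigned char)*name];
}
```

A sender would call xlat_name(t.to_remote, name) before a name goes out and
xlat_name(t.to_local, name) on names coming back. Note that a partial list
of pairs like the one above does not make the mapping a bijection; for
every name to survive a round trip, the table would have to be a full
permutation of the 256 byte values.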
Note that this patch can't handle multibyte encodings. The performance impact
of translation should be negligible, especially when the option is not active.
The patch changes
multiple files and is rather long so I'd like to open a discussion before
There has been some interest in that topic before here
(http://firstname.lastname@example.org/msg03306.html) and also
on some other, local mailing lists. Since the inability to copy all files
renders rsync unusable for non-Latin-1 users, I would like to hear some
comments about including the patch in the main source tree (or proposals for
a better solution, of course).