DO NOT REPLY [Bug 2790] Add support for converting filenames into different encodings

samba-bugs at samba.org samba-bugs at samba.org
Tue Oct 30 16:09:14 GMT 2007


https://bugzilla.samba.org/show_bug.cgi?id=2790


cabo at tzi.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cabo at tzi.org




------- Comment #9 from cabo at tzi.org  2007-10-30 11:09 CST -------
The current solution appears to be somewhat confused about what it is trying to
solve.

There are three filename encodings: the one in the client fs, the transfer
encoding, and the one in the server fs.
The client needs to know client-fs and transfer; the server needs to know
server-fs and transfer.
Trying to mush together any two of the three leads to pain.
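
As a rough illustration of keeping the three encodings apart, here is a
minimal Python sketch. The function names and the latin-1 example are
assumptions made for illustration only; this is not rsync code.

```python
import unicodedata

# Assumption: the transfer encoding is UTF-8 in NFC.
TRANSFER_ENCODING = "utf-8"


def to_transfer(name: bytes, local_encoding: str) -> bytes:
    """Convert local filesystem bytes to transfer bytes (UTF-8, NFC)."""
    text = name.decode(local_encoding)
    return unicodedata.normalize("NFC", text).encode(TRANSFER_ENCODING)


def from_transfer(wire: bytes, local_encoding: str) -> bytes:
    """Convert transfer bytes to local filesystem bytes."""
    return wire.decode(TRANSFER_ENCODING).encode(local_encoding)


# The client only handles (client-fs, transfer);
# the server only handles (transfer, server-fs).
client_name = "caf\u00e9.txt".encode("latin-1")   # hypothetical latin-1 client
wire = to_transfer(client_name, "latin-1")        # done on the client
server_name = from_transfer(wire, "utf-8")        # done on a UTF-8 server
```

Neither side ever needs to know the other side's filesystem encoding; each
one converts only between its own encoding and the transfer encoding.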

There are also three scenarios:

-- sane: common transfer encoding (UTF-8 in NFC).  Server and client each need
to know their local conventions; as with the current --iconv=., they can
probably figure that out.

-- compatible: The server may not know about iconv.  So the client has to do
all the conversions.  This is almost supported now, except that the client
sends an iconv option to the server, which the server does not understand.

-- fast: if both sides have the same encoding, the whole thing should be
skipped.  This is also compatible (it is the way it works right now).

Because of compatibility, "sane" probably needs an option to switch it on. It
may also need client-side and server-side overrides to help these two out if
they can't guess or guess wrong.
Compatible also needs an option to switch it on, and parameters to control the
conversion.  It is by definition client-side only; the client needs to be told
what the server needs (and also may need help in guessing its own encoding). 
(For symmetry, it is also conceivable to add a server-side compatible option as
part of the ssh-options.)
Fast is the current (2.x) default and probably should stay the default for
compatibility.
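
The three scenarios can be summarized by which side performs which
conversion. The sketch below is a hypothetical Python model for the
client-to-server direction; the names `Plan` and `plan` are invented here and
do not correspond to anything in rsync.

```python
from typing import NamedTuple, Optional, Tuple

# (from_encoding, to_encoding), or None when that side skips conversion.
Conversion = Optional[Tuple[str, str]]


class Plan(NamedTuple):
    client: Conversion
    server: Conversion


def plan(mode: str, client_fs: str, server_fs: str,
         transfer: str = "utf-8") -> Plan:
    """Return who converts what when the client sends names to the server."""
    if mode == "fast":
        # Same encoding on both sides: skip conversion entirely.
        return Plan(None, None)
    if mode == "compatible":
        # The server knows nothing; the client converts straight to server-fs.
        return Plan((client_fs, server_fs), None)
    if mode == "sane":
        # Each side converts only between its own fs and the transfer encoding.
        return Plan((client_fs, transfer), (transfer, server_fs))
    raise ValueError(f"unknown mode: {mode}")
```

For example, `plan("sane", "latin-1", "koi8-r")` yields a client conversion
from latin-1 to utf-8 and a server conversion from utf-8 to koi8-r, while
`plan("compatible", "latin-1", "koi8-r")` leaves the server side as `None`.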

So I propose (names are descriptive, but not optimal yet):

--encoding-aware: Switches on sane.
--client-encoding: supplies (overrides) value for client-side encoding for
sane.
--server-encoding: supplies (overrides) value for server-side encoding for
sane.
--transfer-encoding: overrides the transfer-encoding (default: UTF-8 NFC).
--server-encoding-unaware: don't tell the server anything, but do everything on
client-side.
--client-encoding-unaware: inverse (if you want to do that).
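
To make the proposal concrete, here is a small Python argparse mock-up of the
option set. These options are only the ones proposed in this comment; they are
not existing rsync flags, and the `mode_of` mapping from flags to scenario is
an assumption about how they would combine.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # allow_abbrev=False so --client-encoding is never confused with
    # --client-encoding-unaware through prefix matching.
    p = argparse.ArgumentParser(prog="rsync-mockup", allow_abbrev=False)
    p.add_argument("--encoding-aware", action="store_true")
    p.add_argument("--client-encoding", default=None)
    p.add_argument("--server-encoding", default=None)
    p.add_argument("--transfer-encoding", default="UTF-8")  # NFC assumed
    p.add_argument("--server-encoding-unaware", action="store_true")
    p.add_argument("--client-encoding-unaware", action="store_true")
    return p


def mode_of(args: argparse.Namespace) -> str:
    """Assumed mapping from the proposed flags to the three scenarios."""
    if args.server_encoding_unaware or args.client_encoding_unaware:
        return "compatible"
    if args.encoding_aware:
        return "sane"
    return "fast"  # the 2.x default stays the default
```

With no options the mock-up reports "fast"; `--encoding-aware` alone selects
"sane", and either of the `-unaware` options selects "compatible".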

Maybe combining --encoding-aware and --server-encoding-unaware into one
--client-encoding-aware is better.
Maybe combining --encoding-aware and --client-encoding-unaware into one
--server-encoding-aware is better.
In both cases, this is somewhat confusing, because you want to keep the sane
transfer encoding unless you are in the compatible case.

The only switch that needs a single-character form is --encoding-aware, which
should become part of finger memory, like -a, for most rsync users.

