state of the rsync nation? (revisited 6/2003 from 11/2000)
jw at pegasys.ws
Wed Jun 11 11:27:20 EST 2003
On Tue, Jun 10, 2003 at 06:13:48PM +1000, Brad Hards wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> On Tue, 10 Jun 2003 14:21 pm, Martin Pool wrote:
> > I guess the reason why you're interested in doing it is so that you
> > can browse public rsync mirrors from Konqueror/whatever?
> Yep. Also, I was playing with the idea of rsync with Service Location Protocol
> to use as a replacement for the crappy practice of sharing data over floppy
> disks. The rough concept was that each machine had a shared directory, which
> you could conveniently label and advertise over SLP.
> > Speaking only for myself, I don't think this is worth spending time
> > on. It would be hard to write a wire-compatible library, and hard to
> > refactor rsync into such a library.
> I considered it, but not very long :)
> > Not only might a new tool be written more easily without baggage, it
> > might also (in a couple of years) persuade people running mirror sites
> > to switch. I know many of them are unhappy with rsync at the moment:
> > - large memory usage
> > - no really good ways to restrict client usage
> > - ...
> Go superlifter! For what it is worth, the things I identified during the
> abortive kioslave / SLPv2 share development:
> 1. More secure than FTP.
> 2. Easy to label shares/directories and provide fine grained access control,
> if desired.
> 3. Client side library that doesn't require hellish text parsing, or at least
> hides it from you.
> 4. Well delimited packets, so you can tell when one has been dropped.
hmm, i can hear the gears
meshing and turning between my ears :)
Current rsync just doesn't fit well as a lightweight network
service. The current protocol is too tightly coupled to
monolithic commands to support browsing and
interactivity in a reasonable manner.
What would really be useful for that would be quite
different than the current rsync. And i think, different
than the direction we have discussed regarding rsync
replacements. However, i think it may be worth exploring.
Let me put out a brief description and if you all want to
discuss it further we can start a new thread.
The service utility would be much smaller than the client
utility. The service utility could be started in an ssh
shell initiated by the client. The daemon would be a
completely different executable because it would add
configfiles, authentication and encryption. This means that
the client, service and daemon would be separate executables.
I'm only going to describe the core functionality of the
service util and daemon from the protocol perspective.
These are the commands the service would support defined in
server relative terms. Most of these would operate on a CWD
relative filename or perhaps on a numeric file_id (probably
also CWD dependent).
- set assorted options (numeric IDs etc.)
- send the stat(2) info for a single file or directory.
  This could optionally include a file checksum.
- list files within a specified directory, optionally
  with stat info for each. This is not recursive.
- send file checksum.
- send block checksums for file.
- delete a file or directory. This is recursive.
- create a new file with given name containing
  attached data and set the perms to those attached.
- set ownership and permissions of file in accordance
  with settings and process permissions.
- return the user/group IDs corresponding to the list
  of names, or user/group names corresponding to the
  list of IDs.
- send the contents from a file for each offset and length
  specified in this command.
- update a file. This would be accompanied by a
  sequence of offset+length of existing file contents
  and new file contents to be merged to form new file.
No doubt there are a few more commands that would be needed
as well as protocol and capability negotiations.
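To make the last two commands concrete, here is a minimal sketch of what the
service side might do for "send the contents for each offset and length" and
"update a file". Nothing here is a proposed wire format: the function names
and the ("copy", offset, length) / ("data", payload) tuples are hypothetical,
and files are modelled as in-memory bytes to keep the example short.

```python
def read_ranges(existing, ranges):
    """Return the bytes at each (offset, length) pair of the existing file."""
    return [existing[off:off + length] for off, length in ranges]


def apply_update(existing, ops):
    """Build the new file from copies of existing ranges plus new data.

    ops is a sequence of hypothetical tuples:
      ("copy", offset, length)  reuse bytes already held by the service
      ("data", payload)         new bytes supplied by the client
    """
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            if off < 0 or length < 0 or off + length > len(existing):
                raise ValueError("copy range outside existing file")
            out += existing[off:off + length]
        elif op[0] == "data":
            out += op[1]
        else:
            raise ValueError("unknown op: %r" % (op[0],))
    return bytes(out)


old = b"hello world"
new = apply_update(old, [("copy", 0, 6), ("data", b"rsync")])
```

The point of the sketch is that the service only slices and concatenates; it
never has to compute a delta itself.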
A major factor here is that the server would be extremely
lightweight. The load of generating the change list and
comparing block checksums with rolling to calculate the
update sequence would all be borne by the client. Further,
to the degree possible it would favour a client on the
downstream side of an asymmetric network connection.
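A rough sketch of that division of labour, for the case where the client
holds the newer file: the service only produces per-block checksums of its
copy, and the client does the rolling comparison and builds the update
sequence. The weak checksum follows the familiar rsync-style rolling sum;
SHA-256 is only a stand-in for whatever strong checksum a real protocol would
pick, and the function names, the op tuples and the rule of summing only
full-size blocks are assumptions made for the example.

```python
import hashlib

M = 1 << 16


def weak(block):
    """rsync-style weak checksum, returned as the pair (a, b)."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window one byte to the right."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def block_sums(data, n):
    """Service side: (index, weak, strong) for every full n-byte block."""
    sums = []
    for off in range(0, len(data) - n + 1, n):
        blk = data[off:off + n]
        sums.append((off // n, weak(blk), hashlib.sha256(blk).digest()))
    return sums


def make_delta(new, sums, n):
    """Client side: turn the service's block sums into copy/data ops."""
    table = {}
    for idx, w, strong in sums:
        table.setdefault(w, []).append((idx, strong))
    ops, literal, pos, w = [], bytearray(), 0, None
    while pos + n <= len(new):
        if w is None:
            w = weak(new[pos:pos + n])
        match = None
        if w in table:
            # Weak hit: confirm with the strong checksum before trusting it.
            digest = hashlib.sha256(new[pos:pos + n]).digest()
            for idx, strong in table[w]:
                if strong == digest:
                    match = idx
                    break
        if match is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match * n, n))
            pos += n
            w = None
        else:
            literal.append(new[pos])
            if pos + n < len(new):
                w = roll(w[0], w[1], new[pos], new[pos + n], n)
            pos += 1
    literal += new[pos:]
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


old = b"the quick brown fox jumps over the lazy dog"
new = old[:16] + b"XX" + old[16:]
ops = make_delta(new, block_sums(old, 8), 8)
```

Only block_sums() would run on the service; everything else, including the
byte-by-byte rolling, stays on the client.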
The protocol, while not stateless, would be a set of atomic
commands so if the client lost its server connection it
could reestablish the session cheaply and pick up where it
left off.
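One way to picture the cheap reconnect, as a sketch only: the client
remembers the small amount of session state (here just the options) and the
index of the first command that was never acknowledged, so after a dropped
connection it replays the options and carries on from that command. The
class, the connect callable, the send() method and the max_reconnects limit
are all hypothetical, and the sketch assumes a command that failed mid-flight
is safe to reissue.

```python
class ResumableSession:
    def __init__(self, connect, options, max_reconnects=5):
        self._connect = connect        # hypothetical: returns a fresh connection
        self._options = options        # session state replayed on reconnect
        self._max_reconnects = max_reconnects
        self._conn = None

    def _ensure_connected(self):
        if self._conn is None:
            self._conn = self._connect()
            self._conn.send(("set_options", self._options))

    def run(self, commands):
        results = []
        done = 0          # index of the first unacknowledged command
        reconnects = 0
        while done < len(commands):
            try:
                self._ensure_connected()
                results.append(self._conn.send(commands[done]))
                done += 1
            except ConnectionError:
                # Connection lost: drop it and reissue commands[done].
                self._conn = None
                reconnects += 1
                if reconnects > self._max_reconnects:
                    raise
        return results
```

Because each command is atomic, the only thing lost with the connection is
the one command in flight.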
For performance the communications would have to be
pipelined and run full-duplex to keep data flowing as fast
as possible. This only impacts the client: little issuing
of a command and waiting for results, mostly issuing commands
as fast as the client can generate them and processing the results
as they come in.
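A small sketch of that pipelining, using two in-process queues as a stand-in
for the two directions of the connection. The fake_service() function, the
request ids and the string results are all invented for the illustration; a
real client would be writing to and reading from a socket or ssh pipe.

```python
import queue
import threading


def fake_service(requests, responses):
    """Stand-in for the remote service: answers commands in arrival order."""
    while True:
        item = requests.get()
        if item is None:              # end of session
            responses.put(None)
            return
        req_id, command, arg = item
        responses.put((req_id, "%s:%s:ok" % (command, arg)))


def run_pipelined(commands):
    requests, responses = queue.Queue(), queue.Queue()
    results = {}

    def reader():
        # Process results as they come in, independently of the sender.
        while True:
            item = responses.get()
            if item is None:
                return
            req_id, payload = item
            results[req_id] = payload

    service = threading.Thread(target=fake_service, args=(requests, responses))
    receiver = threading.Thread(target=reader)
    service.start()
    receiver.start()

    # Issue every command without waiting for the previous reply.
    for req_id, (command, arg) in enumerate(commands):
        requests.put((req_id, command, arg))
    requests.put(None)

    service.join()
    receiver.join()
    return [results[i] for i in range(len(commands))]


replies = run_pipelined([("stat", "a"), ("list", "dir")])
```

Tagging each command with an id lets the client match replies to requests
without ever blocking on a single round trip.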
J.W. Schultz Pegasystems Technologies
email address: jw at pegasys.ws
Remember Cernan and Schmitt