superlifter design notes (was Re: ...
John E. Malmberg
wb8tyw at qsl.net
Sat Jul 27 20:37:01 EST 2002
Martin Pool wrote:
> On 27 Jul 2002, "John E. Malmberg" <wb8tyw at qsl.net> wrote:
>
>>A program serving source files for distribution does not need to be that
>>concerned with preserving exact file attributes, but may need to track
>>suggested file attributes for the various client platforms.
>>
>>A program that is replicating for backup purposes must not have any
>>loss of data, including any operating-system-specific file attributes.
>>
>>That is why I posted previously that they should be designed as two
>>separate but related programs.
>
> I'm not sure that the application space for rsync really divides
> neatly into two parts like that. Can you expand a bit more on how
> you think they would be used?
Well remember, I am on the outside looking in, and of course I could be
missing things. :-)
I did post this previously, but the message apparently got buried in the
large number of messages posted that day.
The two uses for rsync that I am seeing discussed on this list are:
Backup: A low-overhead and possibly long-distance backup of disks or directories.
In the case of a backup, usually it is the same platform, or one that is
very close to being the same. Also it is important that security
information, and file attributes all be properly maintained.
The mapping of security information is platform specific, so this is
going to be an ongoing problem. It is also critical that timestamps be
maintained.
Since this is usually the same or closely similar platforms, a VFS layer
can be used to store and retrieve attributes. No special attribute
files or host based translations should be needed.
The downsides are that, as far as I can see, there are no portable
standard APIs to retrieve the security information, and as more variants
are discovered, it may be hard to work them in while keeping backward
compatibility.
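To make the store-and-retrieve idea concrete, here is a minimal Python sketch
of the part that is portable: permission bits and timestamps. The function
names capture_attrs and restore_attrs are hypothetical and are not taken from
rsync. Ownership, ACLs and other platform-specific security information are
deliberately left out, since that is exactly the part with no portable API.

```python
import os

def capture_attrs(path):
    """Record the portable subset of attributes: permission bits and timestamps."""
    st = os.stat(path)
    return {
        "mode": st.st_mode & 0o7777,   # permission bits only, no file-type bits
        "atime_ns": st.st_atime_ns,    # nanosecond timestamps avoid float rounding
        "mtime_ns": st.st_mtime_ns,
    }

def restore_attrs(path, attrs):
    """Reapply attributes previously recorded by capture_attrs()."""
    os.chmod(path, attrs["mode"])
    os.utime(path, ns=(attrs["atime_ns"], attrs["mtime_ns"]))
```

Anything beyond this subset would need a per-platform backend behind the same
two calls, which is where the backward compatibility concern comes in.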
Because you are distributing an arbitrary set of directories, it is
usually not permitted to add files to assist in the transfer.
This also seems to be an addition to rsync's original mission.
Also using something like rsync for backup of binary files has the
potential for undetected corruption. While the checksumming algorithm
is good, it is not guaranteed to be perfect. And no, I do not want to
recycle the old arguments about this.
With a text file, the set of possible values is restricted enough that
it is unlikely that the checksum method would fail, and if it did, the
resulting corruption is more easily detected.
File Distribution: A low overhead method of keeping local source
directory trees synchronized with remote distributions.
In this case, strict binary preservation of time stamps is not needed
and maintaining security attributes is usually not desired. So that is
two problems eliminated.
What rsync does not do now is differentiate between text files and
binary files. A client that uses a different internal format for text
files than binary files needs to do extra work.
And unless the server tells it what type of file is coming, it must
guess based on the filename.
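As an illustration of what guessing from the filename amounts to on the
client, here is a small Python sketch. The extension and basename tables are
assumptions made up for this example and the function name guess_is_text is
hypothetical. Nothing here reflects what rsync actually does.

```python
import posixpath

# Hypothetical tables: a real client would need a much longer,
# platform-tuned list, and would still get some files wrong.
TEXT_EXTENSIONS = {".c", ".h", ".txt", ".html", ".py", ".sh"}
TEXT_BASENAMES = {"makefile", "readme"}

def guess_is_text(filename):
    """Guess whether a file is text, using only its name."""
    base = posixpath.basename(filename).lower()
    _, ext = posixpath.splitext(base)
    if ext:
        return ext in TEXT_EXTENSIONS
    # No extension: fall back to a list of well-known names.
    return base in TEXT_BASENAMES
```

The weakness is easy to see: any name outside the tables is treated as binary,
whether or not that is correct.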
But you are specifically distributing a special tree of files in this
case, not an arbitrary directory. That gives you the ability to add
special attribute files to assist in the transfer.
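A sketch of what such an attribute file could look like, and how a client
might read it, follows. The format (one "text" or "binary" keyword followed by
a path per line, with "#" comments) is purely an assumption for illustration,
as is the function name parse_attribute_file. It is not an existing rsync
feature.

```python
def parse_attribute_file(text):
    """Parse a hypothetical attribute file into a {path: kind} mapping.

    Each non-comment line is '<kind> <path>', where kind is 'text' or 'binary'.
    """
    hints = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # split once, so paths may contain spaces
        if len(parts) != 2 or parts[0] not in ("text", "binary"):
            raise ValueError(f"line {lineno}: expected 'text|binary <path>'")
        kind, path = parts
        hints[path] = kind
    return hints
```

With a file like this shipped in the tree, the client would look up each
incoming path in the mapping before deciding how to store it.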
So while the two uses have a lot in common, there are significant
differences, and having one program attempt to do both can lead to
greater complexity.
-John
wb8tyw at qsl.net
Personal Opinion Only