using rsync with Mac OS X
kurash at sassafras.com
Mon Dec 17 14:26:04 EST 2001
As David Feldman wrote recently, rsync looks like it would be very
useful for Mac OS X systems, where there is currently a dearth of
options for backup.
I am looking into using rsync to back up/mirror a few systems, but
I will need to make two changes first, driven by two file system
features:
- Mac OS X systems use HFS+, which supports files with one or two forks.
- HFS+ also supports some "metadata" for all files and directories.
There are a few ways to add support for these FS features:
1) Convert (on the fly) all files to MacBinary before
comparing/sending them to the destination. MacBinary is a well
documented way to package an HFS file into a single data file. The
benefit of this method is compatibility with existing rsync
versions that are not MacBinary aware, while the drawbacks are speed,
maintainability, and that directory metadata is not addressed at all.
2) Treat the two forks and metadata as three separate files for the
purposes of comparison/sending, and then reassemble them on the
destination. This has the same drawbacks and benefits as the
MacBinary route. It would also take more memory (potentially three
times as many entries in the flist).
3) Change the protocol and implementation to handle arbitrary
metadata and multiple forks. This could be made sort-of compatible
with existing rsync versions by using various tricks, but the most efficient
way would be to alter the protocol. Benefits are that this would
make the protocol extensible. Metadata can be "tagged" so that you
could add any values needed, and ignore those tags that are not
understood or supported. Any number of forks could be supported,
which gives a step up in supporting NTFS where a file can have any
number of "data streams". In fact, forks and metadata could all be
done in the same way in the protocol.
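To make the "tagged" idea in option 3 a bit more concrete, here is a rough sketch of one possible tag/length/value encoding. It is only an illustration and not a proposed wire format: the tag numbers, the 2-byte tag plus 4-byte length header, and the names encode_items and decode_items are all made up for this example and do not exist in rsync. The point it shows is that a receiver can skip any tag it does not understand, and that forks and metadata travel the same way.

```python
import struct

# Hypothetical tag numbers, invented for this sketch; not part of any rsync protocol.
TAG_FINDER_INFO = 1
TAG_RESOURCE_FORK = 2
TAG_DATA_STREAM = 3

# Assumed header layout: 2-byte tag, 4-byte payload length, network byte order.
_HEADER = struct.Struct("!HI")


def encode_items(items):
    """Pack (tag, payload) pairs into one byte string."""
    out = bytearray()
    for tag, payload in items:
        out += _HEADER.pack(tag, len(payload))
        out += payload
    return bytes(out)


def decode_items(blob, known_tags):
    """Return the (tag, payload) pairs whose tag is in known_tags.

    Items with an unknown tag are skipped using their length field,
    so an older receiver can ignore what it does not support.
    """
    found = []
    pos = 0
    while pos < len(blob):
        if pos + _HEADER.size > len(blob):
            raise ValueError("truncated header")
        tag, length = _HEADER.unpack_from(blob, pos)
        pos += _HEADER.size
        if pos + length > len(blob):
            raise ValueError("truncated payload")
        if tag in known_tags:
            found.append((tag, blob[pos:pos + length]))
        pos += length
    return found
```

For example, a sender could encode Finder info, a resource fork, and some future tag 99 together; a receiver that only passes {TAG_FINDER_INFO, TAG_RESOURCE_FORK} as known_tags gets the first two back and silently steps over the third.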
So, my question is, has anyone else done work in the areas of
protocol enhancements and "rich" FS support?
I have lots of experience on the Mac and have the code needed to
access HFS+ metadata and forks from the BSD layer. I'm just looking
for suggestions and news of anyone else working on stuff that might
dovetail with this.
Also, I'm a bit concerned about the current behavior of reading the
entire tree into memory, especially the effects that would have on
large file sets. Any work being done on this front?
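To illustrate the memory concern, here is a small sketch of the alternative I have in mind: walking the tree incrementally and handing out one entry at a time, rather than building the whole list first. This is not how rsync's flist works and is not a patch; iter_entries is a hypothetical name, and the sketch assumes a plain local walk with no sorting across directories.

```python
import os


def iter_entries(root):
    """Yield (relative_path, size) one entry at a time.

    Only the entries of the directory currently being read and the
    stack of directories still to visit are held in memory, not the
    whole tree. Directories are reported with size 0.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as it:
            entries = sorted(it, key=lambda e: e.name)
        for entry in entries:
            rel = os.path.relpath(entry.path, root)
            if entry.is_dir(follow_symlinks=False):
                stack.append(entry.path)
                yield rel, 0
            else:
                yield rel, entry.stat(follow_symlinks=False).st_size
```

A caller would loop over iter_entries(some_root) and compare or send each entry as it arrives. The trade-off is that anything needing the complete list up front (a global sort, for instance) would have to be reworked.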