using rsync with Mac OS X

Mark Valence kurash at
Mon Dec 17 14:26:04 EST 2001

As David Feldman wrote recently, rsync looks like it would be very 
useful for Mac OS X systems, where there is currently a dearth of 
options for backup.

I am looking into using rsync to back up and mirror a few systems, but 
there are two changes that I will need to make first, based on two 
file system features:

  - Mac OS X systems use HFS+, which supports files with one or two forks.

  - HFS+ also supports some metadata for all files and directories.

There are a few ways to add support for these FS features:

1) Convert (on the fly) all files to MacBinary before 
comparing/sending them to the destination.  MacBinary is a 
well-documented way to package an HFS file into a single data file.  
The benefit of this method is compatibility with existing rsync 
versions that are not MacBinary aware; the drawbacks are speed, 
maintainability, and that directory metadata is not addressed at all.

2) Treat the two forks and metadata as three separate files for the 
purposes of comparison/sending, and then reassemble them on the 
destination.  This has the same benefits and drawbacks as the 
MacBinary route.  It would also take more memory (potentially three 
times as many entries in the flist).

3) Change the protocol and implementation to handle arbitrary 
metadata and multiple forks.  This could be made sort-of compatible 
with existing rsync versions by using various tricks, but the most 
efficient way would be to alter the protocol.  The benefit is that 
this would make the protocol extensible.  Metadata can be "tagged" so 
that you could add any values needed, and ignore those tags that are 
not understood or supported.  Any number of forks could be supported, 
which is a step toward supporting NTFS, where a file can have any 
number of "data streams".  In fact, forks and metadata could all be 
done in the same way in the protocol.
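
To make the "tagged" idea in option 3 concrete, here is a minimal sketch 
in Python of one way it might look.  It is not rsync's wire format and 
not a proposal for exact byte layouts: the tag numbers, the 2-byte 
tag / 4-byte length header, and the function names are all assumptions 
made up for illustration.  The point is only that a length prefix lets 
a receiver skip tags it does not understand, and that forks and 
metadata can travel as the same kind of record.

```python
import struct

# Hypothetical tag numbers, invented for this sketch (not any rsync protocol).
TAG_DATA_FORK = 1
TAG_RESOURCE_FORK = 2
TAG_FINDER_INFO = 3

# Assumed header layout: 2-byte tag, 4-byte payload length, network byte order.
HEADER = struct.Struct("!HI")


def encode_records(records):
    """Pack an iterable of (tag, payload_bytes) pairs into one byte string."""
    out = bytearray()
    for tag, payload in records:
        out += HEADER.pack(tag, len(payload))
        out += payload
    return bytes(out)


def decode_records(blob, known_tags):
    """Return {tag: payload} for tags in known_tags; skip all other tags."""
    found = {}
    offset = 0
    while offset < len(blob):
        if offset + HEADER.size > len(blob):
            raise ValueError("truncated record header")
        tag, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        if offset + length > len(blob):
            raise ValueError("truncated record payload")
        if tag in known_tags:
            found[tag] = blob[offset:offset + length]
        # The length prefix is what allows an unknown tag to be skipped.
        offset += length
    return found


if __name__ == "__main__":
    blob = encode_records([
        (TAG_DATA_FORK, b"hello"),
        (TAG_RESOURCE_FORK, b"icons"),
        (99, b"future"),  # a tag this receiver has never heard of
    ])
    # A receiver that only understands the data fork ignores the rest.
    print(decode_records(blob, {TAG_DATA_FORK}))
```

A receiver that knows only TAG_DATA_FORK still gets a usable file out 
of this stream, which is the kind of graceful degradation I have in 
mind.  A real design would also need to stream large forks rather than 
hold each payload in memory.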

So, my question is, has anyone else done work in the areas of 
protocol enhancements and "rich" FS support?

I have lots of experience on the Mac and have the code needed to 
access HFS+ metadata and forks from the BSD layer.  I'm just looking 
for suggestions and news of anyone else working on stuff that might 
dovetail with this.

Also, I'm a bit concerned about the current behavior of reading the 
entire tree into memory, especially the effects that would have on 
large file sets.  Any work being done on this front?
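
To illustrate the alternative I am wondering about, here is a rough 
Python sketch of walking a tree one entry at a time instead of 
building the whole list first.  This is not how rsync works today and 
says nothing about what the protocol would need; iter_entries is a 
hypothetical name, and the sorted order is just an assumption about 
how two sides might keep in step.

```python
import os


def iter_entries(root):
    """Yield (relative_path, size) lazily, one file at a time."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Sort in place so subdirectories are visited in a fixed order.
        dirnames.sort()
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            # lstat: report on a symlink itself rather than its target.
            yield os.path.relpath(full, root), os.lstat(full).st_size


if __name__ == "__main__":
    for rel_path, size in iter_entries("."):
        print(rel_path, size)
```

Memory use here is bounded by the largest single directory rather 
than by the whole tree, since only one directory listing is held at a 
time.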



