HFS+ resource forks: WIP patch included

D Andrew Reynhout reynhout at quesera.com
Wed Mar 10 18:11:35 GMT 2004


As you all know, rsync doesn't have any special handling
for Mac OS X HFS+ resource forks.  Kevin Boyd made RsyncX
and rsync_hfs to address this gap, but they only work when
the destination filesystem is also HFS+.  I haven't been 
able to find any references to an rsync that is capable of
syncing from HFS+ to UFS (etc).  The only solutions I've
seen involve lots of preprocessing chicanery and additional
disk space consumption (basically, dupe the resource forks
or entire files first, then copy them over), which doesn't
suit my needs.

So I'm working on a patch to the mainline of rsync-2.6.0
that will recognize a resource fork on an HFS+ filesystem,
and copy the data to a separate file on the destination FS.
This will allow me to back up Mac desktops to larger hard
drives on Solaris and Linux file servers.

Many people seem to have a need for this, but I don't know
if it's a feature that belongs in the standard distribution.
Definitely not in the overly-specific form I've approached
the problem, but I could imagine a more general solution to
this and similar situations (NTFS streams, other FS metadata)
being a worthwhile project.

Anyway, I'm not sure how many people on this list run HFS+.
I'm guessing not many, or someone would have already written
a (better) version of this patch.  But I'm sending it in 
hopes of getting some feedback.  My C skills have atrophied
over the past ten years or so, and I only spent a few hours
looking over the flow of the rsync code, so I may have made
some very bad decisions.

That said, it does work, for me.  I've synced ~200GB of HFS+
filesystems, several times, from a few different machines,
and it sure seems like everything's OK.

If you have an interest and can test the code, please do so.
If you can offer rsync architectural advice, please do so.
If you can offer C syntax or style critique, please do so.

It looks like I'm going to be using this pretty heavily, and
at some point depending on it, so I have significant interest
in making it work as well as it can.

Thanks for your help,
Andrew
reynhout at quesera.com


OVERVIEW OF PATCH
=================

If the --hfs-mode switch is specified, check every regular file
on the local filesystem (no FStype checks) for the presence
of a resource fork.  If found, add it as an additional file
to the flist, but change the name it will be synced to on
the destination FS.
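
As a rough standalone sketch of that check (not the code in the
patch): the helper name rsrc_fork_size and the 4096-byte buffer are
made up for illustration.  It builds the probe path in its own
buffer and looks at the return value of stat() before trusting the
result.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Suffix used on HFS+ to reach a file's resource fork. */
#define RSRC_SUFFIX "/..namedfork/rsrc"

/*
 * Return the size of the resource fork of `path`, or 0 if there is
 * none or it cannot be examined.  Hypothetical helper, not part of
 * rsync or of the patch.
 */
off_t rsrc_fork_size(const char *path)
{
	char probe[4096];
	struct stat st;
	int n;

	/* Build the probe path in a separate buffer; `path` is untouched. */
	n = snprintf(probe, sizeof probe, "%s%s", path, RSRC_SUFFIX);
	if (n < 0 || (size_t)n >= sizeof probe)
		return 0;	/* name too long to probe */

	/* If `path` is a regular file on a filesystem without forks,
	 * this stat() is expected to fail, and we report "no fork". */
	if (stat(probe, &st) != 0)
		return 0;

	if (!S_ISREG(st.st_mode) || st.st_size <= 0)
		return 0;

	return st.st_size;
}
```

A caller would treat any result greater than zero as "this file has
a resource fork worth sending".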

On HFS+, a resource fork for <filename> is accessed via
<filename>/..namedfork/rsrc .  Obviously, we can't send the
same filename to the destination, because an inode can't be
a regular file and a directory simultaneously.

Map:  <filename>/..namedfork/rsrc
to:   <filename>.~~~namedfork.rsrc
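
The mapping can be sketched as a small string function.  This is
only an illustration: map_fork_name and the two macro names are
invented here, and the patch itself builds the destination name
from the file's dirname and basename instead.

```c
#include <stdio.h>
#include <string.h>

#define FORK_SRC_SUFFIX "/..namedfork/rsrc"
#define FORK_DST_SUFFIX ".~~~namedfork.rsrc"

/*
 * If `src` ends in "/..namedfork/rsrc", write the corresponding
 * destination name into `dst` and return 0.  Return -1 if `src` is
 * not a resource fork path or the result does not fit in `dst`.
 * Hypothetical helper, not part of rsync.
 */
int map_fork_name(const char *src, char *dst, size_t dstlen)
{
	size_t slen = strlen(src);
	size_t suf = strlen(FORK_SRC_SUFFIX);
	int n;

	if (slen <= suf || strcmp(src + slen - suf, FORK_SRC_SUFFIX) != 0)
		return -1;

	/* Copy everything before the source suffix, then append the
	 * destination suffix. */
	n = snprintf(dst, dstlen, "%.*s%s",
		     (int)(slen - suf), src, FORK_DST_SUFFIX);
	if (n < 0 || (size_t)n >= dstlen)
		return -1;
	return 0;
}
```

For example, "docs/Report/..namedfork/rsrc" becomes
"docs/Report.~~~namedfork.rsrc", and a plain "docs/Report" is
rejected.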

It appears to be critical that the local and destination
filenames sort into identical positions in the flist.  I
see the utility of sorting filenames -- to guarantee that
directories are treated before the files they contain, to
make removing duplicates easier -- but I'm less clear on
why (it appears that) *both* sides sort the flist.

Since the file-id is implicit (the offset in the flist),
the original and new filenames MUST sort into the same
position in the flist, or filesystem corruption will
result on the destination FS (files will have the wrong
filenames).  I think this could be fixed with an EXPLICIT
file-id, but of course rsync was designed with traffic
minimization in mind, so the protocol exchanges as little
data as possible.

Anyway, I picked .~~~namedfork.rsrc because it sorts into
the same position as /..namedfork/rsrc in almost all cases.
(Extended ASCII chars in filenames (and EBCDIC systems!)
complicate this assertion, but I haven't decided exactly
what to do about it yet.)
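
Here is a small sketch for checking that claim against a given
neighboring filename.  It assumes names are ordered by a plain
byte-wise comparison with bytes treated as unsigned, which is a
stand-in for whatever rsync really does; u_cmp and
fork_order_agrees are names invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Byte-wise string comparison, bytes treated as unsigned.
 * (Assumed ordering; a stand-in for rsync's own comparison.) */
static int u_cmp(const char *a, const char *b)
{
	const unsigned char *s1 = (const unsigned char *)a;
	const unsigned char *s2 = (const unsigned char *)b;

	while (*s1 && *s1 == *s2) {
		s1++;
		s2++;
	}
	return (int)*s1 - (int)*s2;
}

/*
 * Return 1 if `neighbor` sorts on the same side of the resource fork
 * of `file` under both the source name and the destination name,
 * 0 if the two names disagree.  Hypothetical helper.
 */
int fork_order_agrees(const char *file, const char *neighbor)
{
	char src[1024], dst[1024];
	int a, b;

	snprintf(src, sizeof src, "%s/..namedfork/rsrc", file);
	snprintf(dst, sizeof dst, "%s.~~~namedfork.rsrc", file);

	a = u_cmp(neighbor, src);
	b = u_cmp(neighbor, dst);
	return (a < 0) == (b < 0);
}
```

Under that assumption, neighbors of "foo" such as "foo.txt",
"foo-old" and "foo0" agree, while "foo." followed by a byte above
'~' (0x7E) does not.  That is the extended-ASCII case mentioned
above.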

So then the only trick is to get send_file_entry to use
the DESTINATION filename instead of the local filename
when sending the flist to the other side.  This is done
by replacing the call to f_name in send_file_entry with
f_name_dst, which duplicates f_name except for the above
change.  It's ugly, but it works for now.

NOTE: this is currently a ONE-WAY, SEND-ONLY operation.
Files will have to be manually reassembled if restoration
is required.  It would be easy to add automatic reassembly,
but it's not in here yet.  Also, the current code DOES NOT
require that the destination rsync also be patched.  It
works just fine with stock rsync-2.6.0.  This is a valuable
feature for me right now, so I'm resisting altering it.
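
For manual reassembly, something along these lines might do.  It is
an untested sketch that ASSUMES writing to <filename>/..namedfork/rsrc
on an HFS+ volume stores the bytes in the resource fork, and that
the data fork has already been restored.  The function names and
buffer sizes are invented for this example.

```c
#include <stdio.h>

/* Copy all bytes from file `src` to file `dst`.
 * Return 0 on success, -1 on any error. */
int copy_file_bytes(const char *src, const char *dst)
{
	FILE *in, *out;
	char buf[8192];
	size_t n;
	int rc = 0;

	in = fopen(src, "rb");
	if (!in)
		return -1;
	out = fopen(dst, "wb");
	if (!out) {
		fclose(in);
		return -1;
	}
	while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
		if (fwrite(buf, 1, n, out) != n) {
			rc = -1;
			break;
		}
	}
	if (ferror(in))
		rc = -1;
	fclose(in);
	if (fclose(out) != 0)
		rc = -1;
	return rc;
}

/*
 * Restore the resource fork of `file` from the split-out copy
 * `file`.~~~namedfork.rsrc sitting next to it.  Hypothetical helper;
 * only meaningful when `file` lives on HFS+ (assumption, see above).
 */
int restore_rsrc_fork(const char *file)
{
	char saved[4096], fork[4096];
	int n;

	n = snprintf(saved, sizeof saved, "%s.~~~namedfork.rsrc", file);
	if (n < 0 || (size_t)n >= sizeof saved)
		return -1;
	n = snprintf(fork, sizeof fork, "%s/..namedfork/rsrc", file);
	if (n < 0 || (size_t)n >= sizeof fork)
		return -1;

	return copy_file_bytes(saved, fork);
}
```

The split-out .~~~namedfork.rsrc file is left in place; removing it
after a successful restore is up to the caller.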

The current (WIP!) version of the patch is included below.
It's also available and will be kept updated at:
http://www.quesera.com/~reynhout/misc/rsync-hfs-mode-patch

The switch to turn the new stuff on is: --hfs-mode=darsplit
I wanted to include mode "appledot" to set the resource fork
filename to ._<filename> on the other side, but the sorting
issue described above got in the way...for now, at least.
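
To make the "appledot" problem concrete, here is a sketch that sorts
a list of names and reports where one of them lands.  It assumes
the same plain unsigned byte-wise ordering as a stand-in for rsync's
sort; cmp_names and sorted_index are invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: byte-wise, bytes treated as unsigned.
 * (Assumed ordering; a stand-in for rsync's own comparison.) */
static int cmp_names(const void *a, const void *b)
{
	const unsigned char *s1 = *(const unsigned char * const *)a;
	const unsigned char *s2 = *(const unsigned char * const *)b;

	while (*s1 && *s1 == *s2) {
		s1++;
		s2++;
	}
	return (int)*s1 - (int)*s2;
}

/*
 * Sort names[0..n-1] in place and return the index at which `target`
 * ends up, or -1 if it is not in the list.  Hypothetical helper.
 */
int sorted_index(const char **names, size_t n, const char *target)
{
	size_t i;

	qsort(names, n, sizeof *names, cmp_names);
	for (i = 0; i < n; i++)
		if (strcmp(names[i], target) == 0)
			return (int)i;
	return -1;
}
```

Take the four names "a", "foo", "zed" plus the fork of "foo".  Under
its source name and under the darsplit name the fork sorts to
index 2, right after "foo".  Under the appledot name "._foo" it
sorts to index 0, so the two sides would disagree about which entry
is which.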

Thanks again for any help you can offer.  Don't be gentle.


diff -u rsync-2.6.0/flist.c rsync-2.6.0-dar/flist.c
--- rsync-2.6.0/flist.c	Mon Dec 15 03:10:31 2003
+++ rsync-2.6.0-dar/flist.c	Wed Mar 10 11:29:32 2004
@@ -382,7 +382,7 @@
 
 	io_write_phase = "send_file_entry";
 
-	fname = f_name(file);
+	fname = f_name_dst(file);
 
 	flags = base_flags;
 
@@ -737,7 +737,7 @@
 		if (lastdir && strcmp(fname, lastdir) == 0) {
 			file->dirname = lastdir;
 		} else {
-			file->dirname = strdup(fname);
+			file->dirname = STRDUP(ap,fname);
 			lastdir = file->dirname;
 		}
 		file->basename = STRDUP(ap, p + 1);
@@ -804,6 +804,9 @@
 	struct file_struct *file;
 	extern int delete_excluded;
 
+	struct file_struct *file_hfsrf;
+	extern char *hfs_mode;
+
 	/* f is set to -1 when calculating deletion file list */
 	file = make_file(fname, &flist->string_area,
 			 f == -1 && delete_excluded? SERVER_EXCLUDES
@@ -831,8 +834,77 @@
 		local_exclude_list = last_exclude_list;
 		return;
 	}
+
+	if ( file->basename[0] && S_ISREG(file->mode) && hfs_mode ) {
+		/** check for rsrc fork, and add if found */
+		file_hfsrf = hfs_rsrc_fork(f,flist,file,fname);
+		if ( file_hfsrf ) {
+			flist->files[flist->count++] = file_hfsrf;
+			send_file_entry(file_hfsrf, f, base_flags);
+		}
+	}
 }
 
+struct file_struct *hfs_rsrc_fork(int f, struct file_list *flist,
+	struct file_struct *file, char *fname)
+{
+	extern char *hfs_mode;
+	extern int delete_excluded;
+	struct file_struct *file_rf;
+	char *fname_rf;
+	struct stat statbuf;
+	mode_t modes;
+	off_t size;
+	char *suffix=".~~~namedfork.rsrc";
+
+	/**	HFS+ resource fork handler code:
+
+		HFS+ exposes resource fork as <filename>/..namedfork/rsrc
+
+		If >0bytes, make new file_struct. Populate dirname_dst
+		with regular file's path, and basename_dst with arbitrary
+		destination filename: <filename>.~~~namedfork.rsrc
+
+		Detailed rationale for this weird decision presented at
+		http://www.quesera.com/~reynhout/misc/rsync-hfs-mode-patch
+
+		Then add the file_struct to flist, and sort out the
+		"which-name-to-use" problem in send_file_entry() by
+		calling f_name_dst() instead of f_name(), which is ugly.
+
+		Andrew Reynhout <reynhout at quesera.com>
+	**/
+
+	if ( hfs_mode && (!S_ISDIR(file->mode)) ) {
+		fname_rf=strcat(fname,"/..namedfork/rsrc");
+		if ( stat(fname_rf,&statbuf) != 0 ) return NULL;
+		modes=statbuf.st_mode;
+		size=statbuf.st_size;
+
+		if ( (S_ISREG(modes)) && (size > 0) ) {
+			file_rf = make_file(fname_rf, &flist->string_area,
+				f == -1 && delete_excluded? SERVER_EXCLUDES
+				: ALL_EXCLUDES);
+
+			file_rf->dirname_dst = file->dirname;
+
+			if ( strcmp(hfs_mode,"darsplit") == 0 ) {
+				file_rf->basename_dst = new_array(char,
+					MAXPATHLEN); 
+				if (!file_rf->basename_dst)
+					out_of_memory("hfs_rsrc_fork 2a");
+				sprintf(file_rf->basename_dst, "%s%s",
+					file->basename,suffix);
+			}
+
+			/* rprintf(FERROR,"darDEBUG: %s,%s to %s,%s\n",
+				file_rf->dirname,file_rf->basename,
+				file_rf->dirname_dst,file_rf->basename_dst); */
+			return file_rf;
+		}
+	}
+	return NULL;
+}
 
 
 static void send_directory(int f, struct file_list *flist, char *dir)
@@ -1409,3 +1481,35 @@
 
 	return p;
 }
+
+char *f_name_dst(struct file_struct *f)
+{
+	static char names[10][MAXPATHLEN];
+	static int n;
+	char *p = names[n];
+	char *dname, *bname;
+
+	if (!f || !f->basename)
+		return NULL;
+
+	dname=f->dirname;
+	bname=f->basename;
+	if ( f->basename_dst ) {
+		dname=f->dirname_dst;
+		bname=f->basename_dst;
+	}
+
+	n = (n + 1) % 10;
+
+	if (dname) {
+		int off;
+
+		off = strlcpy(p, dname, MAXPATHLEN);
+		off += strlcpy(p + off, "/", MAXPATHLEN - off);
+		off += strlcpy(p + off, bname, MAXPATHLEN - off);
+	} else {
+		strlcpy(p, bname, MAXPATHLEN);
+	}
+
+	return p;
+} 
diff -u rsync-2.6.0/loadparm.c rsync-2.6.0-dar/loadparm.c
--- rsync-2.6.0/loadparm.c	Sat Dec  6 16:07:27 2003
+++ rsync-2.6.0-dar/loadparm.c	Sat Mar  6 16:35:39 2004
@@ -135,6 +135,7 @@
 	char *include;
 	char *include_from;
 	char *log_format;
+	char *hfs_mode;
 	char *refuse_options;
 	char *dont_compress;
 	int timeout;
@@ -293,6 +294,7 @@
   {"transfer logging", P_BOOL,    P_LOCAL,  &sDefault.transfer_logging,NULL,0},
   {"ignore errors",    P_BOOL,    P_LOCAL,  &sDefault.ignore_errors,NULL,0},
   {"log format",       P_STRING,  P_LOCAL,  &sDefault.log_format,  NULL,   0},
+  {"HFS mode",         P_STRING,  P_LOCAL,  &sDefault.hfs_mode,    NULL,   0},
   {"refuse options",   P_STRING,  P_LOCAL,  &sDefault.refuse_options,NULL, 0},
   {"dont compress",    P_STRING,  P_LOCAL,  &sDefault.dont_compress,NULL,  0},
   {NULL,               P_BOOL,    P_NONE,   NULL,                  NULL,   0}
@@ -370,6 +372,7 @@
 FN_LOCAL_STRING(lp_include, include)
 FN_LOCAL_STRING(lp_include_from, include_from)
 FN_LOCAL_STRING(lp_log_format, log_format)
+FN_LOCAL_STRING(lp_hfs_mode, hfs_mode)
 FN_LOCAL_STRING(lp_refuse_options, refuse_options)
 FN_LOCAL_STRING(lp_dont_compress, dont_compress)
 FN_LOCAL_INTEGER(lp_timeout, timeout)
diff -u rsync-2.6.0/options.c rsync-2.6.0-dar/options.c
--- rsync-2.6.0/options.c	Tue Dec 30 13:16:25 2003
+++ rsync-2.6.0-dar/options.c	Mon Mar  8 16:51:57 2004
@@ -113,6 +113,7 @@
 char *config_file = NULL;
 char *shell_cmd = NULL;
 char *log_format = NULL;
+char *hfs_mode = NULL;
 char *password_file = NULL;
 char *rsync_path = RSYNC_PATH;
 char *backup_dir = NULL;
@@ -171,10 +172,11 @@
 	/* Note that this field may not have type ino_t.  It depends
 	 * on the complicated interaction between largefile feature
 	 * macros. */
-	rprintf(f, "              %sIPv6, %d-bit system inums, %d-bit internal inums\n",
+	rprintf(f, "              %sIPv6, %d-bit system inums, %d-bit internal inums,\n",
 		ipv6,
 		(int) (sizeof(dumstat->st_ino) * 8),
 		(int) (sizeof(INO64_T) * 8));
+	rprintf(f, "              HFS+ (Mac OS X) resource forks\n");
 #ifdef MAINTAINER_MODE
 	rprintf(f, "              panic action: \"%s\"\n",
 		get_panic_action());
@@ -277,6 +279,7 @@
   rprintf(F,"     --stats                 give some file transfer stats\n");
   rprintf(F,"     --progress              show progress during transfer\n");
   rprintf(F,"     --log-format=FORMAT     log file transfers using specified format\n");
+  rprintf(F,"     --hfs-mode=MODE         handle Mac OS X HFS+ resource forks\n");
   rprintf(F,"     --password-file=FILE    get password from FILE\n");
   rprintf(F,"     --bwlimit=KBPS          limit I/O bandwidth, KBytes per second\n");
   rprintf(F,"     --write-batch=PREFIX    write batch fileset starting with PREFIX\n");
@@ -295,7 +298,7 @@
 
 enum {OPT_VERSION = 1000, OPT_SENDER, OPT_EXCLUDE, OPT_EXCLUDE_FROM,
       OPT_DELETE_AFTER, OPT_DELETE_EXCLUDED, OPT_LINK_DEST,
-      OPT_INCLUDE, OPT_INCLUDE_FROM, OPT_MODIFY_WINDOW,
+      OPT_INCLUDE, OPT_INCLUDE_FROM, OPT_MODIFY_WINDOW, OPT_HFS_MODE,
       OPT_READ_BATCH, OPT_WRITE_BATCH};
 
 static struct poptOption long_options[] = {
@@ -366,6 +369,7 @@
   {"config",           0,  POPT_ARG_STRING, &config_file, 0, 0, 0 },
   {"port",             0,  POPT_ARG_INT,    &rsync_port, 0, 0, 0 },
   {"log-format",       0,  POPT_ARG_STRING, &log_format, 0, 0, 0 },
+  {"hfs-mode",         0,  POPT_ARG_STRING, &hfs_mode, OPT_HFS_MODE, 0, 0 },
   {"bwlimit",          0,  POPT_ARG_INT,    &bwlimit, 0, 0, 0 },
   {"address",          0,  POPT_ARG_STRING, &bind_address, 0, 0, 0 },
   {"backup-dir",       0,  POPT_ARG_STRING, &backup_dir, 0, 0, 0 },
@@ -516,6 +520,16 @@
 		case OPT_INCLUDE_FROM:
 			add_exclude_file(&exclude_list, poptGetOptArg(pc),
 					 MISSING_FATAL, ADD_INCLUDE);
+			break;
+
+		case OPT_HFS_MODE:
+			if ( (strcmp(hfs_mode,"none") != 0) &&
+				(strcmp(hfs_mode,"darsplit") != 0) ) {
+				snprintf(err_buf, sizeof err_buf,
+				"unsupported hfs-mode: \"%s\"\n",hfs_mode);
+				rprintf(FERROR, "ERROR: %s", err_buf);
+				exit_cleanup(RERR_UNSUPPORTED);
+			}
 			break;
 
 		case 'h':
diff -u rsync-2.6.0/proto.h rsync-2.6.0-dar/proto.h
--- rsync-2.6.0/proto.h	Sat Dec  6 16:07:27 2003
+++ rsync-2.6.0-dar/proto.h	Mon Mar  8 13:37:07 2004
@@ -77,8 +77,28 @@
 int link_stat(const char *path, STRUCT_STAT * buffer);
 struct file_struct *make_file(char *fname, struct string_area **ap,
 			      int exclude_level);
+void send_file_name_ifhasrsrcfork(int f, struct file_list *flist, char *fname,
+		    int recursive, unsigned base_flags);
+void send_file_name(int f, struct file_list *flist, char *fname,
+		    int recursive, unsigned base_flags);
+struct file_list *send_file_list(int f, int argc, char *argv[]);
+struct file_list *recv_file_list(int f);
+int file_compare(struct file_struct **f1, struct file_struct **f2);
+int flist_find(struct file_list *flist, struct file_struct *f);
+void free_file(struct file_struct *file);
+struct file_list *flist_new(void);
+void flist_free(struct file_list *flist);
+char *f_name(struct file_struct *f);
+char *f_name_dst(struct file_struct *f);
+void show_flist_stats(void);
+int readlink_stat(const char *path, STRUCT_STAT * buffer, char *linkbuf);
+int link_stat(const char *path, STRUCT_STAT * buffer);
+struct file_struct *make_file(char *fname, struct string_area **ap,
+			      int exclude_level);
 void send_file_name(int f, struct file_list *flist, char *fname,
 		    int recursive, unsigned base_flags);
+struct file_struct *hfs_rsrc_fork(int f, struct file_list *flist,
+	struct file_struct *file, char *fname);
 struct file_list *send_file_list(int f, int argc, char *argv[]);
 struct file_list *recv_file_list(int f);
 int file_compare(struct file_struct **f1, struct file_struct **f2);
@@ -87,6 +107,7 @@
 struct file_list *flist_new(void);
 void flist_free(struct file_list *flist);
 char *f_name(struct file_struct *f);
+char *f_name_dst(struct file_struct *f);
 void write_sum_head(int f, struct sum_struct *sum);
 void recv_generator(char *fname, struct file_list *flist, int i, int f_out);
 void generate_files(int f,struct file_list *flist,char *local_name,int f_recv);
@@ -142,6 +163,7 @@
 char *lp_include(int );
 char *lp_include_from(int );
 char *lp_log_format(int );
+char *lp_hfs_mode(int );
 char *lp_refuse_options(int );
 char *lp_dont_compress(int );
 int lp_timeout(int );
diff -u rsync-2.6.0/rsync.1 rsync-2.6.0-dar/rsync.1
--- rsync-2.6.0/rsync.1	Thu Jan  1 14:00:11 2004
+++ rsync-2.6.0-dar/rsync.1	Mon Mar  8 16:34:23 2004
@@ -385,6 +385,7 @@
      --bwlimit=KBPS          limit I/O bandwidth, KBytes per second
      --write-batch=PREFIX    write batch fileset starting with PREFIX
      --read-batch=PREFIX     read batch fileset starting with PREFIX
+     --hfs-mode=MODE         handle Mac OS X HFS+ resource forks in MODE
  -h, --help                  show this help screen
 
 
@@ -985,6 +986,34 @@
 using the fileset whose filenames start with PREFIX\&. See the "BATCH
 MODE" section for details\&.
 .IP 
+.IP "\fB--hfs-mode=MODE\fP" 
+Handle Mac OS X HFS+ files with resource forks according to MODE\&.
+This is currently a \fBONE-WAY\fP, \fBSEND-ONLY\fP operation, and
+is only useful when backing up a Mac OS X machine to a non-HFS+
+filesystem (e\&.g\&. a Linux fileserver)\&.  The rsync process must
+be initiated from the Mac\&.  It is \fINOT\fP necessary for the
+remote rsync to also have this patch in place\&.
+.IP
+MODE "darsplit" will save the resource fork of <filename> to
+.nf
+<filename>\&.~~~namedfork\&.rsrc on the destination filesystem\&.
+.fi
+.IP
+No other MODEs are valid at this time\&.  See web page for explanation\&.
+.RS
+http://www\&.quesera\&.com/~reynhout/misc/rsync-hfs-mode-patch
+.RE
+.IP
+I use this command line to back up my home directory to fileserver:
+.IP
+.nf
+rsync --archive --delete --verbose --hfs-mode=darsplit \\
+                /Users/reynhout fileserver:/backups/mac
+.fi
+.IP 
+(hfs-mode is a patch to rsync, not part of the standard distribution.
+For more info and the latest version, see the web page listed above.)
+.IP
 .PP 
 .SH "EXCLUDE PATTERNS" 
 .PP 
diff -u rsync-2.6.0/rsync.h rsync-2.6.0-dar/rsync.h
--- rsync-2.6.0/rsync.h	Tue Dec 16 18:04:59 2003
+++ rsync-2.6.0-dar/rsync.h	Sun Mar  7 19:04:50 2004
@@ -381,6 +381,11 @@
 	char *basedir;
 	char *link;
 	char *sum;
+
+	/** these are used by the hfs-mode switch to copy HFS+
+	*** resource forks to a different filename on the dest */
+	char *dirname_dst;
+	char *basename_dst;
 };
 
 
diff -u rsync-2.6.0/rsync.yo rsync-2.6.0-dar/rsync.yo
--- rsync-2.6.0/rsync.yo	Thu Jan  1 14:00:11 2004
+++ rsync-2.6.0-dar/rsync.yo	Mon Mar  8 16:34:25 2004
@@ -348,6 +348,7 @@
      --bwlimit=KBPS          limit I/O bandwidth, KBytes per second
      --write-batch=PREFIX    write batch fileset starting with PREFIX
      --read-batch=PREFIX     read batch fileset starting with PREFIX
+     --hfs-mode=MODE         handle Mac OS X HFS+ resource forks in MODE
  -h, --help                  show this help screen
 
 
@@ -854,6 +855,26 @@
 dit(bf(--read-batch=PREFIX)) Apply a previously generated change batch,
 using the fileset whose filenames start with PREFIX. See the "BATCH
 MODE" section for details.
+
+dit(bf(--hfs-mode=MODE)) Handle Mac OS X HFS+ files with resource
+forks according to MODE.  This is currently a ONE-WAY, SEND-ONLY
+operation, and is only useful when backing up a Mac OS X machine
+to a non-HFS+ filesystem (e.g. a Linux fileserver).  It is NOT
+necessary for the remote rsync to also have this patch in place.
+
+MODE "darsplit" will save the resource fork of <filename> to
+<filename>.~~~namedfork.rsrc on the destination filesystem.
+
+No other MODEs are valid at this time.  See web page for explanation.
+http://www.quesera.com/~reynhout/misc/rsync-hfs-mode-patch
+
+I use this command line to back up my home directory to fileserver:
+
+    rsync --archive --delete --verbose --hfs-mode=darsplit \
+        /Users/reynhout fileserver:/backups/mac
+
+(hfs-mode is a patch to rsync, not part of the standard distribution.
+For more info and the latest version, see the web page listed above.)
 
 enddit()
 

