[PATCH] --omit-dir-changes, qsort<>mergesort issues

Antti Tapaninen aet at cc.hut.fi
Fri Jun 2 17:39:30 GMT 2006


Hi all,

I recently ran into some problems with rsync. My plan is to renew some of 
our old administration concepts from the early 90's; I already replaced 
rdist with rsync a few years ago.

Because of the rdist legacy, the current method requires synchronizing 
files into 6 different locations, {/alt,/usr/alt}/{hostdep,sysdep,hutdep}, 
which in turn are prioritized by a tool that just symlinks the files to 
actual paths at /etc, /usr/bin, etc. Existing vendor files are backed up 
to /alt/vendor and /usr/alt/vendor and restored when replacements are no 
longer available.

A hierarchy like this is problematic if you want to make anything faster 
or provide a pull service for hosts that are not available 24 hours a day, 
so I'm replacing the paths with a single destination directory /alt/root. [1]

Thanks to reducing the number of ssh connections from 6 to 1, and no 
longer having to precalculate /alt/sysdep contents into /tmp because of 
[1], the synchronizing operation in push/pull mode takes 1-4 seconds. That 
is a lot faster than before, and no extra I/O is created.

In the current and upcoming use scenario, any host can have a directory 
/alt/local where you can add temporary modifications before adding them to 
the central file server.

After /alt/root is synchronized by a push/pull method, another tool reads 
the previous index of files, finds the differences between the old and new 
state of /alt/{root,local} and then synchronizes them to the host's real 
root directory. Unlike before, actual copies of files are used.

The sync process includes 4 stages, 3 for rsync itself:

- Sync /alt/root with backups to /alt/backup/YYYYMMDD.HHMM directory,
   ignore some files (checksum hasn't changed, etc.).
- Full sync without ignores
- unlink() files that were deleted between old and
   new /alt/root state.
- Restore possible backups from /alt/backup directories
   with --remove-sent-files. [2] Ignore files that are known
   to be unmodified or were just added.

The problems that I've encountered so far are: 1) rsync doesn't seem to 
preserve permissions/ownership for directories that are created in the 
backup directory. 2) synchronizing in archive mode without modifying the 
status of any existing directory is impossible or just too difficult for 
me. :)

I can live with the first issue, but I run into trouble later if restoring 
backups from the multiple backup directories also overrides the 
permissions/ownership of the existing directories with the default ones 
that rsync created just for the backups.

I tried to work around this with various exclude/filter rules, without 
success. --files-from could work in some ways, but it restricts the number 
of source directories to one.

The attached patch is something that I quickly whipped together a few 
hours ago; it seems to work OK. Besides helping with the restoring issue, I 
think it's also a necessity in our use case.

The files/directories at /alt/root are gathered from multiple sources in a 
multi-vendor unix(4)/linux(3) environment, and just about all you can 
truly trust your administrators to remember is to take care of file 
permissions. The chance of shooting the whole system down because of a bad 
directory permission/ownership is too high; at least I would sleep a lot 
better if there were a sure way not to set directory permission/ownership.

I'd love to hear any feedback about the patch, or if it doesn't work 
correctly in some scenario.

Cheers,
-Antti

[1] Reliable sync from multiple src roots requires a custom build of rsync
with mergesort() built-in. I'm having the same problem as discussed
in the threads at http://lists.samba.org/archive/rsync/2004-August/010398.html
and http://lists.samba.org/archive/rsync/2003-November/007756.html.

There's another qsort<>mergesort testing tool available at 
http://www.hut.fi/~aet/test-rsync.sh
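
To illustrate why the choice of sort matters: the C standard does not 
require qsort() to be stable, so when the same relative path shows up 
under several source roots, which copy sorts first is unspecified. The 
sketch below is not rsync's code. struct entry, cmp_stable and 
sort_entries are hypothetical names; the idea is simply to record the 
order of the source roots and use it as a tie-breaker, which gives the 
ordering a stable mergesort would give.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical file-list entry: a relative path plus the position of
 * the source root it came from on the command line. */
struct entry {
	const char *path;	/* path relative to its source root */
	int src_index;		/* 0 = first source root given, 1 = second, ... */
};

/* Compare by path; break ties by source order so that equal paths keep
 * the order in which their roots were given. Plain qsort() does not
 * promise this on its own. */
int cmp_stable(const void *a, const void *b)
{
	const struct entry *x = a;
	const struct entry *y = b;
	int c = strcmp(x->path, y->path);

	if (c != 0)
		return c;
	return (x->src_index > y->src_index) - (x->src_index < y->src_index);
}

/* Sort so that, among duplicate paths, the entry from the earliest
 * source root comes first. */
void sort_entries(struct entry *list, size_t n)
{
	qsort(list, n, sizeof list[0], cmp_stable);
}
```

With the tie-breaker in place the result no longer depends on which 
qsort() implementation the platform ships.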

[2] Again, a sort issue arises if the same file is found in multiple backup 
directories. The /alt/backup/YYYYMMDD.HHMM directories are given in 
reverse order, which should always restore the latest backup file.
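
A minimal sketch of the reverse ordering, assuming the directory names 
strictly follow the YYYYMMDD.HHMM pattern so that string order equals 
chronological order. cmp_newest_first and order_backup_dirs are 
hypothetical names, not part of rsync or of the tool described above.

```c
#include <stdlib.h>
#include <string.h>

/* Names of the form YYYYMMDD.HHMM compare chronologically under
 * strcmp(), so swapping the arguments yields newest-first order. */
int cmp_newest_first(const void *a, const void *b)
{
	const char *const *x = a;
	const char *const *y = b;

	return strcmp(*y, *x);
}

/* Reorder an array of backup directory names so that the most recent
 * one comes first. */
void order_backup_dirs(const char **dirs, size_t n)
{
	qsort(dirs, n, sizeof dirs[0], cmp_newest_first);
}
```

The resulting array can then be passed as the list of source directories, 
newest first.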
-------------- next part --------------
Index: generator.c
===================================================================
RCS file: /cvsroot/rsync/generator.c,v
retrieving revision 1.282
diff -u -r1.282 generator.c
--- generator.c	1 Jun 2006 08:04:40 -0000	1.282
+++ generator.c	2 Jun 2006 13:00:01 -0000
@@ -45,6 +45,7 @@
 extern int preserve_gid;
 extern int preserve_times;
 extern int omit_dir_times;
+extern int omit_dir_changes;
 extern int delete_mode;
 extern int delete_before;
 extern int delete_during;
@@ -348,10 +349,11 @@
 			iflags |= ITEM_REPORT_TIME;
 		if ((file->mode & CHMOD_BITS) != (st->st_mode & CHMOD_BITS))
 			iflags |= ITEM_REPORT_PERMS;
-		if (preserve_uid && am_root && file->uid != st->st_uid)
+		if (preserve_uid && am_root && file->uid != st->st_uid
+		    && !(S_ISDIR(st->st_mode) && omit_dir_changes))
 			iflags |= ITEM_REPORT_OWNER;
-		if (preserve_gid && file->gid != GID_NONE
-		    && st->st_gid != file->gid)
+		if (preserve_gid && file->gid != GID_NONE && st->st_gid != file->gid
+		    && !(S_ISDIR(st->st_mode) && omit_dir_changes))
 			iflags |= ITEM_REPORT_GROUP;
 	} else
 		iflags |= ITEM_IS_NEW;
@@ -891,7 +893,7 @@
 
 	/* If we're not preserving permissions, change the file-list's
 	 * mode based on the local permissions and some heuristics. */
-	if (!preserve_perms) {
+	if (!preserve_perms || (S_ISDIR(st.st_mode) && omit_dir_changes)) {
 		int exists = statret == 0
 			  && S_ISDIR(st.st_mode) == S_ISDIR(file->mode);
 		file->mode = dest_mode(file->mode, st.st_mode, exists);
Index: options.c
===================================================================
RCS file: /cvsroot/rsync/options.c,v
retrieving revision 1.345
diff -u -r1.345 options.c
--- options.c	1 Jun 2006 08:04:47 -0000	1.345
+++ options.c	2 Jun 2006 13:00:01 -0000
@@ -55,6 +55,7 @@
 int preserve_gid = 0;
 int preserve_times = 0;
 int omit_dir_times = 0;
+int omit_dir_changes = 0;
 int update_only = 0;
 int cvs_exclude = 0;
 int dry_run = 0;
@@ -311,6 +312,7 @@
   rprintf(F," -D                          same as --devices --specials\n");
   rprintf(F," -t, --times                 preserve times\n");
   rprintf(F," -O, --omit-dir-times        omit directories when preserving times\n");
+  rprintf(F,"     --omit-dir-changes      omit directories when preserving any attributes\n");
   rprintf(F,"     --super                 receiver attempts super-user activities\n");
   rprintf(F," -S, --sparse                handle sparse files efficiently\n");
   rprintf(F," -n, --dry-run               show what would have been transferred\n");
@@ -425,6 +427,7 @@
   {"no-times",         0,  POPT_ARG_VAL,    &preserve_times, 0, 0, 0 },
   {"no-t",             0,  POPT_ARG_VAL,    &preserve_times, 0, 0, 0 },
   {"omit-dir-times",  'O', POPT_ARG_VAL,    &omit_dir_times, 2, 0, 0 },
+  {"omit-dir-changes", 0,  POPT_ARG_VAL,    &omit_dir_changes, 1, 0, 0 },
   {"modify-window",    0,  POPT_ARG_INT,    &modify_window, OPT_MODIFY_WINDOW, 0, 0 },
   {"super",            0,  POPT_ARG_VAL,    &am_root, 2, 0, 0 },
   {"no-super",         0,  POPT_ARG_VAL,    &am_root, 0, 0, 0 },
@@ -1285,6 +1288,9 @@
 			"P *%s", backup_suffix);
 		parse_rule(&filter_list, backup_dir_buf, 0, 0);
 	}
+
+	if (omit_dir_changes)
+		omit_dir_times = 2;
 	if (make_backups && !backup_dir)
 		omit_dir_times = 1;
 
@@ -1513,6 +1519,8 @@
 			argstr[x++] = 'm';
 		if (omit_dir_times == 2)
 			argstr[x++] = 'O';
+		if (omit_dir_changes == 1)
+			args[ac++] = "--omit-dir-changes";
 	} else {
 		if (copy_links)
 			argstr[x++] = 'L';
Index: receiver.c
===================================================================
RCS file: /cvsroot/rsync/receiver.c,v
retrieving revision 1.181
diff -u -r1.181 receiver.c
--- receiver.c	1 Jun 2006 08:04:50 -0000	1.181
+++ receiver.c	2 Jun 2006 13:00:01 -0000
@@ -37,6 +37,7 @@
 extern int relative_paths;
 extern int preserve_hard_links;
 extern int preserve_perms;
+extern int omit_dir_changes;
 extern int basis_dir_cnt;
 extern int make_backups;
 extern int cleanup_got_literal;
@@ -541,7 +542,7 @@
 
 		/* If we're not preserving permissions, change the file-list's
 		 * mode based on the local permissions and some heuristics. */
-		if (!preserve_perms) {
+		if (!preserve_perms || (S_ISDIR(st.st_mode) && omit_dir_changes)) {
 			int exists = fd1 != -1;
 			file->mode = dest_mode(file->mode, st.st_mode, exists);
 		}
Index: rsync.c
===================================================================
RCS file: /cvsroot/rsync/rsync.c,v
retrieving revision 1.194
diff -u -r1.194 rsync.c
--- rsync.c	1 Jun 2006 08:04:40 -0000	1.194
+++ rsync.c	2 Jun 2006 13:00:01 -0000
@@ -37,6 +37,7 @@
 extern int preserve_executability;
 extern int preserve_times;
 extern int omit_dir_times;
+extern int omit_dir_changes;
 extern int am_root;
 extern int am_server;
 extern int am_sender;
@@ -160,9 +161,11 @@
 			updated = 1;
 	}
 
-	change_uid = am_root && preserve_uid && st->st_uid != file->uid;
+	change_uid = am_root && preserve_uid && st->st_uid != file->uid
+		&& !(S_ISDIR(st->st_mode) && omit_dir_changes);
 	change_gid = preserve_gid && file->gid != GID_NONE
-		&& st->st_gid != file->gid;
+		&& st->st_gid != file->gid
+		&& !(S_ISDIR(st->st_mode) && omit_dir_changes);
 #if !defined HAVE_LCHOWN && !defined CHOWN_MODIFIES_SYMLINK
 	if (S_ISLNK(st->st_mode))
 		;

