atomic transaction set option to rsync

Wayne Davison wayned at samba.org
Tue Jan 4 03:07:02 GMT 2005


On Tue, Jan 04, 2005 at 02:51:23AM +0100, Dag Wieers wrote:
> In the past I could say something like:
> 
> 	rsync -a dir1/ dir2/ user@rsync:/remote-dir/
> 
> and it would process first dir1 and then dir2.

The filenames read in from dir1 and dir2 have always been sorted into a
single list of files, so dir1's files will only be sent prior to dir2's
files if they sort alphabetically earlier in the list.
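
As a rough illustration of the point above (a sketch only, not rsync's
actual code): the directory contents below are made up, and a plain string
sort stands in for rsync's real comparison, which has extra rules of its
own. The names from both source directories end up in one sorted list, so
the order of the arguments on the command line does not decide the order of
transfer.

```perl
use strict;
use warnings;

# Hypothetical contents of the two source directories.  Because each
# source has a trailing slash, the names are relative to that directory.
my @dir1 = ('zebra.rpm', 'apple.rpm');
my @dir2 = ('repodata/index.xml', 'mango.rpm');

# Both sets of names are merged into a single list and sorted together.
# (Plain string sort is an assumption made for this sketch.)
my @file_list = sort(@dir1, @dir2);

print "$_\n" for @file_list;
```

Here zebra.rpm from dir1 lands after both of dir2's entries, even though
dir1 was named first.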

> This way I first added the packages/ dir (which contains hardlinks of only 
> the packages) and then the repository (packages+metadata).

Ahh, now there's a difference that is affected by the 2.6.x series:
hard-link handling.  If the first instance of a hard-link is not found,
rsync holds off on sending the file in the hopes that one of the other
links for the file will match up with an existing file on the receiving
side.  This avoids a bug where a new hard-link can cause rsync to
re-send all the file's data just because it sorted alphabetically
earlier in the list than the other (existing) link(s).
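
The deferral described above can be sketched roughly as follows. This is
an illustration under stated assumptions, not rsync's implementation: the
file names, the "dev:inode" keys, the %on_receiver set, and the pick_basis
helper are all hypothetical. For each group of hard-linked names, the
sketch prefers a name that already exists on the receiving side and only
falls back to the first name in sorted order when none exists.

```perl
use strict;
use warnings;

# Hypothetical sorted file list; names sharing a key are hard links to
# the same file on the sending side.
my @file_list = (
    { name => 'packages/foo.rpm', key => '8:1001' },
    { name => 'repo/bar.rpm',     key => '8:1002' },
    { name => 'repo/foo.rpm',     key => '8:1001' },
);

# Hypothetical set of names that already exist on the receiving side.
my %on_receiver = ('repo/foo.rpm' => 1);

# Group the names by hard-link key, keeping the sorted order.
my (%group, @order);
for my $f (@file_list) {
    push @order, $f->{key} unless $group{ $f->{key} };
    push @{ $group{ $f->{key} } }, $f->{name};
}

# Prefer a link that already exists on the receiver; fall back to the
# first name in the group only when none of them exists there.
sub pick_basis {
    my ($exists, @names) = @_;
    for my $n (@names) {
        return $n if $exists->{$n};
    }
    return $names[0];
}

for my $key (@order) {
    my $basis = pick_basis(\%on_receiver, @{ $group{$key} });
    my $how   = $on_receiver{$basis} ? 'matched existing' : 'sent in full';
    print "$key: $basis ($how)\n";
}
```

With this data, packages/foo.rpm sorts first but is new, so the existing
repo/foo.rpm is chosen for the 8:1001 group instead of re-sending the data.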

> I can ask mirrors to use '--atomic' or '--atomic-ts', but I can't ask
> them to re-organise their mirror-scripts just for me.

Since you'd have to ask them to install a new rsync, maybe just ask them
to install the attached perl script instead.  Then, they could run
"atomic-rsync ..." instead of their current "rsync ..." command.  (The
attached script works if they're doing a pull.)

The idea of doing a massive number of renames at the end of the transfer
is interesting, but it is not as atomic as the algorithm implemented by
the above script.  However, if you'd prefer going that route, I'd
imagine the implementation sharing a lot of the code that --partial-dir
uses.  E.g., add an --atomic-dir=.atomic option that causes all finished
files to be saved off in the .atomic dir (relative to their destination)
and then add an ending pass that goes back through the file list and
renames all the .atomic/FOO files.  Something like that should be pretty
easy to whip up.
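
A minimal sketch of what such an ending pass might look like, written as a
standalone Perl demo rather than as rsync code. The --atomic-dir option does
not exist; the atomic_rename_pass helper, the file names, and the scratch
directory are assumptions made only for this illustration.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);
use File::Temp qw(tempdir);

# Ending pass: for every name in the file list, move DIR/.atomic/NAME
# over DIR/NAME.  Returns the number of files that were renamed.
sub atomic_rename_pass {
    my ($dest_root, $atomic_name, @file_list) = @_;
    my $count = 0;
    for my $rel (@file_list) {
        my $final  = "$dest_root/$rel";
        my $staged = dirname($final) . "/$atomic_name/" . basename($final);
        next unless -e $staged;    # nothing was staged for this name
        rename($staged, $final)
            or die "Unable to rename $staged to $final: $!";
        $count++;
    }
    return $count;
}

# Demo in a scratch directory: stage two finished files, then run the pass.
my $root = tempdir(CLEANUP => 1);
mkdir("$root/.atomic") or die "Unable to create $root/.atomic: $!";
for my $name ('a.rpm', 'b.rpm') {
    open(my $fh, '>', "$root/.atomic/$name") or die "open: $!";
    print $fh "new contents of $name\n";
    close($fh) or die "close: $!";
}

my $renamed = atomic_rename_pass($root, '.atomic', 'a.rpm', 'b.rpm', 'c.rpm');
print "renamed $renamed files\n";
```

Each rename() is atomic on its own, but a reader can still see a mix of old
and new files while the loop is running, which is why this approach is less
atomic than swapping whole directories.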

..wayne..
-------------- next part --------------
#!/usr/bin/perl

use strict;
use Cwd 'abs_path';

my $RSYNC = '/usr/bin/rsync';

my $dest_dir = $ARGV[-1];
usage(1) if $dest_dir eq '' || !-d $dest_dir
	 || grep(/^--(link|compare)-dest/, @ARGV);
$dest_dir = abs_path($dest_dir);
usage(1) if $dest_dir eq '/';

my $old_dir = "$dest_dir~old~";
my $new_dir = $ARGV[-1] = "$dest_dir~new~";

if (-d $old_dir) {
    rename($old_dir, $new_dir) or die "Unable to rename $old_dir to $new_dir: $!";
}

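# Transfer into the ~new~ dir, hard-linking unchanged files from the
# current destination so that only changed files use extra space.
if (system($RSYNC, "--link-dest=$dest_dir", @ARGV)) {
    if ($? == -1) {
	print "failed to execute $RSYNC: $!\n";
    } elsif ($? & 127) {
	printf "child died with signal %d, %s coredump\n",
	    ($? & 127),  ($? & 128) ? 'with' : 'without';
    } else {
	printf "child exited with value %d\n", $? >> 8;
    }
    # $? is the raw wait status, and the exit status keeps only the low
    # 8 bits, so "exit $?" would report success for a plain rsync error.
    exit(($? == -1 || ($? & 127)) ? 1 : $? >> 8);
}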

# Swap the finished copy into place: current -> ~old~, then ~new~ -> current.
rename($dest_dir, $old_dir) or die "Unable to rename $dest_dir to $old_dir: $!";
rename($new_dir, $dest_dir) or die "Unable to rename $new_dir to $dest_dir: $!";

exit;


sub usage
{
    my($ret) = @_;
    print <<EOT;
Usage: atomic-rsync [RSYNC-OPTIONS] HOST:SOURCE DEST

See rsync for its list of options.  You may not use --link-dest
or --compare-dest, however.  Also, DEST must not be "/".
EOT
    exit $ret;
}

