weird interaction between --delete-delay and --partial-dir when transfer is interrupted

Shivkumar Venkatasubrahmanyam svenkata at stanford.edu
Mon Dec 22 04:48:36 GMT 2008


Michal Soltys wrote:
> Shivkumar Venkatasubrahmanyam wrote:
>> Hi,
>>
>> I'm not sure if this is a bug, but after reading the manual, this 
>> does not seem like expected behavior.  I'm using the following rsync 
>> command to approximate an atomic update (I can't use --link-dest as 
>> hard links in hfsplus filesystems are fubar under linux as of 2.6.27):
>>
>> rsync -a --delete-excluded --backup --backup-dir=../bar.archive 
>> --backup-suffix=.2008-12-15-0115 --delete-delay --delay-updates 
>> --partial-dir=.rsync-partial --verbose --human-readable 
>> --log-file=/bar.rsync-log host:/foo/ /bar
>>
>> If a transfer is interrupted, .rsync-partial/ directories are left 
>> over in the backup directory, as expected. The next rsync run exits 
>> with non-zero exit status even though all files are transferred 
>> (verified this using md5sum). A third rsync run transfers nothing 
>> (going by the log file) but exits with success.  I'm guessing this is 
>> because: (1) --delete-excluded causes the .rsync-partial/ directories 
>> to be marked for deletion during file list generation time of the 
>> second rsync run (there are no .rsync-partial/ directories in 
>> host:/foo/), (2) successful transfer of all files during the second 
>> run results in rsync removing the .rsync-partial/ directories as they 
>> become empty, and (3) --delete-delay which I understand is necessary 
>> to allow rsync to use the files in .rsync-partial/ to speed up the 
>> second transfer, fails to find the .rsync-partial/ directories at the 
>> end of the transfer and complains.  The third transfer sees no 
>> .rsync-partial/ directories in /bar and so returns success.
>>
>> I would like the second run to return success if all files are 
>> transferred.  Is there some option I'm supposed to use (e.g. 
>> --filter) or is this a bug?
>>
>> Shiv
>
> What is the exact exit value after the 2nd run - 23 perhaps ?
Yes, it's 23 ("partial transfer due to error").
>
> You can do a simple test:
>
> touch /test/src/a
> mkdir /test/dst/.rp
> rsync -a -vvvv --delete-excluded --delay-updates --delete-delay 
> --partial-dir=.rp /test/src/ /test/dst/ | less
>
> In this case, rsync will exit with value 23. It looks like rsync is 
> deleting .rp before any file transfer starts.
>
> If you put any bogus file in .rp before rsync call, it will exit with 
> 0 though. Looks like it prevents it from a deletion.
>
> If you remove delete-excluded, it will work as expected with or w/o 
> bogus file (with bogus file, .rp dir will not get deleted, otherwise 
> it will), exit value is 0 in both cases.
>
>
> It does look like sort of a tiny bug. Or maybe there's something 
> subtle we're missing.
>
>
> Either way - as a workaround, you could drop delete-excluded, and then 
> just finalize the thing with something like: find /bar/ -depth -name 
> ".rsync-partial" -exec rm -rf "{}" \;
>
> Rsync will usually clean up after itself - only bogus files (not being 
> part of the transfer) would prevent it from doing so, as far as I can 
> see.
Makes sense.  And this does seem like a corner case :)  I did some more 
testing along the lines you suggested ...

mkdir src ; touch src/a
mkdir dst ; mkdir dst/.rp
rsync -a -vvvv --delete-excluded --delete-delay --delay-updates \
    --partial-dir=.rp src/ dst/ >log 2>&1

It seems this bug/corner case requires all of the following conditions:
(1) left-over partial dir(s) from a prior rsync run (dst/.rp in this
case, or .~tmp~ if --partial-dir is not specified)
(2) --delete-excluded, which unlike --delete does not "protect" .rp, so
dst/.rp is added to the list of files to delete
(3) --delete-delay, which postpones all deletes (incl. dst/.rp) till
after new and modified files are transferred
(4) --delay-updates, which during the "transfer phase" transfers src/a
to dst/.rp/a, then moves dst/.rp/a to dst/a and deletes dst/.rp since it
is now empty.  So basically two operations try to delete dst/.rp: the
first one succeeds, so the second one (during the "delete phase") fails,
causing the error (code 23).
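
The double delete described above can be imitated without rsync.  This
is only a sketch: plain mkdir/rmdir stand in for the two phases, the
temporary directory is a made-up location, and it says nothing about
how rsync implements either phase internally.

```shell
#!/bin/sh
# Simulation only: rmdir stands in for rsync's two delete attempts.
work=$(mktemp -d)
mkdir -p "$work/dst/.rp"

# "Transfer phase": the partial dir is empty after the update, so it
# is removed right away.
first=0
rmdir "$work/dst/.rp" || first=$?

# "Delete phase": the delayed delete list still names the same
# directory, which no longer exists, so this attempt fails.
second=0
rmdir "$work/dst/.rp" 2>/dev/null || second=$?

echo "first delete status:  $first"
echo "second delete status: $second"

rm -rf "$work"
```

The first status should be zero and the second non-zero, which is the
same shape as the failure that ends in code 23.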

I use all three of these options: --delete-excluded because I have added 
"--filter" options over time and I want the destination to reflect these 
newly excluded files, --delete-delay and --delay-updates so that if the 
connection is lost, I don't end up with a backup that's part-old and 
part-new.  Code 23 could also result from a "genuine" failure to 
transfer some file (I sometimes lose my connection to the remote 
"source") so the only way to know that the transfer was completed is to 
run rsync again until it exits with code 0. For now, re-running rsync
seems like the easiest workaround, but it would be nice if the automatic
deletion of partial-dirs during the "transfer phase" (--delay-updates)
also removed those same partial-dirs from the list of pending deletes
for the "post-transfer delete phase" (--delete-excluded --delete-delay).
It's probably more work than a corner case is worth :)
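
The "run again until code 0" workaround could be wrapped in a small
shell function.  This is a rough sketch: the name retry_until_success
and the attempt limit are my own assumptions, not anything rsync
provides.  The limit is there so that a genuine failure (e.g. a lost
connection) does not loop forever.

```shell
#!/bin/sh
# retry_until_success MAX CMD [ARGS...]
# Hypothetical helper: runs CMD up to MAX times and stops at the first
# run that exits with status 0.  If every run fails, it returns the
# status of the last run.
retry_until_success() {
    max=$1
    shift
    attempt=1
    status=1
    while [ "$attempt" -le "$max" ]; do
        status=0
        "$@" || status=$?
        if [ "$status" -eq 0 ]; then
            return 0
        fi
        echo "attempt $attempt exited with status $status" >&2
        attempt=$((attempt + 1))
    done
    return "$status"
}
```

It would be called as "retry_until_success 3" followed by the full
rsync command line from my first mail.  A status of 23 on the last
attempt would then more likely point to a real transfer problem than to
the left-over partial-dir.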

Shiv

