[Bug 11474] New: Retry delay for lost connection

Sun Aug 30 21:35:26 UTC 2015

https://bugzilla.samba.org/show_bug.cgi?id=11474

            Bug ID: 11474
           Summary: Retry delay for lost connection
           Product: rsync
           Version: 3.1.2
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: core
          Assignee: wayned at samba.org
          Reporter: samba at haravikk.com
        QA Contact: rsync-qa at samba.org

Currently, when a connection is lost, rsync aborts the rest of the transfer,
forcing it to be started again from the beginning. Since rsync is pretty much
designed around transferring only changed files this isn't usually a big deal,
but with very large transfers (in size or number of files), or transfers over
slower connections, resuming can be a very slow process; a backup that takes
four hours and gets interrupted could take two or three hours to actually
resume, as it has to run through all the previously transferred files again.

What I would like to propose is a new option that causes rsync to wait instead
of failing when a connection is lost, and to try to re-establish the connection
at periodic intervals until a specified time limit is reached. If the
connection is re-established then the transfer resumes where it left off,
skipping any files that were already transferred.
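
As a rough illustration of the proposed behaviour (not of anything rsync does
today), the wait-and-reconnect loop could look something like the Python sketch
below. The names wait_for_reconnect, try_connect, interval and time_limit are
all hypothetical, and the sleep and clock parameters exist only so the loop can
be exercised without real delays.

```python
import time


def wait_for_reconnect(try_connect, interval, time_limit,
                       sleep=time.sleep, clock=time.monotonic):
    """Call try_connect() every `interval` seconds until it succeeds or
    `time_limit` seconds have passed. Returns True if it reconnected."""
    deadline = clock() + time_limit
    while True:
        if try_connect():
            return True          # connection is back, caller can resume
        remaining = deadline - clock()
        if remaining <= 0:
            return False         # time limit reached, give up as rsync does now
        # Never sleep past the deadline, so one last attempt happens at the limit.
        sleep(min(interval, remaining))
```

Only a True result would lead to the transfer resuming where it left off; a
False result corresponds to the current behaviour of aborting.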

I'm uncertain how difficult this would be to implement, though it shouldn't
really matter to the receiving side at what point the sender begins (it can
just treat it like a brand new transfer that happens to begin at that point).
The main question mark that I can think of is how delayed actions (like
deletions) are handled; if some files are tracked only on the receiving side
then the sender may need to track what the receiver should know, so that it can
be sent as part of the "new" transfer. If this would be too complex, the
simplest option might be to have this flag require the use of --delete-during
or --delete-before instead of --delete-after or --delete-delay, with similar
treatment of any other affected options.
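
To make that restriction concrete, here is a small Python sketch of the kind of
argument check it implies. The option name --retry-delay is purely hypothetical
(no such rsync option exists); the four --delete-* options are the existing
ones named above. It only models the rule and says nothing about how rsync
parses its options internally.

```python
# Existing rsync options that postpone deletions until the end of the transfer.
DEFERRED_DELETE_OPTS = {"--delete-after", "--delete-delay"}

# Hypothetical name for the proposed option; it is not a real rsync flag.
RETRY_OPT = "--retry-delay"


def conflicting_options(argv):
    """Return the deferred-delete options that clash with the retry option.

    An empty list means the command line is acceptable under the proposed rule.
    """
    uses_retry = any(a == RETRY_OPT or a.startswith(RETRY_OPT + "=")
                     for a in argv)
    if not uses_retry:
        return []
    return sorted(a for a in argv if a in DEFERRED_DELETE_OPTS)
```

For example, conflicting_options(["-a", "--retry-delay=60", "--delete-after"])
would report --delete-after, while the same command line with --delete-during
would pass.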

I think that connection issues are probably one of the more common reasons for
very large transfers to fail, and being able to have rsync handle these itself
would make things a lot easier, and faster. It would also make scripting rsync
error handling simpler, as most other errors represent a fault that needs to be
addressed separately, rather than something that can be solved by retrying.
I've seen several scripts that incorrectly put rsync transfers in a loop so
that they immediately retry on encountering any error; while this may look neat
and be fine for connection issues, it's no good for errors such as incompatible
file names, since these require user intervention and don't actually stop the
transfer (only the individual files that fail).
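
For comparison, a wrapper script that retries only on connection-type failures
might look roughly like the Python sketch below. The exit codes are taken from
the EXIT VALUES section of the rsync man page, but treating exactly 10, 12, 30
and 35 as "connection problems" is an assumption (code 12 in particular can
have other causes). The function name and its parameters are hypothetical.

```python
import subprocess
import time

# rsync exit codes assumed here to indicate a connection problem:
# 10 = error in socket I/O, 12 = error in rsync protocol data stream,
# 30 = timeout in data send/receive, 35 = timeout waiting for daemon connection.
CONNECTION_EXIT_CODES = {10, 12, 30, 35}


def run_rsync_with_retry(args, max_attempts=3, delay=30.0,
                         run=lambda a: subprocess.call(["rsync", *a]),
                         sleep=time.sleep):
    """Run rsync, retrying only when the exit code looks connection related.

    Any other exit code (including 0 for success and 23 for a partial
    transfer) is returned immediately so the caller can deal with it.
    """
    code = 0
    for attempt in range(1, max_attempts + 1):
        code = run(args)
        if code not in CONNECTION_EXIT_CODES:
            return code
        if attempt < max_attempts:
            sleep(delay)  # wait before trying the whole transfer again
    return code
```

Note that each retry here still starts the transfer from the beginning, which
is exactly the cost that handling the retry inside rsync would avoid.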
