Feature request, or HowTo? State-full resume rsync transfer

Donald Pearson donaldwhpearson at gmail.com
Fri Jul 15 11:10:20 MDT 2011


Eberhard,

I looked into the -b switch and it's a good idea, but I have been unable to
find a way to use it such that a resume can continue where it left off,
without re-checking what has already been completed, *and* continue to use
the same ultimate destination file as a source for diffing.

Looking into this led me to the --partial-dir option, which really looked
promising, as it will both use partial transfers to speed up subsequent
transfers and continue to use the same ultimate destination file as a
source for diffing, but there's no way to tell rsync not to re-check the
partial file that exists in the partial-dir.

--append is incompatible with --partial-dir. :(

If I had a way to instruct rsync to perform the --partial-dir behavior, but
not verify the partial file, that would be exactly what I am looking
for.
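
For reference, here is a minimal sketch of the kind of retry wrapper I mean
when I talk about resuming with --partial-dir. The function name
retry_until_success, the RETRY_DELAY variable and the .rsync-partial
directory name are my own placeholders, not anything rsync defines, and this
does not avoid the re-check of the partial file described above.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until it exits 0, pausing between tries.
# RETRY_DELAY (seconds) is an assumed knob of this sketch, not an rsync option.
retry_until_success() {
    delay="${RETRY_DELAY:-60}"
    attempt=1
    until "$@"; do
        echo "attempt $attempt failed (exit $?), retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "succeeded after $attempt attempt(s)" >&2
}

# Example use (left commented out so nothing is transferred by accident):
# retry_until_success rsync -z --partial-dir=.rsync-partial \
#     AllCafeFinal_2011-06-28 sims@10.67.2.201:/images/temp/AllCafeFinalTest
```

Each retry starts a fresh rsync run, so the partial file kept in the
partial-dir is reused but still verified on every attempt.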

Brian,

Yes BT solves many of the delivery problems.  I've taken a hard look at BT
as an option.  The trouble is, as best as I can tell, BT is incapable of
diffing like rsync does.

I have considered using BT in combination with a binary diffing utility such
as xdelta3, however the patch files that xdelta3 is creating are rather
large.  With source and destination files ~ 8 gigs in size, the xdelta3 diff
patch files are coming out to ~3.5gigs.  Better, but rsync can do the job in
a little over 1gig, when it is able, so it's hard to sell 3x that traffic.

Matthias,

A vpn tunnel is an interesting idea.  Do you know how long you're able to
keep rsync in limbo before it will give up?

I have verified that as long as both the rsync client and server processes
remain active and the sockets remain open, the physical connection between
the two can be severed and reconnected and transmission will resume
upon re-connection.  According to rsync's documentation, when a --timeout
option is not given, rsync will not time out natively.

The issue I think is keeping the sockets open, thereby keeping the processes
active.

When the socket closes I receive an error like this:

sims@SIMS-Usingen:/images/temp$ rsync -zB=512 AllCafeFinal_2011-06-28 sims@10.67.2.201:/images/temp/AllCafeFinalTest

Read from remote host 10.67.2.201: Connection reset by peer

rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)

rsync: connection unexpectedly closed (28 bytes received so far) [sender]

rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]

My first guess was the tcp_fin_timeout setting of the Ubuntu operating
system, but this is set for 60 seconds.  Leaving everything alone I see
roughly 6 minutes before the rsync client errors out.

Knowing that rsync is running over ssh, I've experimented with setting the
ssh TCPKeepAlive option explicitly to "no" on both the source
(/etc/ssh/ssh_config) and the destination (/etc/ssh/sshd_config), but I
still eventually encounter the error.  The longest period of time that I've
seen the rsync client remain in "limbo" is roughly 16 minutes.
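
As a sketch of the client-side half of that experiment: the same ssh options
can be passed per transfer through the RSYNC_RSH environment variable (or
rsync's -e option) instead of editing ssh_config. ServerAliveInterval=0 is
my own addition here, on the assumption that ssh-level keepalive probes
should also stay off; the server side would still need TCPKeepAlive set in
sshd_config.

```shell
#!/bin/sh
# Tell rsync which remote shell command to use for this session.
# TCPKeepAlive=no         : ssh does not enable TCP keepalive on its socket.
# ServerAliveInterval=0   : no ssh-level keepalive probes from the client
#                           (assumption: we want nothing that can declare
#                           the session dead while the link is down).
RSYNC_RSH='ssh -o TCPKeepAlive=no -o ServerAliveInterval=0'
export RSYNC_RSH

# Example use (commented out so nothing is transferred by accident):
# rsync -z AllCafeFinal_2011-06-28 \
#     sims@10.67.2.201:/images/temp/AllCafeFinalTest
```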

Normally 16 minutes is an eternity, but my end points are connected via
satellite and subject to all kinds of strange things, such as sand storms,
that can sometimes knock out comms for hours.

If truly resuming with rsync is not possible, is it possible to configure my
source and destination in such a way that the TCP sessions will never be
torn down due to timeouts?   Keep in mind there is no NAT between source and
destination so there are no middle-man NAT state tables to worry about.
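
One kernel setting that may be relevant here, and this is a guess on my part
rather than something I have confirmed, is net.ipv4.tcp_retries2, which
limits how many times Linux retransmits unacknowledged data on an
established connection before giving up. The sketch below estimates the
resulting give-up time under the idealized assumption that the
retransmission timeout starts at 200 ms and doubles up to a 120 s cap; on a
satellite link the real timeout starts higher, so treat the number as a
rough lower bound. The function name is my own.

```shell
#!/bin/sh
# Estimate (in milliseconds) how long an established TCP connection keeps
# retransmitting before the kernel gives up, for a given tcp_retries2 value.
# Assumption: RTO starts at 200 ms and doubles per retry, capped at 120 s.
estimate_tcp_giveup_ms() {
    retries="$1"
    rto=200
    max_rto=120000
    total=0
    i=0
    while [ "$i" -le "$retries" ]; do
        total=$((total + rto))
        rto=$((rto * 2))
        if [ "$rto" -gt "$max_rto" ]; then
            rto="$max_rto"
        fi
        i=$((i + 1))
    done
    echo "$total"
}

estimate_tcp_giveup_ms 15   # prints 924600, i.e. about 15.4 minutes

# To inspect or change the live value (needs root; not run here):
# sysctl net.ipv4.tcp_retries2
# sysctl -w net.ipv4.tcp_retries2=<larger value>
```

With the default of 15 retries this comes out close to the 16 minutes I have
observed, which is why I suspect this setting rather than tcp_fin_timeout.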

Thanks again for everybody's help, and by all means keep the ideas coming.  I
am trying new angles as I come up with them; I haven't given up on this yet.

Regards,
Donald

On Tue, Jul 12, 2011 at 1:13 PM, Matthias Schniedermeyer <ms at citd.de> wrote:

> On 12.07.2011 11:10, Donald Pearson wrote:
>
> ...
>
> A 'trick' I personally use for an unreliable connection is an
> OpenSSH tunnel.
>
> Although any VPN solution should do the trick.
>
> That way the connection between the two rsync halves isn't directly tied
> to the internet connection.
>
> In my case that means that when the internet connection drops, the
> OpenSSH tunnel 'dies' (assured/expedited by a relatively low
> 'ServerAliveInterval' & 'ClientAliveInterval'), but as the rsync
> connection isn't directly tied to the internet connection, Linux keeps
> that connection 'hanging'. After reconnecting the OpenSSH tunnel, the
> rsync connection resumes when Linux realizes that the destination can be
> reached again.
>
> This also abstracts away problems with dynamic IPs.
>
> See you