--timeout=... lesson learned...

Brian K. White brian at aljex.com
Mon Aug 1 08:07:20 MDT 2011


On 8/1/2011 4:30 AM, Bjorn Madsen wrote:
> I thought others might benefit from this lesson learned, and that it
> should perhaps be added to the man pages.
>
> I thought my network connection was glitchy and hence set rsync up
> with --timeout=120, but I found out that I was actually causing the
> glitch with this script:
> #! /bin/sh -
> while true; do
>     rsync -avz --progress --timeout=120 --delete \
>         /media/rsync_gb01/movies/ myserver:movies
>     sleep 120
> done
>
> The problem:
> When rsync is checking large files, it takes time to verify the
> content at both ends, so when using the option --timeout=TIME, both
> source and destination must be able to complete the check within the
> time window provided. For a 2.3 GB file this check might take 2
> minutes 30 seconds, so with --timeout=120, rsync will exit before the
> destination has been able to complete its file check.
>
> A suggestion for future options: could rsync adjust the timeout to
> include the time used for the check? Fictitious example:
>   --timeout=xfer#1-check-time + 120 seconds?
>
> Detailed example:
> me@source:~$ rsync -avz --progress --timeout=xfer#1-check-time+120 \
>     --delete /media/rsync_gb01/movies/ myserver:movies
> sending incremental file list
> DVD1.mkv
>    2302868295 100%   14.01MB/s *0:02:36* (xfer#1, to-check=41/286)
> DVD2.mkv
> ...and so on...
>
> --
> Bjorn

I think the only way to really address that is to have the two rsync 
processes heartbeat each other.

One side can know that it is itself busy and so refrain from 
decrementing its remaining timeout, but it cannot know whether the 
other side has been killed, is hung, has lost its network connection, 
or is merely busy, unless we add a busy indication to the protocol.

Such a heartbeat would need to be programmable. Any fixed schedule will 
break some jobs that would otherwise have gone through, even while it 
keeps alive some jobs that would have failed.

Consider an intermittent network. With no heartbeat, the pauses go 
unnoticed, or at least they do not break the rsync session: data 
transfer just runs, pauses, runs, pauses, and so on until the job is 
done.

With a heartbeat, the job is aborted whenever the heartbeat schedule is 
broken. Say you define a rule: heartbeat every 20 seconds, abort after 
3 consecutive failed beats. Some connections will be needlessly killed 
by that. You'd need to be able to define the timing and the grace 
period, and maybe even fully arbitrary schedules of when to ping and 
under what conditions to abort (see the toy sketch below). It's going 
to be different for different people, different connections and 
machines, and different file sets.
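
To make those two tunables concrete, here is a toy watchdog in plain 
shell. It has nothing to do with rsync's actual protocol; the stamp 
file, and the peer assumed to touch it on every beat, are invented for 
illustration, and it relies on GNU find for -newermt.

#!/bin/sh
# Toy heartbeat watchdog, purely illustrative, not rsync code.
# A hypothetical peer touches $STAMP on every beat; we abort after
# MAXMISS consecutive missed beats. Requires GNU find for -newermt.
INTERVAL=20                  # seconds between expected beats
MAXMISS=3                    # consecutive misses before aborting
STAMP=/tmp/heartbeat.stamp   # hypothetical: the peer touches this

misses=0
while [ "$misses" -lt "$MAXMISS" ]; do
    sleep "$INTERVAL"
    if [ -n "$(find "$STAMP" -newermt "$INTERVAL seconds ago" 2>/dev/null)" ]
    then
        misses=0                # beat arrived on time
    else
        misses=$((misses + 1))  # one more missed beat
    fi
done
echo "aborting: $MAXMISS consecutive beats missed" >&2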

Also, a heartbeat would fix some cases where the ISP has implemented 
tcp session timeouts, or whole-connection timeouts, that the customer 
cannot get around. In some cases, if a tcp session sees no packets for 
2 minutes, or even less, the session is killed by the ISP's router or 
other upstream hardware outside the customer's control, and the 
application only discovers that the session no longer exists the next 
time it tries to send a packet. This happens even while the overall 
net connection remains up and busy with other traffic. You can't work 
around it by, say, pinging the same remote host continuously in 
parallel while rsync is running: the pings would not be part of 
rsync's tcp session, would not make it look busy, and would not keep 
it alive.
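
One thing that does help with that kind of in-path idle timeout, 
assuming the transfer runs over ssh rather than straight to an rsync 
daemon, is ssh's own keepalive: those probes travel inside the very 
tcp session rsync is using, so upstream hardware sees the session as 
busy. Note the shape, an interval plus an allowed number of misses, 
which is exactly the kind of tunable rule described above.

# ssh keepalives ride inside rsync's tcp session; probe every 30
# seconds, give up after 3 unanswered probes (both are standard
# OpenSSH options; tune the numbers to your own connection).
rsync -avz -e 'ssh -o ServerAliveInterval=30 -o ServerAliveCountMax=3' \
    /media/rsync_gb01/movies/ myserver:movies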

So sometimes you want it, sometimes you don't, and when you do, you want 
to be able to specify the timing and rule for aborting.

I've gotten by OK with the existing timeout option and a knowledge of 
how my filesets and net connections behave, so it's not a major wish 
for me personally.
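
For Bjorn's specific case, a minimal sketch of what I mean: keep the 
same loop, but size --timeout to the checksum pass on the largest file 
rather than to the network. The 600 below is only a placeholder; time 
your own largest file to pick a real value.

#!/bin/sh -
# Same continuous mirror loop as the original, but with --timeout
# sized to cover the checksum pass on the largest file (600 is an
# arbitrary example; measure your own files and disks).
while true; do
    rsync -avz --progress --timeout=600 --delete \
        /media/rsync_gb01/movies/ myserver:movies
    sleep 120
done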

-- 
bkw

