rsync over Iridium modem, 240 bytes per second

Matt McCutchen matt at mattmccutchen.net
Sun Jan 27 11:33:41 GMT 2008


On Sun, 2008-01-27 at 10:39 +1100, Michael Ashley wrote:
> I'm using rsync to transfer large amounts (megabytes per day!) of
> data over an Iridium modem link (240 bytes per second) from
> Antarctica.
> 
> One problem is that the Iridium link has a mean uptime of perhaps 30
> minutes.
> 
> Implementing partial transfers is crucial, so I was using
> 
> rsync -av --partial --partial-dir=.rsync myfiles user at host:
> 
> The files are pre-compressed with bzip2.
> 
> Question 1: the documentation isn't clear about the interaction
> between
>             the partial and partial-dir switches. Any advice?

When both switches are given, --partial-dir overrides --partial.
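
In fact, --partial-dir implies --partial, so the command can be written with just the one switch (paths as in your original example):

```shell
# --partial-dir implies --partial, so the separate --partial
# switch in the original command is redundant:
rsync -av --partial-dir=.rsync myfiles user@host:
```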

> The problem with the above command is that the receiving rsync
> processes seem to hang around for a long time, even after the link is
> cut.
> 
> Question 2: is there a signal one can send to a receiving rsync to
>             get it to write out its partial transfers to the
>             partial-dir?

SIGINT should work.
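
For example (a sketch; adjust the match to your setup), you could interrupt any lingering receiving rsync processes on the destination host so they save their partial files:

```shell
# Send SIGINT to all processes named exactly "rsync" so each one
# writes its in-progress file into the partial-dir and exits.
pkill -INT -x rsync
```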

> The next thing I tried was to add "--timeout 1000". This worked
> reasonably well, except that IO buffering makes rsync think that
> the network is dead even though the link is up, and data is trickling
> out at 2400 baud.
> 
> So, I tried "--bwlimit=1". I really need "--bwlimit=0.24", but
> rsync won't allow floating point there.
> 
> This still isn't very satisfactory, and I am still not maximising my
> use of the link.
> 
> Question 3: what should I do?! Any other switches that are relevant
>             to my situation? E.g., "--block-size" (what are the units
>             here? the man page doesn't say).

--block-size is in bytes and controls the size of the blocks matched by
the delta-transfer algorithm.  You probably can't do much better than
the default (approximately the square root of the file size).
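
As a rough illustration of that scaling (rsync's exact rounding differs):

```shell
# Sketch: the default block size is about the square root of the
# file size, in bytes.  For a hypothetical 1 MiB file:
file_size=1048576
awk -v s="$file_size" 'BEGIN { printf "%d\n", sqrt(s) }'   # prints 1024
```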

> Question 4: will "--partial" save every last byte that makes it
>             through? Or does it truncate to the last "block" (which
>             might have taken 20 minutes to come through on a slow link).

From reading the code, it looks like rsync will truncate to the last
"token", where a token is either a match with a block of the old
destination file or a chunk of literal data of length up to the sending
rsync's CHUNK_SIZE constant, by default 32KB.  You could recompile the
sending rsync with a lower CHUNK_SIZE.
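
To put that in perspective on your link, losing one maximal 32KB literal token costs over two minutes of transfer time:

```shell
# Worst-case retransmission cost of one truncated 32KB token
# at 240 bytes per second:
awk 'BEGIN { printf "%d\n", 32768 / 240 }'   # prints 136 (seconds)
```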

> I guess moving to an rsync daemon rather than ssh transport might be a
> good idea.  I would like to be able to remove all network buffering
> (not sure if this is possible, is the "--blocking-io" switch relevant
> here?), and have rsync realise that the network is dead if nothing
> comes through in, say, 60 seconds. Alternatively, I could use another
> mechanism to determine the link status (an occasional ping?)  and then
> send a SIG_SOMETHING to rsync to get it to clean up nicely and be
> ready for the next connection.

On such a slow, unreliable link, I doubt you will be able to make rsync
work very well by just trimming the buffering and making it time out and
restart.  I would suggest putting some kind of layer on top of the link
that would allow a set of rsync processes to block when the link goes
down and simply resume work when it comes back up.  Someone who knows
more about networking than I do might have more ideas about how to
accomplish this or what else to try.
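
Short of a true link-aware layer, a crude retry wrapper (timeout and paths hypothetical, to be tuned to your link) would at least restart the transfer automatically whenever the link drops, resuming from the partial-dir each time:

```shell
#!/bin/sh
# Sketch: rerun rsync until it exits successfully, resuming from
# the .rsync partial files after each dropped connection.
until rsync -av --timeout=300 --partial-dir=.rsync myfiles user@host:
do
    echo "transfer interrupted; retrying in 60s" >&2
    sleep 60
done
```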

Matt


