Feature request, or HowTo? State-full resume rsync transfer

Eberhard Moenkeberg emoenke at gwdg.de
Mon Jul 11 14:13:41 MDT 2011


On Mon, 11 Jul 2011, Donald Pearson wrote:

> I am looking to do stateful resume of rsync transfers.
> My network environment is an unreliable and slow satellite
> infrastructure, and the files I need to send are approaching 10 gigs in
> size.  In this network environment, links often cannot be maintained
> for more than a few minutes at a time.  Bandwidth is also at
> a premium, which is why rsync was chosen as ideal for the job.
> The problem that I am encountering while using rsync in these conditions is
> that the connection between the client and server will drop due to network
> instability before rsync can transfer the entire file.
> Upon retries, rsync starts from the beginning, re-checking data that has
> already been sent and rebuilding the checksum in its entirety
> every time.  Eventually I reach an impasse where the frequency of link loss
> prevents rsync from ever getting any new data to the destination.
> I've been reading through the various switches in the man pages to try to
> find a combination that will work.  My thinking was to use a combination of
> --partial and --append: the first attempt would use the --partial switch,
> and subsequent attempts would use both --partial and --append.  The idea is
> that rsync would build a new "partial" file and resume building that
> file on subsequent retries, assuming that the existing
> partial file, however large it may be, was assembled correctly and does not
> need to be checked.
> However, in practice rsync does not work in this way.  I did not find any
> other switches or methods that would enable rsync to literally pick up where
> it left off, without destroying the original destination file, so that its
> blocks can be used to minimize transferred data and rsync need not always
> start from block #1.  The goal is for the aggregate of multiple rsync
> attempts to complete the transfer as a whole while still maintaining the
> minimum amount of data "on the wire", as if the file were sent in a single
> rsync session.
> If this is possible with rsync's current feature set, I would be very
> appreciative of someone's time to reply with an example.
> Or if this is not currently possible, an idea that comes to mind and
> ultimately a feature request would be to have a switch that tells rsync upon
> session drop, to do a memory dump of its checksum list, and the last
> completed block worked on, to a provided file name specified by the switch.
> This way, with a 2nd switch, rsync can be executed again and will reference
> this memory dump file, instead of rebuilding a new checksum list, and use
> that to pick up where it left off or "restore previous state", instead of
> starting over from block #1.

In my experience, re-checking the already received "partial" blocks takes 
about 3 minutes for a 4 GB partial file.
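
If the re-check is affordable on your hosts, the simplest approach is to
rerun rsync with --partial until it exits successfully. Below is a minimal
sketch of such a retry loop, not a tested recipe. The helper names
(build_rsync_command, transfer_with_retries), the timeout, the delay and the
attempt limit are illustrative assumptions; --partial and --timeout are
documented rsync options. Further options such as --partial-dir or
--append-verify could be passed through extra_flags, but whether they suit
this link is something to check against the man page of your rsync version.

```python
import subprocess
import time


def build_rsync_command(src, dest, io_timeout=60, extra_flags=()):
    """Build an rsync argument list that keeps the partial file on failure.

    io_timeout is passed as --timeout (seconds without I/O before rsync
    gives up), so a dead link ends the run instead of hanging forever.
    """
    return ["rsync", "--partial", f"--timeout={io_timeout}",
            *extra_flags, src, dest]


def transfer_with_retries(cmd, max_attempts=50, delay=30.0,
                          runner=subprocess.run, sleep=time.sleep):
    """Rerun cmd until it exits with status 0 or max_attempts is reached.

    Returns the number of the attempt that succeeded, or None if every
    attempt failed. runner and sleep are injectable for testing only.
    """
    for attempt in range(1, max_attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Give the link some time to come back before trying again.
            sleep(delay)
    return None
```

A call might look like
transfer_with_retries(build_rsync_command("bigfile.img", "user@remote.example:/data/")),
where the file name and the host are placeholders.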

Best regards
Eberhard Moenkeberg (emoenke at gwdg.de, em at kki.org)

Eberhard Moenkeberg
Arbeitsgruppe IT-Infrastruktur
E-Mail: emoenke at gwdg.de      Tel.: +49 (0)551 201-1551
Gesellschaft fuer wissenschaftliche Datenverarbeitung mbH Goettingen (GWDG)
Am Fassberg 11, 37077 Goettingen
URL:    http://www.gwdg.de             E-Mail: gwdg at gwdg.de
Tel.:   +49 (0)551 201-1510            Fax:    +49 (0)551 201-2150
Managing directors:                Prof. Dr. Oswald Haan and Dr. Paul Suren
Chairman of the supervisory board: Prof. Dr. Christian Griesinger
Registered office:                 Goettingen
Register court:                    Goettingen  Handelsregister-Nr. B 598
