Feature request, or HowTo? Stateful resume of rsync transfers
Eberhard Moenkeberg
emoenke at gwdg.de
Mon Jul 11 14:57:06 MDT 2011
Hi once more,
On Mon, 11 Jul 2011, Eberhard Moenkeberg wrote:
> On Mon, 11 Jul 2011, Donald Pearson wrote:
>> I am looking to do stateful resume of rsync transfers.
>>
>> My network environment is an unreliable and slow satellite
>> infrastructure, and the files I need to send are approaching 10 GB in
>> size. In this environment, links often cannot be maintained for more
>> than a few minutes at a time, and bandwidth is at a premium, which is
>> why rsync was chosen as ideal for the job.
>>
>> The problem that I am encountering while using rsync in these conditions is
>> that the connection between the client and server will drop due to network
>> instability before rsync can transfer the entire file.
>>
>> Upon retries, rsync starts from the beginning, re-checking data that
>> has already been sent and rebuilding the checksum list in its entirety
>> every time. Eventually I reach an impasse where the frequency of link
>> loss prevents rsync from ever getting any new data to the destination.
>>
>> I've been reading through the various switches in the man page to try
>> to find a combination that will work. My thinking was to use a
>> combination of --partial and --append: the first attempt would use the
>> --partial switch, and subsequent attempts would use both --partial and
>> --append. The idea is that rsync would build a new "partial" file and
>> be able to resume building that file, assuming on each retry that the
>> existing partial file, however large it may be, was assembled correctly
>> and does not need to be checked.
>>
>> However, in practice rsync does not work this way. I did not find any
>> other switches or methods that would let rsync literally pick up where
>> it left off without destroying the original destination file, so that
>> its blocks can be used to minimize transferred data and the transfer
>> does not always have to start from block #1. The goal is for the
>> aggregate of multiple rsync attempts to complete the transfer as a
>> whole while still keeping the amount of data "on the wire" to the
>> minimum, as if the file had been sent in a single rsync session.
>>
>> If this is possible with rsync's current feature set, I would be very
>> appreciative of someone's time to reply with an example.
>>
>> Or, if this is not currently possible, an idea that comes to mind, and
>> ultimately a feature request, would be a switch that tells rsync, upon
>> session drop, to dump its checksum list and the last completed block it
>> worked on to a file name specified by the switch. With a second switch,
>> rsync could then be executed again and reference this dump file instead
>> of rebuilding a new checksum list, using it to pick up where it left
>> off, i.e. to "restore previous state", instead of starting over from
>> block #1.
>
> In my experience, re-checking the already received "partial" blocks takes
> about 3 minutes for a 4 GB partial file.
I forgot to say: over a 56 kbit modem line.
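
As a rough illustration of the approach described in the quoted message
(a sketch only: the host name, paths, and timeout value are made up, and
the exact resume behaviour depends on the rsync version in use), the two
stages could look like this:

  # First attempt: keep whatever arrives as a partial file at the
  # destination if the link drops.
  rsync -v --partial --timeout=60 bigfile.img remote:/data/

  # Retries: trust the existing partial data and only append to it.
  # Newer rsync releases also offer --append-verify, which re-checks
  # the pre-existing data with a checksum once the transfer finishes.
  until rsync -v --partial --append --timeout=60 bigfile.img remote:/data/
  do
      sleep 30    # wait for the satellite link to come back
  done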
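
The wish not to destroy the original destination file can, at least with
current rsync versions, be covered by --partial-dir, which keeps the
interrupted transfer in a separate directory instead of replacing the
destination file (again just a sketch with hypothetical paths):

  # Keep interrupted transfers in .rsync-partial on the receiving side,
  # leaving the previous complete copy untouched until the new one has
  # fully arrived; the partial file is reused as the basis on retries.
  rsync -v --partial-dir=.rsync-partial --timeout=60 bigfile.img remote:/data/

Resuming this way still re-checksums the partial data on each retry,
which is the overhead of a few minutes per retry mentioned just above.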
Best regards
Eberhard Moenkeberg (emoenke at gwdg.de, em at kki.org)
--
Eberhard Moenkeberg
Arbeitsgruppe IT-Infrastruktur
E-Mail: emoenke at gwdg.de Tel.: +49 (0)551 201-1551