killing rsync seems to wipe-out the --partial .hidden files

Frank Hamersley terabite at bigpond.com
Tue Mar 20 02:48:42 MDT 2012


G'day Paul,

Thanks for the prompt response.

It is probably just me, but this behaviour seems rather counter-intuitive!

To give more context on the "business" interest: the subject files are
database dumps ... not very large (2G per stripe) and there are 4 in total.
They are uncompressed and do not change greatly each day (except the last
stripe, where the page splits on insert get written).

The affected file is not the (last) splitting file ... but one of the "full"
slower moving stripes.  However when the .dotfile was moved to the
_real_filename_ it was apparently truncated to a fair degree - presumably
because the rsync block checking process had (say) only got 20% of the way
through analysing the files.

Based on this --partial behaviour the following inferences are possible ...

1. the dump set in this particular directory is no longer viable as one file
is now abridged.
2. when rsync encounters the affected file the next day there is no "memory"
of the blocks past the point reached when it was killed the day before - which
condemns rsync to do 100% data copying (no speedup) to complete the file.
3. but no doubt it still has to reparse all the blocks that did make it
across, in case they have changed.

I guess this "behaviour" avoids the need to do a "three way" compare on an
interrupted transfer when checking the blocks not yet resolved.

Thinking quickly (as I have to go to a Mindari) the approach I would take
for --partial is to ...

a. cp -p the _real_filename_ to .filename.wip (nix the space issues as disk
is cheap these days)
b. rsync --inplace from the source file to the .filename.wip
c. if restarted after a "crash" simply go from byte 0 again
d. when rsync has finished, mv the .filename.wip to the _real_filename_ (voila)
e. and for icing on top (in respect of dumps) add a new option --partial-all
to apply this "simultaneously" to all files in each directory.  Users can
arrange files into directories to gain advantage from this switch so it doesn't
need a scope matching the entire rsync'ed file set.
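
Steps a. to d. above could be sketched roughly as the shell function below.
This is only an illustration of the proposal, not a description of what rsync
does today: the function name stage_sync and the .filename.wip naming are
assumptions made for the example, and the suggested --partial-all option in
step e. is not shown.

```shell
# Hypothetical helper: stage_sync SRC DEST
# Updates a hidden work copy of DEST in place and only renames it over DEST
# once rsync has finished, so DEST is never left truncated.
stage_sync() {
    src=$1
    dest=$2
    # Assumed naming for the work file: .<filename>.wip next to DEST.
    wip="$(dirname "$dest")/.$(basename "$dest").wip"

    # a. seed the work file from the existing destination, unless a work
    #    file is already there from an interrupted earlier run
    if [ ! -e "$wip" ] && [ -e "$dest" ]; then
        cp -p "$dest" "$wip" || return 1
    fi

    # b. and c. update the work file in place; after a "crash" this is simply
    #    run again and rsync re-checks the work file from byte 0
    rsync --inplace "$src" "$wip" || return 1

    # d. only a completed transfer replaces the real file
    mv "$wip" "$dest"
}
```

It would be called as, say, stage_sync host:/dumps/stripe1 /backup/stripe1
(both paths made up).  If the script is killed part way, the _real_filename_
is untouched and the work file keeps whatever had already arrived.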

Your thoughts?

Cheers, Frank.

> -----Original Message-----
> From: Paul Slootman [mailto:paul at debian.org]
> Sent: Tuesday, 20 March 2012 4:57 AM
> To: Frank Hamersley
> Cc: rsync list
> Subject: Re: killing rsync seems to wipe-out the --partial .hidden files
>
>
> On Mon 19 Mar 2012, Frank Hamersley wrote:
> >
> > I am running an overnight (off-peak) replication using "rsync --partial"
> > which is "suspended" at 08h30 each morning to allow for day time use of the
> > internet link.
> >
> > However it seems the "kill -TERM" on the various rsync (and ssh) processes
> > is too aggressive as I can not see any trace of the .hidden partial transfer
> > files.  I tried killing only the (parent) script process but the rsync/ssh
> > processes just keep running in that case.
>
> Perhaps the manpage isn't entirely clear on this: with --partial, when
> the transfer is interrupted, the .filename.xyz file is moved to the
> _real_ filename.  Hence you don't see any dotfiles.
>
>
> Paul
>
>
