keep rsync from removing unfinished source files?

Matt McCutchen matt at mattmccutchen.net
Mon Sep 8 02:26:28 GMT 2008


On Sun, 2008-09-07 at 16:42 -0400, Matt McCutchen wrote:
> On Sun, 2008-09-07 at 16:16 -0400, Aaron Swartz wrote:
> > > IMO, a proper solution is to have the crawler indicate somehow which
> > > files are unfinished so rsync can avoid copying those.  E.g., the
> > > crawler could name unfinished files according to a special pattern so
> > > that you could exclude them with --exclude, or it could keep them in a
> > > temporary directory that rsync doesn't visit.
> > 
> > I agree, but this is not how most crawlers are written. (Imagine, e.g. wget.)
> 
> Then I would modify the crawler.

I hacked together a modified version of Fedora's wget with a new option,
--temp-file-suffix, that you could use to make unfinished files
excludable.  The patch is attached.  I also made some RPMs, which are
available at:

http://mattmccutchen.net/private/wget-temp-file-suffix/

I ran a little test, and the option does seem to stop "rsync
--remove-source-files" from deleting unfinished files when combined with
an appropriate --exclude, but I won't claim that it will play nicely
with all wget options.
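
For anyone who would rather apply the same idea in their own crawler, here
is a minimal Python sketch of the general pattern: write each file under a
temporary name and rename it only when it is complete, then have rsync
exclude the temporary names.  This is not the patch, and it does not claim
to mirror how the patched wget behaves.  The ".part" suffix and the function
names are hypothetical, chosen only for illustration.

```python
import os

TEMP_SUFFIX = ".part"  # hypothetical suffix, not taken from the patch


def write_via_temp_file(final_path, chunks, suffix=TEMP_SUFFIX):
    """Write chunks to final_path + suffix, then rename once complete."""
    temp_path = final_path + suffix
    with open(temp_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())
    # The rename is atomic on POSIX when both names are on one filesystem,
    # so the final name never refers to a partially written file.
    os.replace(temp_path, final_path)
    return final_path


def rsync_command(src_dir, dest, suffix=TEMP_SUFFIX):
    """Build an rsync argument list that skips unfinished files."""
    return [
        "rsync", "-a", "--remove-source-files",
        "--exclude=*" + suffix,          # matches the temp name at any depth
        src_dir.rstrip("/") + "/", dest,
    ]
```

The list returned by rsync_command() can be passed to subprocess.run().
Because the exclude pattern hides the temporary names from rsync, a file is
only transferred and removed after the rename has happened.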

Matt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wget-temp-file-suffix.patch
Type: text/x-patch
Size: 4169 bytes
Desc: not available
Url : http://lists.samba.org/archive/rsync/attachments/20080907/f4cf589b/wget-temp-file-suffix.bin

