proposal to speed rsync with lots of files

Mag Gam magawake at gmail.com
Thu Mar 12 02:23:01 GMT 2009


Using inotify with rsync is a great idea.

If one has a daily job that picks up the differences on a very large
filesystem full of very small files, one can do this (assuming the
initial copy has already completed):
1. Watch the source filesystem (or tree) with inotify.
2. Log the absolute path of every notification to a text file.
3. Run rsync on just the paths in that text file and place them in the
   destination repository.
4. Run a full rsync afterwards to be 100% sure nothing was missed.
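
As a rough sketch of the middle steps (not tested against a real
inotify log), the logged absolute paths could be de-duplicated and made
relative to the source root, since rsync's --files-from option reads
paths relative to the source argument. The function names, the log
format (one absolute path per line), and the example paths are all my
own assumptions:

```python
import os


def build_files_from(event_paths, source_root):
    """Collapse logged absolute paths into sorted, unique paths
    relative to source_root (the form --files-from expects)."""
    root = os.path.abspath(source_root)
    relative = set()
    for raw in event_paths:
        path = raw.strip()
        if not path:
            continue  # skip blank lines in the log
        path = os.path.abspath(path)
        # Ignore anything logged from outside the watched tree.
        if os.path.commonpath([root, path]) != root:
            continue
        rel = os.path.relpath(path, root)
        if rel != ".":
            relative.add(rel)
    return sorted(relative)


def rsync_command(list_file, source_root, destination):
    """Build (but do not run) an rsync invocation for the listed paths.
    With --files-from, -a does not imply -r, so only the listed
    entries are considered."""
    source = source_root.rstrip("/") + "/"
    return ["rsync", "-a", "--files-from=" + list_file, source, destination]
```

The list from build_files_from() would be written one path per line to
the file passed to rsync_command(), and the result handed to
subprocess.run(). Paths that were deleted after being logged will not
exist when rsync looks for them, so deletions are left to the final
full rsync pass.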

I like this idea.




On Fri, Mar 6, 2009 at 11:58 AM, Wayne Davison <wayned at samba.org> wrote:
> On Thu, Mar 05, 2009 at 03:27:50PM -0800, Peter Salameh wrote:
>> My proposal is to first send a checksum of the file list for each
>> directory.  If it is found to be identical to the same checksum on the
>> remote side then the list need not be sent for that directory!
>
> My rZync source does something like that for directories:  it treats a
> directory-list transfer like a file transfer.  That means that the
> receiving side sends a set of checksums to the sending side telling it
> what its version of the directory looks like, and then the sender sends
> a normal set of delta data that lets the receiver reconstruct the
> sender's version of the directory (which it compares to its own).  One
> potential drawback is having to deal with false checksum-matches (which
> should be rare, but would require the dir data to be resent).  I hadn't
> optimized it for block size or (possibly) data order to make it more
> efficient, but it is an interesting idea for speeding up a slow
> connection.  I'm not sure if it would really help out that much for a
> more modern, faster connection, because rsync sends the file-list data
> at the same time as it is being scanned, and sometimes the scan is the
> bottle-neck.
>
> The best way to optimize sending of really large numbers of files that
> are mostly the same is to start to leverage a file-change notification
> system, such as inotify.  Using that, it is possible to distill a list
> of what files/directories need to be copied, and to just copy what is
> needed.
>
> ..wayne..
>
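
For reference, Peter's per-directory checksum proposal quoted above
might look something like the sketch below. This is only my guess at
it: the proposal does not say which fields go into the checksum, so
the choice of name, size and mtime, the SHA-1 hash, and the function
name are all assumptions. Both sides would compute the digest for a
directory and skip sending that directory's file list when the digests
match.

```python
import hashlib
import os


def directory_digest(path):
    """Hash the sorted (name, size, mtime) entries of one directory.
    Non-recursive: subdirectories would get their own digest."""
    h = hashlib.sha1()
    with os.scandir(path) as it:
        # Sort by name so both sides hash entries in the same order.
        for entry in sorted(it, key=lambda e: e.name):
            st = entry.stat(follow_symlinks=False)
            record = "%s\0%d\0%d\n" % (entry.name, st.st_size, int(st.st_mtime))
            h.update(record.encode("utf-8", "surrogateescape"))
    return h.hexdigest()
```

Note that this still requires a stat of every entry on both sides, so
it saves bandwidth rather than scan time, which fits Wayne's point
that the scan is sometimes the bottleneck.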

