Nice little performance improvement

Mike Connell mikefconnell at verizon.net
Thu Oct 15 20:07:28 MDT 2009


Hi,

In my situation I'm using rsync to back up a server with (currently) about 570,000 files.
These are all small files, and maybe 0.1% of them change, or new ones are added, in
any 15-minute period.

I've split the main tree up so rsync can run on sub-subdirectories of the main tree.
It does each of these sub-subdirectories sequentially. I would have liked to run
some of these in parallel, but that seems to increase I/O on the main server too much.


Today I tried the following:

For each sub-subdirectory:
    a) Fork a "du -s subsubdirectory" on the destination sub-subdirectory
    b) Run rsync on the sub-subdirectory
    c) Repeat until done
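
A rough sketch of that loop in Python, in case it helps anyone try it. The function name, the "-a" option, and the way source and destination paths are joined are assumptions for illustration, not my actual script. It also assumes the destination tree is local to the machine running the loop, and it waits for each du after its rsync finishes, only so that no zombie processes pile up.

```python
import subprocess


def backup_with_prewarm(subdirs, src_root, dest_root,
                        rsync_opts=("-a",),
                        popen=subprocess.Popen, run=subprocess.run):
    """For each relative sub-subdirectory, start `du -s` on the destination
    copy in the background, then run rsync for it in the foreground.

    `popen` and `run` are injectable so the loop can be tried without
    touching any real data.
    """
    results = []
    for rel in subdirs:
        src = f"{src_root.rstrip('/')}/{rel}/"
        dest = f"{dest_root.rstrip('/')}/{rel}/"
        # a) Fork du on the destination. Its output is thrown away;
        #    only the cache-warming side effect matters.
        du = popen(["du", "-s", dest],
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        # b) Run rsync while du walks the destination tree.
        proc = run(["rsync", *rsync_opts, src, dest])
        # Reap the du process (an addition, not part of the steps above).
        du.wait()
        results.append((rel, proc.returncode))
    # c) The for loop repeats until every sub-subdirectory is done.
    return results
```

For example, backup_with_prewarm(["a/b", "a/c"], "server:/data", "/backup/data") would handle two sub-subdirectories one after the other; those paths are made up.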

This seems to have cut the total run time by about 25-30%. It looks like the du can
run ahead of the rsync, so that while rsync is building its file list, the du is warming up
the file cache (inode and directory entry metadata) on the destination. Then when rsync
checks the destination to see what it needs to do, it can do so more efficiently.

Looks like a keeper so far. Any other suggestions? (I was thinking of a previous
suggestion of setting /proc/sys/vm/vfs_cache_pressure to a low value.)
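
For reference, a minimal sketch of how that setting could be changed from Python. The helper name is hypothetical, and the value 50 in the usage note is only an example, not a tested recommendation. The kernel default is 100, and lower values make the kernel more inclined to keep directory entry and inode caches in memory. Writing to the real file needs root.

```python
from pathlib import Path

# Standard location of the knob on Linux.
VFS_CACHE_PRESSURE = "/proc/sys/vm/vfs_cache_pressure"


def set_vfs_cache_pressure(value, path=VFS_CACHE_PRESSURE):
    """Write a new vfs_cache_pressure value and return the previous one.

    `path` is a parameter only so the helper can be tried on a scratch file.
    """
    knob = Path(path)
    previous = int(knob.read_text().strip())
    knob.write_text(f"{int(value)}\n")
    return previous
```

Calling set_vfs_cache_pressure(50) before a backup run and restoring the returned value afterwards would limit the change to the backup window.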

Thanks,

Mike