rsyncing many files and hard links: optimisation suggestions?
jamie at shareable.org
Fri Sep 29 10:42:47 GMT 2006
Judith Retief wrote:
> If the problem is the actual disk access, then I can't think of anything to
> do. If it is the sorting, then cutting down the batch sizes should help, at
> the expense of having copies of some files rather than hard links.
You can tell whether it's the disk accesses or the sorting/kernel by
looking at the CPU usage while it's running (e.g. using "top"). If
it's 100% CPU in user space, there may be scope for optimising rsync's
code. If it's 100% CPU in the kernel, there's not a lot you can do -
you might be able to use a different strategy for checking for changed
files than just scanning them all, though. If it's not close to 100%
total, it's disk I/O, in which case you might be able to preload the
disk's inode tables into cache to reduce head-seek time. A program
called "treescan" helps with that.
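
As a rough illustration of the check described above, here is a minimal Python sketch that does what you would otherwise do by eye in "top". It is only a sketch under some assumptions: it is Linux-only, it reads the aggregate "cpu" line of /proc/stat, and the function names and the 90% threshold are made up for this example. Because it uses the aggregate line, on a multi-core machine a single busy rsync process will not push the total near 100%, so treat the result as a hint rather than a verdict.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat (values are clock ticks).
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Turn a '/proc/stat' aggregate 'cpu' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Older kernels report fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def classify(before, after, threshold=0.9):
    """Guess the bottleneck from two samples of the 'cpu' line.

    The threshold is an arbitrary assumption standing in for
    "close to 100%" in the message above.
    """
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: b[name] - a[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        return "unknown"
    user = (delta["user"] + delta["nice"]) / total
    kernel = (delta["system"] + delta["irq"] + delta["softirq"]) / total
    if user >= threshold:
        return "user"       # CPU-bound in user space (e.g. rsync's own code)
    if kernel >= threshold:
        return "kernel"     # CPU-bound in the kernel
    if user + kernel >= threshold:
        return "mixed cpu"  # CPU-bound, split between user space and kernel
    return "disk I/O"       # CPU mostly idle or waiting: likely disk-bound


def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(5)  # sample while rsync is running
    print(classify(first, read_cpu_line()))
```

Run it in a second terminal while rsync is working. It only tells you which of the three cases you are probably in; it does not say which process is using the CPU, so "top" is still the better tool for that.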