proposal to speed rsync with lots of files
lanclos at ucolick.org
Thu Mar 5 23:43:47 GMT 2009
Peter Salameh wrote:
> One of the speed-limiting issues with rsync is having to send huge file
> lists when mirroring large file systems, even for incremental updates
> where only a small part of the file system might have changed.
Personally, I find that the sending of the file list, whether incremental
or otherwise, takes orders of magnitude less time than the construction of
the file list in the first place. The act of stat'ing millions of files
takes an enormous amount of time in comparison to just about anything else,
assuming that you are not on a low-bandwidth link.
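
One rough way to check this on your own tree is to time the scan by itself. The sketch below is only an illustration in Python, not anything taken from rsync; the function name scan_tree is made up for this example. It walks a directory and calls lstat on every entry, which is roughly the per-file work a file-list build has to do.

```python
import os
import time

def scan_tree(root):
    """Walk root, lstat every entry, return (entries_seen, elapsed_seconds)."""
    count = 0
    start = time.perf_counter()
    stack = [root]                      # directories still to be listed
    while stack:
        path = stack.pop()
        with os.scandir(path) as entries:
            for entry in entries:
                # The per-file cost in question: one lstat per entry.
                entry.stat(follow_symlinks=False)
                count += 1
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
    return count, time.perf_counter() - start
```

Calling scan_tree on the top of the tree to be mirrored gives a time you can compare against the total rsync run. Results depend heavily on whether the inode cache is already warm, so a second run will usually look much faster than the first.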
What would be ideal, I think, is for rsync to scan the filesystem while
a transfer is in progress. I am completely ignorant of rsync's innards in
this regard, but it seems like a basic producer/consumer algorithm, with
a configurable number of file transfer threads combined with a
configurable number of filesystem "spider" threads, would give the
best interleaving of disk latency and file transfer time.
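
To make the idea concrete, here is a minimal sketch of that producer/consumer arrangement in Python. It says nothing about how rsync is actually structured; the names spider, transfer_worker and mirror, and the transfer callback, are all hypothetical. Spider threads list directories and push file paths onto a bounded queue, while transfer threads pull paths off that queue as soon as they appear.

```python
import os
import queue
import threading

STOP = object()  # sentinel telling a worker thread to exit

def spider(dir_q, file_q):
    """Producer: list directories, queue subdirectories and files."""
    while True:
        path = dir_q.get()
        if path is STOP:
            dir_q.task_done()
            return
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        dir_q.put(entry.path)   # more work for the spiders
                    else:
                        file_q.put(entry.path)  # blocks if transfers fall behind
        except OSError:
            pass  # unreadable directory; a real tool would report this
        finally:
            dir_q.task_done()

def transfer_worker(file_q, transfer):
    """Consumer: hand each queued path to the caller-supplied transfer()."""
    while True:
        path = file_q.get()
        if path is STOP:
            return
        try:
            transfer(path)
        except Exception:
            pass  # keep draining the queue; a real tool would log the error

def mirror(root, transfer, spiders=2, movers=4, backlog=1000):
    """Scan root with `spiders` threads while `movers` threads transfer."""
    dir_q = queue.Queue()                  # unbounded, so spiders never deadlock
    file_q = queue.Queue(maxsize=backlog)  # bounded, so scanning cannot run away
    threads = [threading.Thread(target=spider, args=(dir_q, file_q))
               for _ in range(spiders)]
    threads += [threading.Thread(target=transfer_worker, args=(file_q, transfer))
                for _ in range(movers)]
    for t in threads:
        t.start()
    dir_q.put(root)
    dir_q.join()                           # returns once every directory is listed
    for _ in range(spiders):
        dir_q.put(STOP)
    for _ in range(movers):
        file_q.put(STOP)                   # queued behind all remaining files
    for t in threads:
        t.join()
```

Something like mirror("/srv/data", copy_one_file) would start transferring as soon as the first directory is listed, with copy_one_file standing in for whatever does the real work. The bounded file queue is the knob that keeps the spiders from racing arbitrarily far ahead of the transfers.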