[Bug 1463] New: poor performance with large block size
Chris Shoemaker
c.shoemaker at cox.net
Wed Jul 14 22:27:45 GMT 2004
On Thu, Jun 17, 2004 at 08:47:57PM -0700, Craig Barratt wrote:
>
> > But, the comment seems to have been right on. I have re-run the
> > experiment with block sizes as small as 3000 (yes it took a long
> > time to complete) all the way up to block sizes of 100000 with it
> > working in reasonable times. But, when the block size approaches
> > 170,000 or so, the performance degrades exponentially.
> >
> > I understand that I am testing at the very fringes of what we should
> > expect rsync to do. File sizes of 25Gig and 55Gig are beyond what was
> > originally envisioned (based on 64k hash buckets and a sliding window
> > of 256k).
>
> Here's a patch to try. It basically ensures that the window is
> at least 16 times the block size. Before I'd endorse this patch
> for CVS we need to make sure there aren't cases where map_ptr is
> called with a much bigger length, making the 16x a bit excessive.
>
> Perhaps I would be tempted to repeat the previous check that the
> window start plus the window size doesn't exceed the file length,
> although it must be at least offset + len - window_start as in
> the original code.
hehe, I'm catching up to you guys. I'm kinda late. That's what I get
for letting my email back up for >1 month. :)
>
> In any case, I'd be curious if this fixes the problem.
>
> Craig
>
> --- rsync-2.6.2/fileio.c Sun Jan 4 19:57:15 2004
> +++ ../rsync-2.6.2/fileio.c Thu Jun 17 19:33:26 2004
> @@ -193,8 +193,8 @@
> if (window_start + window_size > map->file_size) {
> window_size = map->file_size - window_start;
> }
> - if (offset + len > window_start + window_size) {
> - window_size = (offset+len) - window_start;
> + if (offset + 16 * len > window_start + window_size) {
> + window_size = (offset + 16 * len) - window_start;
> }
My initial reaction (having not actually read the code) is that it would
be desirable make the window_size highly composite, and then ensure that
the block size is an integer factor of the window_size. In other words,
avoid the memmoves altogether.
{/me thinks a bit}
Actually, I don't think this removes the pathological case at all. It just
reduces the frequency of the impact by a factor of 16. Consider when
len = window_size/16 - 1. We'll still end up with a memmove of size
len-16, which is expensive when len is large, despite window_size being
16 times larger.
Besides making window_size mod len = 0, another solution could be to use
a circular buffer.
This is a pretty nasty bug you guys found. I hope we can fix it soon.
-chris
>
> /* make sure we have allocated enough memory for the window */
> --
> To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
More information about the rsync
mailing list