bugfix: nondeterministic file choice from multiple sources

Dirk Pape pape at inf.fu-berlin.de
Wed Aug 25 06:44:15 GMT 2004


Some time ago I reported a bug in which rsync (all versions since 2.5) 
behaves nondeterministically when the same file appears in multiple 
sources. Sometimes the file from the first source was copied; other 
times it was copied from one of the other sources.

The attached mstest.tgz contains a test that reproduces the behaviour under 
Darwin and Solaris.

The bug did *not* show up in GNU/Linux versions of rsync; the reason is 
explained below:

rsync uses the "qsort" library function to sort the entire file list built 
from all files of all sources. qsort is not guaranteed to be stable, meaning 
that it does not promise to preserve the original order of items that 
compare as equal. Our test case triggers a situation where this instability 
shows up.
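
To illustrate the point, here is a minimal sketch in C. The names (struct 
entry, cmp_name, cmp_name_seq, sort_entries_deterministic) are hypothetical 
and are not taken from the rsync source. With a comparator that looks only 
at the name, entries with the same name from different sources compare as 
equal, and qsort may leave them in any order. One common workaround, which 
is not the approach taken in the attached patches, is to record each entry's 
original position and use it as a tie-breaker:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an entry in a file list; not rsync's struct. */
struct entry {
    const char *name;   /* path as it would appear on the receiving side */
    int source;         /* index of the source argument it came from */
    size_t seq;         /* position in the unsorted list, set before sorting */
};

/* Compares names only. Entries with equal names compare as equal, so
 * their relative order after qsort() is unspecified. */
int cmp_name(const void *a, const void *b)
{
    const struct entry *x = a, *y = b;
    return strcmp(x->name, y->name);
}

/* Compares names, then original position. No two entries compare as
 * equal, so the result no longer depends on how qsort() is implemented. */
int cmp_name_seq(const void *a, const void *b)
{
    const struct entry *x = a, *y = b;
    int c = strcmp(x->name, y->name);
    if (c != 0)
        return c;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Records the current order in seq, then sorts with the tie-breaker. */
void sort_entries_deterministic(struct entry *list, size_t n)
{
    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        list[i].seq = i;
    qsort(list, n, sizeof list[0], cmp_name_seq);
}
```

Sorting with cmp_name alone gives a list that is ordered by name on every 
platform, but which of several same-named entries comes first may differ 
between C libraries. With cmp_name_seq, the entry from the earliest source 
always comes first.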

Why does it not happen in GNU/Linux versions?

Reading the man pages showed us that glibc has an "optimization" in qsort: 
if memory is not low, it uses mergesort instead, which is a stable sort.


Since our use of rsync relies on deterministic behaviour, we patched rsync 
to always use mergesort when composing the file list. For systems without a 
mergesort library function (most OSes except FreeBSD/Darwin), we use the 
FreeBSD implementation of mergesort and put it in the rsync source tree. 
Patches (relative to 2.6.2) and source are attached.
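
For readers who want to see why a merge sort keeps equal items in their 
original order, here is a minimal sketch in C. It is not the FreeBSD 
implementation and not the code in the attached patches; stable_sort, 
sort_range, cmp_fn and struct entry are hypothetical names chosen for the 
illustration. The argument order follows qsort, and like the BSD mergesort 
it returns an int to report allocation failure:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an entry in a file list; not rsync's struct. */
struct entry {
    const char *name;   /* path as it would appear on the receiving side */
    int source;         /* index of the source argument it came from */
};

typedef int (*cmp_fn)(const void *, const void *);

/* Compares names only; same-named entries compare as equal. */
int cmp_name(const void *a, const void *b)
{
    const struct entry *x = a, *y = b;
    return strcmp(x->name, y->name);
}

/* Sorts elements [lo, hi) of base, using tmp as scratch space. */
static void sort_range(char *base, char *tmp, size_t lo, size_t hi,
                       size_t size, cmp_fn cmp)
{
    if (hi - lo < 2)
        return;
    size_t mid = lo + (hi - lo) / 2;
    sort_range(base, tmp, lo, mid, size, cmp);
    sort_range(base, tmp, mid, hi, size, cmp);

    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        /* "<=" takes the left element on ties; this keeps the sort stable. */
        if (cmp(base + i * size, base + j * size) <= 0) {
            memcpy(tmp + k * size, base + i * size, size);
            i++;
        } else {
            memcpy(tmp + k * size, base + j * size, size);
            j++;
        }
        k++;
    }
    /* Copy whichever half still has elements left. */
    if (i < mid)
        memcpy(tmp + k * size, base + i * size, (mid - i) * size);
    else if (j < hi)
        memcpy(tmp + k * size, base + j * size, (hi - j) * size);
    memcpy(base + lo * size, tmp + lo * size, (hi - lo) * size);
}

/* Stable sort with qsort-like arguments.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int stable_sort(void *base, size_t nmemb, size_t size, cmp_fn cmp)
{
    if (nmemb < 2 || size == 0)
        return 0;
    assert(base != NULL && cmp != NULL);
    char *tmp = malloc(nmemb * size);
    if (tmp == NULL)
        return -1;
    sort_range(base, tmp, 0, nmemb, size, cmp);
    free(tmp);
    return 0;
}
```

The scratch buffer is as large as the input, which is the extra memory cost 
of this approach compared with an in-place quicksort. In exchange, a call 
such as stable_sort(list, n, sizeof list[0], cmp_name) leaves same-named 
entries in the order in which their sources were given.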

I want to share this with the public and propose changing rsync to use 
mergesort instead of qsort. If this is not acceptable for the mainline 
because mergesort has worse memory complexity, I propose giving users a 
command-line switch to decide whether they want the feature (preferring 
reliability over performance in some scenarios) or not.

Hope this will be heard.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mstest.tgz
Type: application/x-gzip
Size: 818 bytes
Desc: not available
Url : http://lists.samba.org/archive/rsync/attachments/20040825/12c22f05/mstest.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patches.tgz
Type: application/x-gzip
Size: 4096 bytes
Desc: not available
Url : http://lists.samba.org/archive/rsync/attachments/20040825/12c22f05/patches.bin
