bugfix: indeterministic file choice from multiple sources
Dirk Pape
pape at inf.fu-berlin.de
Wed Aug 25 06:44:15 GMT 2004
Hello,
some time ago I reported a bug where rsync (all versions since 2.5)
behaves nondeterministically when the same file appears in multiple
sources. Sometimes the file from the first source was copied, other times
the file from one of the other sources.
The attached mstest.tgz contains a test that reproduces the behaviour on
Darwin and Solaris.
The bug did *not* show up in GNU/Linux builds of rsync, for reasons
explained below:
rsync uses the "qsort" library function to sort the complete file list
built from all files of all sources. qsort is not guaranteed to be stable,
meaning that it does not necessarily preserve the original order of items
that compare as equal. Our test case triggers a situation where this
instability shows up.
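
To make the tie problem concrete, here is a minimal sketch in C. The
struct, the comparator and the checker are hypothetical simplifications
for illustration, not rsync's actual data structures. The comparator looks
only at the name, so the same file coming from two sources compares as
equal, and the checker reports whether such ties are still in source
order after sorting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one entry of the file list. */
struct entry {
    const char *name;   /* path of the file */
    int source;         /* index of the source argument it came from */
};

/* Compares by name only, so duplicates from different sources tie. */
static int by_name(const void *a, const void *b)
{
    const struct entry *x = a, *y = b;
    return strcmp(x->name, y->name);
}

/* Returns 1 if entries with equal names are still in ascending source
 * order (the sort behaved stably for this input), 0 otherwise. */
static int ties_in_source_order(const struct entry *list, size_t n)
{
    assert(list != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        if (strcmp(list[i - 1].name, list[i].name) == 0 &&
            list[i - 1].source > list[i].source)
            return 0;
    }
    return 1;
}
```

After a call such as qsort(list, n, sizeof list[0], by_name), the C
standard leaves the relative order of equal elements unspecified, so
ties_in_source_order() may return either value depending on the libc.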
Why does it not happen in GNU/Linux versions?
Reading man pages showed us that glibc has an "optimization" in qsort: if
memory is not low it uses mergesort instead, which is a stable sort
algorithm.
Fix:
Since our use of rsync relies on deterministic behaviour, we patched rsync
to always use mergesort when sorting the file list. For systems without a
mergesort library function (most OSes except FreeBSD/Darwin) we use the
FreeBSD implementation of mergesort, which we added to the rsync source
tree. Patches (relative to 2.6.2) and source are attached.
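
For readers without the attachments, the idea of a stable sort with a
qsort-like interface can be sketched as follows. This is not the FreeBSD
mergesort and not the code from the patch; stable_sort, merge_sort_rec,
struct file_entry and by_name are hypothetical names used only for this
illustration. The key point is in the merge step: on a tie the element
from the left (earlier) half is taken first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int (*cmp_fn)(const void *, const void *);

/* Hypothetical, simplified file list entry used for the example. */
struct file_entry {
    const char *name;   /* path of the file */
    int source;         /* index of the source argument it came from */
};

/* Compares by name only; the same file from two sources ties. */
static int by_name(const void *a, const void *b)
{
    const struct file_entry *x = a, *y = b;
    return strcmp(x->name, y->name);
}

/* Sorts the n elements at base, using tmp (n * width bytes) as scratch. */
static void merge_sort_rec(char *base, char *tmp, size_t n, size_t width,
                           cmp_fn cmp)
{
    if (n < 2)
        return;
    size_t mid = n / 2;
    merge_sort_rec(base, tmp, mid, width, cmp);
    merge_sort_rec(base + mid * width, tmp, n - mid, width, cmp);

    char *left = base, *left_end = base + mid * width;
    char *right = left_end, *right_end = base + n * width;
    char *out = tmp;
    while (left < left_end && right < right_end) {
        /* Take from the right half only when it is strictly smaller.
         * On a tie the left (earlier) element goes first, which is
         * what makes the sort stable. */
        if (cmp(right, left) < 0) {
            memcpy(out, right, width);
            right += width;
        } else {
            memcpy(out, left, width);
            left += width;
        }
        out += width;
    }
    /* Copy whatever remains of either half, then copy back. */
    memcpy(out, left, (size_t)(left_end - left));
    out += left_end - left;
    memcpy(out, right, (size_t)(right_end - right));
    memcpy(base, tmp, n * width);
}

/* qsort-like entry point. Returns 0 on success, -1 if the scratch
 * buffer cannot be allocated. */
static int stable_sort(void *base, size_t n, size_t width, cmp_fn cmp)
{
    assert(cmp != NULL);
    if (n < 2 || width == 0)
        return 0;
    char *tmp = malloc(n * width);
    if (tmp == NULL)
        return -1;
    merge_sort_rec(base, tmp, n, width, cmp);
    free(tmp);
    return 0;
}
```

With a list built by appending the entries of source 0, then source 1, and
so on, stable_sort(list, n, sizeof list[0], by_name) keeps the entry from
the lowest-numbered source first among equal names. The scratch buffer is
the extra memory cost compared to an in-place sort.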
I want to share this with the public and propose changing rsync to use
mergesort instead of qsort. If this is not acceptable for the mainstream
because mergesort has worse memory complexity, I propose giving users a
command-line switch to decide whether they want the feature (preferring
reliability over performance in some scenarios) or not.
Hope this will be heard.
Thanks,
Dirk.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mstest.tgz
Type: application/x-gzip
Size: 818 bytes
Desc: not available
Url : http://lists.samba.org/archive/rsync/attachments/20040825/12c22f05/mstest.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patches.tgz
Type: application/x-gzip
Size: 4096 bytes
Desc: not available
Url : http://lists.samba.org/archive/rsync/attachments/20040825/12c22f05/patches.bin