File corruptions with rsync version 2.6.9 on 64-bit openSUSE 10.3

Herve Pages hpages at fhcrc.org
Fri May 16 00:58:23 GMT 2008


Hi,

I'm part of the team that runs the Bioconductor project

   http://bioconductor.org/

and we've used rsync successfully so far for a lot of different
things, in particular for moving the hundreds of packages that we build
and check every day through our build system pipe (which is made up of
several build nodes running different OSes; see our daily build report here:
http://bioconductor.org/checkResults/2.2/bioc-LATEST/).

At the very end of the build pipe, rsync is used again to sync our
public package repository (http://bioconductor.org/packages/2.2/bioc/)
with an internal repository that is behind a firewall.

Until recently, the internal repository was hosted on lamb1, a 64-bit
SUSE LINUX 10.1 system:

   biocadmin@lamb1:~> rsync --version
   rsync  version 2.6.6  protocol version 29
   Copyright (C) 1996-2005 by Andrew Tridgell and others
   <http://rsync.samba.org/>
   Capabilities: 64-bit files, socketpairs, hard links, ACLs, symlinks, batchfiles,
                 inplace, IPv6, 64-bit system inums, 64-bit internal inums, SLP

   rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
   are welcome to redistribute it under certain conditions.  See the GNU
   General Public Licence for details.

and AFAICT we've never observed any file corruption when rsync'ing
between lamb1 and bioconductor.org. rsync was run every day on lamb1
with the following options:

   rsync --delete -ave ssh SRC USER@HOST:DEST

Recently we've set up a new machine, wilson1, for hosting the internal
package repository. wilson1 is a 64-bit openSUSE 10.3 system:

   biocadmin@wilson1:~> rsync --version
   rsync  version 2.6.9  protocol version 29
   Copyright (C) 1996-2006 by Andrew Tridgell, Wayne Davison, and others.
   <http://rsync.samba.org/>
   Capabilities: 64-bit files, socketpairs, hard links, symlinks,
                 batchfiles, inplace, IPv6, ACLs, xattrs, SLP
                 64-bit system inums, 64-bit internal inums

   rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
   are welcome to redistribute it under certain conditions.  See the GNU
   General Public Licence for details.

Now when we use rsync on wilson1 to synchronize the internal and public
package repositories, we end up with corrupted files on the public
repository (the md5sums of the local and remote files differ, but their
sizes and timestamps are exactly the same). On wilson1, we use rsync
exactly the same way as on lamb1, i.e. we do:

   rsync --delete -ave ssh SRC USER@HOST:DEST

The destination machine (bioconductor.org) is a 64-bit SUSE LINUX
Enterprise Server 9 system. It has not changed during our switch
from lamb1 to wilson1 for the source machine.

The frequency of the corruptions seems to be low, but since
the total volume of packages that we produce is high (> 30G, and
a few packages are several hundred MB each), we end up with a few
corrupted packages on bioconductor.org (9 in total today; most of
them are among the biggest packages we produce, i.e. they are >
700MB).
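
For reference, one way to locate such files is to build an MD5 manifest
of the repository tree on each machine and compare the two. The following
is only a minimal sketch: the helper names (md5_of, build_manifest,
mismatched) are hypothetical, and it assumes each manifest is built
locally on its own machine and then brought to one place for comparison.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks so that
    packages of several hundred MB do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = md5_of(full)
    return manifest


def mismatched(local, remote):
    """Return the sorted relative paths that appear in both manifests
    but whose digests differ."""
    common = local.keys() & remote.keys()
    return sorted(p for p in common if local[p] != remote[p])
```

Files that exist on only one side are deliberately ignored here, since
the symptom described above is a content mismatch between files that are
present on both sides.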

Of course, if I rerun

   rsync --delete -ave ssh SRC USER@HOST:DEST

the corrupted files are not detected (their sizes and timestamps
match), so nothing is transferred.
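
This is consistent with rsync's default behaviour of skipping files whose
size and modification time already match on both sides; its -c
(--checksum) option makes it compare checksums instead. The sketch below
is not rsync's code, just an illustration of why a size-and-mtime test
cannot see a same-size corruption while a content hash can. The function
names and the test files are made up for the example.

```python
import hashlib
import os
import tempfile


def looks_unchanged(path_a, path_b):
    """Size-and-mtime comparison: True means 'looks unchanged'."""
    sa, sb = os.stat(path_a), os.stat(path_b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def same_content(path_a, path_b):
    """Compare the MD5 digests of two (small) files."""
    digests = []
    for path in (path_a, path_b):
        with open(path, "rb") as fh:
            digests.append(hashlib.md5(fh.read()).hexdigest())
    return digests[0] == digests[1]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "good.bin")
        bad = os.path.join(tmp, "bad.bin")
        payload = bytes(range(256)) * 16
        with open(good, "wb") as fh:
            fh.write(payload)
        # Flip one byte: same length, different content.
        damaged = bytearray(payload)
        damaged[100] ^= 0xFF
        with open(bad, "wb") as fh:
            fh.write(damaged)
        # Give the damaged copy the same timestamps as the good one.
        st = os.stat(good)
        os.utime(bad, ns=(st.st_atime_ns, st.st_mtime_ns))
        return looks_unchanged(good, bad), same_content(good, bad)


if __name__ == "__main__":
    print(demo())
```

The first value returned by demo() says whether the two files pass the
size-and-mtime test, the second whether their contents really match.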

But strangely enough, if I delete the corrupted file by hand and
rerun the above command, then this time the transfer seems to be
OK. But maybe that's just luck (given that the corruptions seem to
happen randomly). I've only done this manual deletion once, and for
1 file only, because I want to give our IT guys some time to look
into this problem.

Any idea what could be going wrong? What kind of extra information
would you need?

Thanks in advance for your help,

H.

