silent data corruption with rsync

Leen Besselink leen at
Tue Mar 11 12:05:32 MDT 2014

On Tue, Mar 11, 2014 at 11:52:51AM -0500, Karl O. Pinc wrote:
> On 03/11/2014 11:02:28 AM, Sig Pam wrote:
> > Hi everybody!
> > 
> > I'm currently working on a project which has to copy huge amounts of
> > data from one storage system to another. For reasons I can no longer
> > trace, there is a rumor that "rsync may silently corrupt data".
> > Personally, I don't believe that.
> > 
> > "They" explain it this way: "rsync does an in-stream data
> > deduplication. It creates a checksum for each data block to transfer,
> > and if a block with the same checksum has already been transferred
> > earlier, this old block will be re-used to save bandwidth. But, for
> > any reason, two different blocks can produce the same checksum even if
> > the source data is not the same, effectively corrupting the data stream".
> Well, yeah.  It works that way if you're transferring data over
> the network.
> The question is: "how often will this problem exhibit itself?"
> The answer is: "Usually, never within the lifetime of the Universe."
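
To put a rough number on "usually never": below is a back-of-the-envelope
sketch, not anything taken from the rsync source. It assumes the strong
checksum behaves like an ideal 128-bit hash (MD5 and MD4 both produce 128
bits) and uses the pessimistic model from the rumor, where any two blocks
in the whole transfer sharing a checksum counts as a failure. The 1 PiB
data size, the 4 KiB block size and the name birthday_bound are made up
for illustration.

```python
from fractions import Fraction


def birthday_bound(num_blocks: int, hash_bits: int) -> Fraction:
    """Union bound on P(some pair of distinct blocks shares a hash value).

    Assumes an ideal hash with uniformly distributed outputs; crafted
    collisions are outside this model.
    """
    pairs = num_blocks * (num_blocks - 1) // 2
    return min(Fraction(1), Fraction(pairs, 2 ** hash_bits))


PIB = 2 ** 50        # assumed total amount of data: 1 PiB
BLOCK_SIZE = 4096    # assumed block size in bytes
num_blocks = PIB // BLOCK_SIZE

p = birthday_bound(num_blocks, 128)
print(f"{num_blocks} blocks, collision probability <= {float(p):.3e}")
```

Under those assumptions the bound is on the order of 1e-16 for a petabyte.
It only says something about data that was not crafted to collide.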

If anyone wants a much longer description of how the rsync algorithm works:

There was a talk at the Ottawa Linux Symposium by Andrew Tridgell:

I found a recording here:

If you prefer reading, there is a transcript on SourceForge in LyX format:
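
For a quick impression of the block-matching idea, here is a toy sketch.
It is my own simplification, not rsync's code: zlib.adler32 stands in for
rsync's rolling checksum (which is a different function and is updated
incrementally rather than recomputed per offset), MD5 is the strong hash,
and the 4-byte block size and the names signatures, delta and patch are
chosen only for illustration. Note that blocks are matched against the
copy the receiver already has, not against data sent earlier in the stream.

```python
import hashlib
import zlib

BLOCK = 4  # unrealistically small block size, for illustration only


def signatures(old: bytes) -> dict:
    """Receiver: map weak checksum -> list of (strong digest, block index)."""
    table = {}
    for i in range(0, len(old), BLOCK):
        block = old[i:i + BLOCK]
        entry = (hashlib.md5(block).digest(), i // BLOCK)
        table.setdefault(zlib.adler32(block), []).append(entry)
    return table


def delta(new: bytes, table: dict) -> list:
    """Sender: emit ("copy", block_index) or ("data", literal_bytes) ops."""
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + BLOCK]
        match = None
        # The cheap weak checksum filters candidates; the strong hash confirms.
        for strong, index in table.get(zlib.adler32(window), []):
            if strong == hashlib.md5(window).digest():
                match = index
                break
        if match is None:
            literal.append(new[pos])  # no match here: send this byte as-is
            pos += 1
        else:
            if literal:
                ops.append(("data", bytes(literal)))
                literal.clear()
            # If both checksums collided for different content, this copy
            # would be wrong. That is the scenario the rumor is about.
            ops.append(("copy", match))
            pos += len(window)
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old: bytes, ops: list) -> bytes:
    """Receiver: rebuild the new file from its old copy plus the ops."""
    out = bytearray()
    for kind, arg in ops:
        out += old[arg * BLOCK:(arg + 1) * BLOCK] if kind == "copy" else arg
    return bytes(out)


old = b"the quick brown fox"
new = b"the quick red fox jumps"
ops = delta(new, signatures(old))
assert patch(old, ops) == new
```

As far as I know, rsync additionally checks a whole-file checksum once a
file has been reconstructed, so a wrong block match should be noticed
rather than silently kept.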

> You're a lot more likely to have data corruption due to a
> cosmic ray hitting your box.
> There are some cases where the answer is: "Maybe more often."  The only
> time I can think of that you'd want to worry about is if you're
> researching MD5 checksum collisions and have a lot of data on disk that
> has collisions in the checksumming.  In other words, if you're actively
> trying to cause problems it might be an issue.
> (The older rsyncs used MD4.)
> If you're actually _copying_ data rather than backing it up then
> avoid the issue by not using rsync.  Otherwise the tradeoff
> is worth the risk.
> Karl <kop at>
> Free Software:  "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein
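
If you want extra assurance after a one-off copy, whichever tool did the
copying, you can compare digests of source and destination yourself. A
minimal sketch, assuming both files are readable from the machine running
it; the function names and the choice of SHA-256 are mine, not something
from this thread:

```python
import hashlib


def file_digest(path: str, algo: str = "sha256", chunk: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files need little memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            h.update(buf)
    return h.hexdigest()


def same_content(src: str, dst: str) -> bool:
    """True if both files produce the same digest."""
    return file_digest(src) == file_digest(dst)
```

This reads both files in full, so it costs about as much I/O as the copy
itself.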
