silent data corruption with rsync

Kevin Korb kmk at sanitarium.net
Tue Mar 11 11:13:41 MDT 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I have actually witnessed rsync silently corrupting data.  But it
wasn't rsync's fault.  I had a bad RAM DIMM that was corrupting the
part of RAM being used as the disk cache.  Now I always get ECC RAM.

On 03/11/2014 12:52 PM, Karl O. Pinc wrote:
> On 03/11/2014 11:02:28 AM, Sig Pam wrote:
>> Hi everybody!
>> 
>> I'm currently working on a project which has to copy huge amounts
>> of data from one storage system to another. For a reason I cannot
>> validate any longer, there is a rumor that "rsync may silently
>> corrupt data". Personally, I don't believe that.
>> 
>> "They" explain it this way: "rsync does in-stream data
>> deduplication. It creates a checksum for each data block to
>> transfer, and if a block with the same checksum has already been
>> transferred earlier, this old block will be re-used to save
>> bandwidth. But, for whatever reason, two different blocks can produce
>> the same checksum even if the source data is not the same,
>> effectively corrupting the data stream".
> 
> Well, yeah.  It works that way if you're transferring data over the
> network.
> 
> The question is: "how often will this problem exhibit itself?" The
> answer is: "Usually, never within the lifetime of the Universe."
> 
> You're a lot more likely to have data corruption due to a cosmic
> ray hitting your box.
> 
> There are some cases where the answer is: "Maybe more often."  The
> only case I can think of where you'd want to worry is if
> you're researching MD5 checksum collisions and have a lot of data
> on disk that has collisions in the checksumming.  In other words,
> if you're actively trying to cause problems it might be an issue.
> 
> (The older rsyncs used MD4.)
> 
> If you're actually _copying_ data rather than backing it up then 
> avoid the issue by not using rsync.  Otherwise the tradeoff is
> worth the risk.
> 
> Karl <kop at meme.com> Free Software:  "You don't pay back, you pay
> forward." -- Robert A. Heinlein
> 
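For illustration only, here is a minimal sketch of the block-matching idea
described in the quoted text. It is not rsync's actual code. It assumes the
blocks are matched against an old copy of the file already on the receiving
side, and the names (BLOCK_SIZE, weak_checksum, make_delta, and so on) are
hypothetical. A cheap weak checksum finds candidate blocks, and a strong
checksum (MD5 here) confirms them before a block is re-used.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical and tiny, for illustration only


def weak_checksum(block: bytes) -> int:
    # Simple Adler-style sum; a stand-in for a rolling checksum.
    a = sum(block) % 65536
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % 65536
    return (b << 16) | a


def strong_checksum(block: bytes) -> bytes:
    return hashlib.md5(block).digest()


def build_signatures(old: bytes) -> dict:
    # Index every full block of the old file by its weak checksum.
    sigs = {}
    for offset in range(0, len(old) - BLOCK_SIZE + 1, BLOCK_SIZE):
        block = old[offset:offset + BLOCK_SIZE]
        entry = (strong_checksum(block), offset)
        sigs.setdefault(weak_checksum(block), []).append(entry)
    return sigs


def make_delta(new: bytes, sigs: dict) -> list:
    # Emit ("copy", offset) for blocks found in the old file and
    # ("data", bytes) for everything else.
    delta, literal, i = [], bytearray(), 0
    while i < len(new):
        block = new[i:i + BLOCK_SIZE]
        match = None
        if len(block) == BLOCK_SIZE:
            for strong, offset in sigs.get(weak_checksum(block), []):
                # A strong-checksum collision here would re-use the
                # wrong block; this is the scenario the rumor describes.
                if strong == strong_checksum(block):
                    match = offset
                    break
        if match is None:
            literal.append(new[i])
            i += 1
            continue
        if literal:
            delta.append(("data", bytes(literal)))
            literal = bytearray()
        delta.append(("copy", match))
        i += BLOCK_SIZE
    if literal:
        delta.append(("data", bytes(literal)))
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    # Rebuild the new file from old blocks plus literal data.
    out = bytearray()
    for kind, value in delta:
        out += old[value:value + BLOCK_SIZE] if kind == "copy" else value
    return bytes(out)
```

With old = b"abcdefghijkl" and new = b"abcdXXefghijkl", only the two inserted
bytes travel as literal data and the three unchanged blocks are copied from
the old file. Comparing a checksum of the rebuilt file against one of the
source afterwards would catch a wrongly re-used block.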

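As a rough back-of-the-envelope check on the "how often" question in the
quoted text, the sketch below computes the usual birthday-style upper bound
on any two blocks sharing a checksum. It assumes the checksum behaves like a
uniformly random 128-bit value (the size of an MD4 or MD5 digest) and that
nobody is deliberately constructing collisions. The function name and the
block count are hypothetical.

```python
def collision_probability(n_blocks: int, bits: int = 128) -> float:
    # Upper bound on the chance that any pair among n_blocks distinct
    # blocks shares a checksum: (number of pairs) / 2**bits.
    pairs = n_blocks * (n_blocks - 1) // 2
    return pairs / 2**bits


if __name__ == "__main__":
    # 2**40 blocks is an arbitrary example of a very large data set.
    print(collision_probability(2**40))
```

The bound grows with the square of the block count, so it stays tiny for any
realistic amount of data. It says nothing about deliberately crafted
collisions, which is the exception noted in the quoted text.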
- -- 
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
	Kevin Korb			Phone:    (407) 252-6853
	Systems Administrator		Internet:
	FutureQuest, Inc.		Kevin at FutureQuest.net  (work)
	Orlando, Florida		kmk at sanitarium.net (personal)
	Web page:			http://www.sanitarium.net/
	PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlMfREUACgkQVKC1jlbQAQeiCACeJnNn9yozItEejG6dWYbp18nS
wqQAnRb+wsJFffyPfOVxIGynlpJVYb5t
=G4Er
-----END PGP SIGNATURE-----