silent data corruption with rsync
Kevin Korb
kmk at sanitarium.net
Tue Mar 11 11:13:41 MDT 2014
I have actually witnessed rsync silently corrupting data. But it
wasn't rsync's fault. I had a bad RAM DIMM that was corrupting the
part of RAM being used as the disk cache. Now I always get ECC RAM.
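
Whatever the cause, one way to notice silent corruption after a copy is to hash both trees independently and compare. Below is a minimal sketch, assuming both trees are mounted locally; the names sha256_of and find_mismatches are made up for illustration and are not part of rsync. Note that it reads through the same OS and RAM as the copy did, so it will not necessarily catch a hardware fault like the one I hit.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_root: Path, dest_root: Path) -> List[str]:
    """Return relative paths of source files that are missing or differ at the destination."""
    mismatches = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue  # skip directories and other non-regular entries
        rel = src.relative_to(source_root)
        dst = dest_root / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            mismatches.append(rel.as_posix())
    return mismatches
```

Calling find_mismatches(Path("/data/src"), Path("/data/dst")) (hypothetical paths) returns an empty list when every source file has an identical counterpart at the destination.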
On 03/11/2014 12:52 PM, Karl O. Pinc wrote:
> On 03/11/2014 11:02:28 AM, Sig Pam wrote:
>> Hi everybody!
>>
>> I'm currently working on a project which has to copy huge amounts
>> of data from one storage system to another. For a reason I can no
>> longer trace, there is a rumor that "rsync may silently
>> corrupt data". Personally, I don't believe that.
>>
>> "They" explain it this way: "rsync does an in-stream data
>> deduplication. It creates a checksum for each data block to
>> transfer, and if a block with the same checksum has already been
>> transferred earlier, this old block will be re-used to save
>> bandwidth. But, for whatever reason, two different blocks can
>> produce the same checksum even if the source data is not the same,
>> effectively corrupting the data stream".
>
> Well, yeah. It works that way if you're transferring data over the
> network.
>
> The question is: "how often will this problem exhibit itself?" The
> answer is: "Usually, never within the lifetime of the Universe."
>
> You're a lot more likely to have data corruption due to a cosmic
> ray hitting your box.
>
> There are some cases where the answer is: "Maybe more often." The
> only time I can think of that you'd want to worry about is if
> you're researching MD5 checksum collisions and have a lot of data
> on disk that has collisions in the checksumming. In other words,
> if you're actively trying to cause problems it might be an issue.
>
> (The older rsyncs used MD4.)
>
> If you're actually _copying_ data rather than backing it up then
> avoid the issue by not using rsync. Otherwise the tradeoff is
> worth the risk.
>
> Karl <kop at meme.com> Free Software: "You don't pay back, you pay
> forward." -- Robert A. Heinlein
>
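
To put a rough number on the "never within the lifetime of the Universe" estimate quoted above, here is a back-of-the-envelope sketch using the birthday bound. It assumes a 128-bit checksum (the digest size of both MD4 and MD5) that behaves like a uniform random function on non-adversarial data, and a hypothetical workload of 1 PiB split into 1 KiB blocks. It is not a model of how rsync actually combines its checksums, only an order-of-magnitude illustration.

```python
from fractions import Fraction


def collision_probability_bound(num_blocks: int, hash_bits: int) -> Fraction:
    """Birthday upper bound on the chance that any two of num_blocks
    uniformly random hash_bits-bit values collide: C(n, 2) / 2**hash_bits."""
    pairs = num_blocks * (num_blocks - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)


# Hypothetical workload: 1 PiB of data in 1 KiB blocks gives 2**40 blocks.
blocks = 2 ** 50 // 2 ** 10
bound = collision_probability_bound(blocks, 128)
print(float(bound))  # slightly below 2**-49, i.e. on the order of 1e-15
```

The bound only covers accidental collisions. Deliberately constructed MD5 or MD4 collisions, the case raised in the quoted message, are outside this estimate.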
- --
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
Kevin Korb Phone: (407) 252-6853
Systems Administrator Internet:
FutureQuest, Inc. Kevin at FutureQuest.net (work)
Orlando, Florida kmk at sanitarium.net (personal)
Web page: http://www.sanitarium.net/
PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~