silent data corruption with rsync

devzero at devzero at
Thu Mar 13 13:40:49 MDT 2014

What do "They" recommend instead?

If it's all about copying and network bandwidth is not an issue, you can use scp or whatever dumb tool just shuffles the bits around "as is". rsync is meant for keeping data in sync while saving bandwidth along the way. You CAN use it for plain copying, but that is somewhat like taking a sledgehammer to crack a nut.

Anyway, if "They" care about their data, "They" use checksumming when storing their data on disk, don't "They"? ;)

The network is not the only place where data corruption can happen... and silent bitrot on disks _does_ happen, especially when your hard disks go nuts, your RAID arrays break, or your storage controller's firmware gets hiccups. It does not happen often, but it happens, and mostly you won't know when and where. In my IT job I had one case where some SAN storage lost some cache contents, and the only places where we really knew that data loss/corruption had happened were the Oracle and Exchange databases. For all the other data, we don't know whether it is in 100% perfect condition.
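
If you want to check a finished copy independently of whatever tool moved the data, hashing both sides and comparing is one way to do it. Here is a minimal sketch in Python; the function names sha256_of and compare_trees are made up for illustration, and it assumes both trees are readable from the same machine. Note that it only tells you the two copies agree, not that the source was still good to begin with.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(src_root, dst_root):
    """Return relative paths that are missing or differ in content under dst_root."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    mismatches = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        # A missing destination file counts as a mismatch too.
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            mismatches.append(str(rel))
    return mismatches
```

Calling compare_trees("/path/to/source", "/path/to/copy") (both paths are placeholders) returns an empty list when every source file has an identical counterpart.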


>List:       rsync
>Subject:    silent data corruption with rsync
>From:       Sig_Pam <spam () itserv ! de>
>Date:       2014-03-11 16:02:28
>Message-ID: zarafa.531f3394.439c.5f8c77014439296d () exchange64 ! corp ! itserv ! de
>Hi everybody!
>I'm currently working on a project which has to copy huge amounts of data from one
>storage to another. For a reason I can no longer verify, there is a rumor that
>"rsync may silently corrupt data". Personally, I don't believe that.
>"They" explain it this way: "rsync does an in-stream data deduplication. It creates a
>checksum for each data block to transfer, and if a block with the same checksum has
>already been transferred earlier, this old block will be re-used to save bandwidth.
>But, for whatever reason, two different blocks can produce the same checksum even if the
>source data is not the same, effectively corrupting the data stream".
>Did you ever hear something like this? Has this been a bug in any early version of
>rsync? If so, when was it fixed?
>Thank you,
>  sig
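
On the collision claim quoted above: as far as I know, rsync does not trust a single weak checksum. It pairs a cheap rolling checksum with a stronger hash per block, and after the transfer it verifies a whole-file checksum. The toy sketch below only illustrates that principle. weak_sum is deliberately simplistic (a plain byte sum, NOT rsync's real rolling checksum), SHA-256 is just a stand-in for the strong hash, and all names are made up.

```python
import hashlib


def weak_sum(block: bytes) -> int:
    # Toy weak checksum: byte sum modulo 2**16. Collisions are easy to produce.
    return sum(block) % 65536


def strong_sum(block: bytes) -> str:
    # Stand-in for the strong per-block hash (an assumption for this sketch).
    return hashlib.sha256(block).hexdigest()


def blocks_match(a: bytes, b: bytes) -> bool:
    # A weak match is only a hint; the strong hash has to agree as well.
    return weak_sum(a) == weak_sum(b) and strong_sum(a) == strong_sum(b)


block_a = b"ab"
block_b = b"ba"

# Same weak checksum, different content: the strong hash rejects the match.
weak_collides = weak_sum(block_a) == weak_sum(block_b)
treated_as_same = blocks_match(block_a, block_b)
```

Here weak_collides is True while treated_as_same is False, so a weak collision alone would not cause the old block to be re-used in this scheme.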

Information pursuant to §35a GmbH-Gesetz (German Limited Liability Companies Act):
ITServ GmbH
Registered office: 55294 Bodenheim/Rhein
Registered under number HRB 41668 at the Amtsgericht Mainz (local court)
Authorized managing director: Peter Bauer, 55294 Bodenheim
VAT ID: DE182270475
