rsync: "-c" option clarification

Kevin Korb kmk at sanitarium.net
Fri May 19 22:14:58 UTC 2017


inline...

On 05/19/2017 06:07 PM, steven banville wrote:
> 
> Hi
> 
> This is a very delayed response, but thanks very much for your answer; it is appreciated.
> 
> It seems that if you run rsync a second or subsequent time with "-c" (--checksum), say on data that has not changed, it would have to generate checksums from the files on disk at both ends, even if the sizes and timestamps are the same. Is this not the case?  If it is, then it would seem we would be catching a disk write error.

Yes, it checks every file even if the timestamps match.  It even
checksums the files that only exist on one end!

This does not necessarily detect disk errors unless you flush your disk
cache between runs, because otherwise the re-read can be served from RAM
rather than from the disk.  It also wouldn't report the corruption unless
you use --itemize-changes and interpret that output yourself.  Even then
there can be false positives (gzip and similar tools restore the original
mtime when you compress or decompress, even though the compressed bytes
can differ).
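
For what it's worth, picking those entries out of the --itemize-changes
output can be scripted.  The following is only a rough sketch in Python,
not part of rsync: the function name and the sample lines are made up for
illustration, and it assumes the usual layout where the first field is
the change string (YXcstpoguax) followed by a space and the file name.
It flags regular files that show a c but no s and no t (or T).

```python
def changed_content_only(itemize_lines):
    """Return paths whose checksum differs while size and mtime match.

    Hypothetical helper: expects lines as printed by rsync with
    --itemize-changes, e.g. ">fc........ some/file".
    """
    suspects = []
    for line in itemize_lines:
        # The change string is everything before the first space.
        flags, _, path = line.rstrip("\n").partition(" ")
        if len(flags) < 5 or not path:
            continue  # not an itemized entry we can interpret
        if flags[1] != "f":
            continue  # only regular files (skips dirs, "*deleting", ...)
        checksum_differs = flags[2] == "c"
        size_differs = flags[3] == "s"
        time_differs = flags[4] in "tT"
        if checksum_differs and not size_differs and not time_differs:
            suspects.append(path)
    return suspects


if __name__ == "__main__":
    sample = [
        ">fc........ data/run1.raw",   # content differs, size and mtime match
        ">fcst...... data/run2.raw",   # ordinary changed file
        ">f+++++++++ data/new.raw",    # new file on the receiving side
    ]
    print(changed_content_only(sample))
```

The input would typically come from something like
rsync -a -c -i -n src/ dest/, where -n keeps rsync from changing anything
while you look at the list.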

> In the past I had experienced issues with hardware writes failing (network or disk), and although rare, for some critical data it is something of concern; that is what prompted this question.  I don't need this high level of fidelity for most data, just a small subset.
> 
> The use case is:
>   * Create raw data
>   * Move to backup location very reliably.
>   * Delete original data set.

The only time I have seen this kind of problem was when there was bad
RAM being used as disk cache.  The solution there is ECC RAM.
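
The create / move / delete use case above could be sketched along these
lines.  This is an assumption-laden Python illustration rather than
anything rsync does internally: move_verified and file_digest are
hypothetical helper names, SHA-256 is an arbitrary choice of checksum,
and it works on local paths only.  Note that the read-back of the copy
can still be served from the OS cache, so a matching digest does not
prove the bytes on the disk are good; it only keeps the delete step from
running when the copy visibly differs.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def move_verified(src, dst):
    """Copy src to dst, compare digests, and delete src only on a match."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()
        os.fsync(fout.fileno())  # ask the OS to push the data to the device
    shutil.copystat(src, dst)    # keep mtime and permissions on the copy

    src_digest = file_digest(src)
    # Caveat: this read may come from the page cache, not the disk.
    dst_digest = file_digest(dst)
    if src_digest != dst_digest:
        raise OSError("digest mismatch, keeping original: %s" % src)

    os.remove(src)
    return dst_digest
```

If the digests differ, the original is left in place and the caller can
retry or investigate.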

> Thanks again.
> 
> Steven Banville
> Cirina
> 201 Gateway Boulevard, Floor 1
> South San Francisco, CA 94080-7019
> http://cirina.com/
> 
> ________________________________________
> From: rsync [rsync-bounces at lists.samba.org] on behalf of Kevin Korb via rsync [rsync at lists.samba.org]
> Sent: Thursday, March 23, 2017 1:10 PM
> To: rsync at lists.samba.org
> Subject: Re: rsync: "-c" option clarification
> 
> Before anyone yells at me, yes, you can use rsync's --checksum to detect
> (and fix) files that are incorrect despite having correct timestamps and
> sizes.  This would mean that a previous rsync run had been corrupted, not
> the current one.  But it is important to note that this would only be
> reported to you if you also use --itemize-changes and know what to look
> for (a file with a c but not an s or a t).
> 
> It is also worth noting that single file compression tools (like gzip)
> automatically set the original mtime when compressing or decompressing.
> If you decompress then recompress such a file you can cause a case of a
> file with matching mtime+size but not matching checksum due to gzip's
> metadata even though the uncompressed result is identical.  I would not
> consider this to be a case worth updating the remote copy but I am sure
> someone will disagree.
> 
> On 03/23/2017 03:49 PM, Kevin Korb via rsync wrote:
>> The -c option causes rsync to checksum EVERY file on both ends BEFORE
>> rsync does anything else.  It checksums files that are on only 1 end.
>> It checksums files that are different sizes.  It will not catch a
>> hardware problem preventing rsync from writing a file correctly.
>>
>> On 03/23/2017 03:12 PM, steven banville via rsync wrote:
>>>
>>> Hi
>>>
>>>
>>> I am using "rsync" to send files from a source machine to a remote
>>> machine as one typically does.  I would like to clarify that the "-c"
>>> option will cause the checksum on the receiving end to be created by
>>> reading the already written file and NOT the data stream on the
>>> receiving end.  This would help in catching disk I/O errors if the
>>> checksum is done on the file on disk.
>>>
>>> I understand if the size and (or date?) don't match, the checksum is not
>>> needed on the receiving end.
>>>
>>> I may be missing something but it wasn't entirely clear to me that the
>>> checksum is done based on the file on disk.
>>>
>>> Thanks,
>>> -Steve
>>>
>>>
>>>
>>
>>
>>
> 
> --
> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
>         Kevin Korb                      Phone:    (407) 252-6853
>         Systems Administrator           Internet:
>         FutureQuest, Inc.               Kevin at FutureQuest.net  (work)
>         Orlando, Florida                kmk at sanitarium.net (personal)
>         Web page:                       http://www.sanitarium.net/
>         PGP public key available on web site.
> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
> 

