rsync based on checksum only

josephj at main.nc.us
Fri Jul 6 18:10:46 MDT 2012


Let us know if that ever gets merged into the official releases.  I could
use that feature.  I download a lot of media files and when I normalize
their names, rsync treats them as new files.

At this point, I don't want to build my own rsync.  I haven't learned git
yet and have to be sure that I don't do anything to compromise rsync for
the rest of my system.

Joe

> hello,
>
> a patch could help you in the case of a move or rename of a file:
>
> Patch: --detect-renamed
> Looks for files that
> (1) match in size & modify-time (plus the basename, if possible), or
> (2) match in size & checksum (when --checksum was also specified),
> and uses each match as an alternate basis file to speed up the transfer.
>
> http://gitweb.samba.org/?p=rsync-patches.git;a=blob;f=detect-renamed.diff;h=c3e6e846eab437e56e25e2c334e292996ee84345;hb=master
>
> Patch options: --detect-renamed-lax and --detect-moved
> http://gitweb.samba.org/?p=rsync-patches.git;a=blob;f=detect-renamed-lax.diff;h=1ff593c8f97a97e8970d43ff5a62dfad5abddd75;hb=master
>
>
> Benjamin ANDRE
>
>
>
> 2012/7/5 Matthias Schniedermeyer <ms at citd.de>
>
>> On 05.07.2012 09:26, Yan Seiner wrote:
>> > Is it possible to tell rsync *not* to use file names, date stamps, etc.,
>> > and only use the checksum for deciding if a file is the same?
>> >
>> > The remote machine "normalizes" a set of file names to remove all
>> > punctuation marks and forces all file names to lower case.  The files
>> > themselves are unchanged.
>> >
>> > --checksum looks promising but it does not say anything about file
>> > names:
>> >
>> > -c, --checksum              Skip based on checksum, not mod-time & size
>> >
>> > Can this be done?
>>
>> A workaround comes to mind.
>>
>> MD5/SHA1 (or whatever) the files and hardlink each one, named after its
>> checksum, into a (hidden) directory.
>>
>> Then, when you rsync with "-H", those hardlinks (all files must be below
>> the start directory) ensure that rsync only has to delete/create
>> hardlinks and does not copy a file again after it has copied it once.
>>
>> I use a similar method for a bunch of big files I have: I hardlink them
>> into a hidden directory, and when I move the files around, rsync only
>> deletes/creates hardlinks. When I move the files onto other storage, I
>> only need to run "find .z -type f -links 1" to find the files that have
>> only 1 link left, which means all other hardlinks are gone and I can
>> remove that file ("find .z -type f -links 1 -delete").
>>
>> See you
>>
>> --
>> Real Programmers consider "what you see is what you get" to be just as
>> bad a concept in Text Editors as it is in women. No, the Real Programmer
>> wants a "you asked for it, you got it" text editor -- complicated,
>> cryptic, powerful, unforgiving, dangerous.
>>
>> --
>> Please use reply-all for most replies to avoid omitting the mailing
>> list.
>> To unsubscribe or change options:
>> https://lists.samba.org/mailman/listinfo/rsync
>> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
>>
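
The hardlink workaround quoted above could be sketched roughly as follows.
This is only an illustration, not Matthias's actual script: the function
name link_by_checksum is made up, ".z" is taken from his "find" examples,
and it assumes GNU coreutils' sha1sum, a filesystem that supports
hardlinks, and file names without embedded newlines.

```shell
#!/bin/sh
# Hypothetical helper: hardlink every regular file below a start directory
# into a hidden ".z" directory, using the file's SHA-1 checksum as the name.
link_by_checksum() {
    root=${1%/}                 # drop one trailing slash so -path matches
    store="$root/.z"
    mkdir -p "$store" || return 1
    # Prune the store itself so its own links are not hashed again.
    find "$root" -path "$store" -prune -o -type f -print |
    while IFS= read -r file; do
        sum=$(sha1sum < "$file" | cut -d' ' -f1)
        [ -n "$sum" ] || continue
        # Only the first file seen with a given content gets linked.
        [ -e "$store/$sum" ] || ln "$file" "$store/$sum"
    done
}
```

After running link_by_checksum on the start directory, an
"rsync -aH" of that directory (including .z) should, as described in the
quoted message, only need to create or delete hardlinks when a file is
renamed.  Entries whose content no longer exists anywhere else show up with
"find .z -type f -links 1", as in the original message.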
