rsync based on checksum only

Matthias Schniedermeyer ms at citd.de
Thu Jul 5 13:00:27 MDT 2012


On 05.07.2012 09:26, Yan Seiner wrote:
> Is it possible to tell rsync *not* to use file names, date stamps, etc., and
> only use the checksum for deciding if a file is the same?
> 
> The remote machine "normalizes" a set of file names to remove all
> punctuation marks and forces all file names to lower case.  The files
> themselves are unchanged.
> 
> --checksum looks promising but it does not say anything about file names:
> 
> -c, --checksum              Skip based on checksum, not mod-time & size
> 
> Can this be done?

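A workaround comes to mind.

Hash each file (MD5, SHA-1, whatever) and hardlink it, with the hash as 
its file name, into a (hidden) directory.

Then, when you rsync with "-H", those hardlinks make sure that rsync only 
has to delete/create hardlinks instead of copying a file again after it 
has copied it once. The hash-named link stays the same no matter what the 
file is called elsewhere, so a renamed file is just another link to data 
the receiver already has. (All files, including the hidden directory, 
must be below the start directory of the transfer.)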
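The hashing and hardlinking step could be sketched roughly like this. It
is only a sketch under assumptions: it uses sha1sum from GNU coreutils,
borrows ".z" as the name of the hidden directory, and builds a small
throwaway tree with made-up file names under a temporary directory
instead of touching real data.

```shell
#!/bin/sh
# Sketch: hardlink every regular file, named by its SHA-1 hash, into a
# hidden directory. Assumes GNU coreutils (sha1sum); the tree under
# $root and the file names in it are placeholders for illustration.
set -eu

root=$(mktemp -d)                       # stand-in for the rsync start directory
mkdir -p "$root/data" "$root/.z"
printf 'hello\n' > "$root/data/Some File.txt"
printf 'world\n' > "$root/data/other.bin"

# For each file under data/, create .z/<sha1> as a hardlink to it.
find "$root/data" -type f -exec sh -c '
    store=$1; shift
    for f in "$@"; do
        h=$(sha1sum < "$f" | cut -d" " -f1)
        # Skip if an entry with this hash already exists (same content).
        [ -e "$store/$h" ] || ln "$f" "$store/$h"
    done
' sh "$root/.z" {} +

ls -l "$root/.z"                        # each entry should show a link count of 2
```

After that, something like 'rsync -aH "$root/" host:/some/path/' (the
destination is hypothetical) would transfer data/ and .z/ together, so
rsync can see that they share hardlinks.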

I use a similar method for a bunch of big files I have: I hardlink them 
into a hidden directory, and when I move the files around, rsync only 
deletes/creates hardlinks. When I move files onto other storage, I only 
need to run "find .z -type f -links 1" to find the entries in the hidden 
directory that have just 1 link left. That means all other hardlinks to 
that file are gone and I can remove it 
("find .z -type f -links 1 -delete").
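To illustrate the cleanup step, here is a small self-contained sketch. 
The temporary directory and the file names are made up for the 
demonstration, and the entries in .z use placeholder names rather than 
real hashes. It shows the link count of an entry dropping to 1 once the 
"real" name is gone, which is what the find commands rely on.

```shell
#!/bin/sh
# Sketch: find (and then delete) entries in the hidden directory whose
# other hardlinks are gone. All paths and names here are placeholders.
set -eu

root=$(mktemp -d)
mkdir -p "$root/data" "$root/.z"
printf 'big file\n' > "$root/data/a.img"
printf 'another\n'  > "$root/data/b.img"
ln "$root/data/a.img" "$root/.z/a-link"
ln "$root/data/b.img" "$root/.z/b-link"

# Simulate a.img having been moved to other storage (copy + delete).
rm "$root/data/a.img"

cd "$root"
# Dry run first: list entries that have only one link left.
orphans=$(find .z -type f -links 1)
echo "$orphans"

# Then actually remove them.
find .z -type f -links 1 -delete
```

Running the plain "find" first and only then adding "-delete" is a cheap 
safety check before anything is removed.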





See you

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


