--fuzzy question

Ryan Malayter malayter at gmail.com
Wed May 20 04:12:38 GMT 2009


On Thu, May 14, 2009 at 4:10 AM, Julian Pace Ross <linux at prisma.com.mt> wrote:
> Hi,
> I have a file that changes slightly in size every day and has the timestamp
> appended to it. For example, on the 14th of May:
> MybackedUpFileBlabla_200905140219.bak
> This is transferred by rsync to another server.
> The next day that file is deleted and substituted by a new file on the
> sender.. the new file would be named for example (15th May):
> MybackedUpFileBlabla_200905150221.bak
> The new file will be generally slightly larger in size, but the containing
> directory is exactly the same.
> I was hoping to use --fuzzy and --delete-after, but it doesn't seem to be
> speeding up the transfer. I am assuming that this is because I have both a
> change in name AND a change in size/modtime?
> I was looking into the find_fuzzy function, but I'm not sure if there's
> anything I can tweak in there to make this work.

I am using rsync for the exact same purpose, with very similar file
names and it seems to work just fine on 3.0.5 running on both Linux
and Windows (cwrsync).

Some possible causes I've encountered:
o The source files are compressed or encrypted, which will prevent
rsync from matching any blocks. gzip has an "rsync-friendly"
compression mode (the --rsyncable option, where your build supports
it), but most other popular forms of compression prevent rsync from
finding matches.
o The source files are very large, and the default rsync block size
for large files prevents matches from being found. You can try forcing
a smaller block size with --block-size (trading CPU time for
bandwidth).
o The source files are some sort of indexed database files (SQL
Server backups commonly use a .bak extension). If you rebuild or
refresh database indexes between your backups, this actually changes
every page of the database, preventing rsync from finding matches.
Also, if your tables are clustered on non-sequential keys, even small
amounts of data change can result in updates to nearly every database
page.
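
To see the first two causes in miniature, here is a rough Python sketch.
It is not rsync's actual algorithm: real rsync pairs a cheap rolling
checksum with a stronger hash, while this toy recomputes MD5 at every
byte offset. The fake record layout, the 700-byte block size, and the
names matched_fraction and make_backup are all assumptions made up for
the illustration.

```python
import hashlib
import random
import zlib


def matched_fraction(old: bytes, new: bytes, block: int) -> float:
    """Fraction of `new` that can be rebuilt from aligned blocks of `old`.

    Simplified stand-in for rsync's matching: MD5 is recomputed at every
    byte offset instead of using a rolling checksum.
    """
    known = {hashlib.md5(old[i:i + block]).digest()
             for i in range(0, len(old) - block + 1, block)}
    pos = matched = 0
    while pos + block <= len(new):
        if hashlib.md5(new[pos:pos + block]).digest() in known:
            matched += block
            pos += block      # skip past the matched block
        else:
            pos += 1          # slide one byte and try again
    return matched / len(new)


def make_backup(rows: int, seed: int, tag: str = "row") -> bytes:
    """Build a fake, compressible 'backup' of fixed-width text records."""
    rng = random.Random(seed)
    return "".join(f"{tag} {i:06d} payload {rng.getrandbits(64):016x}\n"
                   for i in range(rows)).encode()


old = make_backup(3000, seed=1)
# Next day's file: a few new records in front, the old data, more at the end.
new = (make_backup(20, seed=2, tag="new") + old
       + make_backup(200, seed=3, tag="new"))

# Cause 1: the same pair of files, compared raw and after compression.
raw_match = matched_fraction(old, new, block=700)
zip_match = matched_fraction(zlib.compress(old), zlib.compress(new), block=700)

# Cause 2: overwrite one byte every 1000 bytes, a crude model of a backup
# in which many pages each changed a little, then vary the block size.
edited = bytearray(old)
for i in range(500, len(edited), 1000):
    edited[i] = ord("#")
edited = bytes(edited)
big_blocks = matched_fraction(old, edited, block=2000)
small_blocks = matched_fraction(old, edited, block=200)

print(f"uncompressed, block 700:     {raw_match:.2f}")
print(f"zlib-compressed, block 700:  {zip_match:.2f}")
print(f"scattered edits, block 2000: {big_blocks:.2f}")
print(f"scattered edits, block 200:  {small_blocks:.2f}")
```

The uncompressed pair should match almost everything apart from the new
records, and the compressed pair should typically match far less, since
a change early in the input disturbs the compressed stream that follows.
In the scattered-edit case every 2000-byte window contains a changed
byte, so nothing matches until the block size drops below the spacing of
the edits.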



-- 
RPM

