Issue with hard links, please help!

Max Kipness max at assuredata.com
Sun May 14 23:05:52 GMT 2006


> You could of course (right after an rsync run) do a
> "cd newdir; find . -type f -links 1 -print" and then randomly check a
> couple and compare all their attributes such as mtime, permissions to
> the previous dir. (I still recommend using the --link-dest thing over
> using cp -al first.)

OK, I think I've figured out the problem with this one, although I'm not
exactly sure of the reason. I have now started using --link-dest and
it works great. Here again is the --stats output:

Number of files: 50285
Number of files transferred: 38
Total file size: 16193254538 bytes
Total transferred file size: 4077908049 bytes
Literal data: 86201342 bytes
Matched data: 3989904700 bytes
File list size: 945440
File list generation time: 6.615 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 87436048
Total bytes received: 539014

sent 87436048 bytes  received 539014 bytes  97913.26 bytes/sec
total size is 16193254538  speedup is 184.07
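
As a sanity check on those figures, here is a minimal Python sketch that
recomputes the speedup line. It assumes rsync defines speedup as the total
size divided by bytes sent plus bytes received; the variable names are
mine, not rsync's.

```python
# Figures copied from the --stats output above.
total_size = 16193254538      # "total size is ..."
bytes_sent = 87436048         # "Total bytes sent"
bytes_received = 539014       # "Total bytes received"

# Assumption: speedup = total size / (bytes sent + bytes received).
speedup = total_size / (bytes_sent + bytes_received)
print(f"speedup is {speedup:.2f}")
```

Under that assumption the result agrees with the 184.07 that rsync
reported, so the sent/received totals look consistent with each other.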

It turns out that there is a Microsoft backup file (a .bkf file) of
around 4 GB that changes daily.

Now my question (I think the final one) is why the entire file seems to
be transferred even though rsync obviously detects that only a fraction
of it has changed. The Literal data figure shows 86201342 bytes of
changes, which appears correct. Also, since I'm using the option
--log-format="%f %l %b", I see the following result for the file in
question:

SERVER/E$/exchange.bkf 4076087296 86454659

Isn't this stating that the file size is 4076087296 bytes and the
changes to the file are 86454659 bytes?
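
To put those two numbers in proportion, here is a quick Python sketch. It
assumes my reading of the format codes is right, namely that %l is the
file length and %b is the number of bytes actually transferred; the
variable names are only for illustration.

```python
# Values from the --log-format="%f %l %b" line for exchange.bkf.
file_length = 4076087296        # %l: assumed to be the file length in bytes
bytes_transferred = 86454659    # %b: assumed to be bytes actually transferred

# Share of the file that had to be sent as new data.
fraction = bytes_transferred / file_length
print(f"{fraction:.1%} of the file was sent")
```

If that reading is correct, only about 2% of the file crosses the wire
on each run.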

So why is the entire file transferring each day? I'm using the
--no-whole-file option. Here are the rsync command options I used for
the latest test:

rsync /share/ /backup/05-13-2006/ -v --link-dest=/backup/05-12-2006/ \
  --stats --recursive --archive --times --modify-window=1 --delete \
  --ignore-errors --files-from=/var/www/html/backup/adlist.txt \
  --exclude-from=/scripts/file-exclude --no-whole-file \
  --log-format="%f %l %b" 2> errors.log 1> stats.log

In the previous posts I stated that du showed every incremental
directory to be around 4-5 GB in size. This is because exchange.bkf
changes a little each day, so I guess the file cannot be linked. So in
reality, if you have very large files that have very small changes
applied, hard links really serve no purpose, correct? And I assume
there is nothing else that can be done with these large files to
conserve space?
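
To confirm which files in two daily directories actually share storage,
something like the following Python sketch could be used. It compares
device and inode numbers, which is what makes two paths hard links to the
same file. The helper name same_inode and the file names in demo() are
made up for illustration; real use would pass paths from the two dated
backup directories.

```python
import os
import shutil
import tempfile

def same_inode(path_a, path_b):
    """Return True if both paths are hard links to the same file."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)

def demo():
    """Build a throwaway directory with one linked and one copied file."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.bkf")
        with open(original, "wb") as fh:
            fh.write(b"backup data")
        linked = os.path.join(tmp, "linked.bkf")
        copied = os.path.join(tmp, "copied.bkf")
        os.link(original, linked)        # hard link: shares the inode
        shutil.copy2(original, copied)   # separate copy: gets a new inode
        return same_inode(original, linked), same_inode(original, copied)

if __name__ == "__main__":
    print(demo())
```

A file that changed between runs would show up like the copied case: same
name in both directories, but a different inode and its own disk space.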

Thanks
Max

