Issue with hard links, please help!

Paul Slootman paul at debian.org
Thu May 11 16:40:47 GMT 2006


On Thu 11 May 2006, Max Kipness wrote:

> [root@backup backup]# cp -al Latest/ mtest/
> [root@backup backup]# du --max-depth=1 -h
> 21G     ./Latest
> 8.7M    ./mtest
> 21G     .
> [root@backup backup]# rm mtest/ -rf
> [root@backup backup]# cp -al Latest/ test/
> [root@backup backup]# du --max-depth=1 -h
> 21G     ./test
> 8.3M    ./Latest
> 21G     .
> 
> The last instance is the problem that happens quite often. Now when I

No, it's not a problem. It's just that now du encounters the "test"
directory before finding the "Latest" directory.  du only counts the
blocks of hardlinked files once, and reports the size under the first
directory such a file is in.

The 8.3M (or 8.7M) is purely the disk blocks needed for the directories
and any symlinks, if applicable; it is *not* related to storage of file
contents.

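If you run du separately on each directory ("du -s -h test", then
"du -s -h Latest"), or use the --count-links option, you will see that
each directory is handled separately, and both will show 21G. Naming
both directories in a single du invocation is not enough, because du
remembers hard links across all of its arguments.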
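
A minimal sketch of this behaviour, assuming GNU coreutils (for cp -al
and --count-links) and a scratch directory from mktemp. The directory
names mirror the ones in this thread; the 1 MiB data file and the
variable names are made up for illustration. Within one du invocation
the hard-linked data is charged to whichever argument comes first,
while separate invocations (or --count-links) charge it to both.

```shell
#!/bin/sh
# Sketch only: throwaway directories named after the ones in this thread.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir Latest

# 1 MiB of incompressible data so the file occupies real disk blocks.
dd if=/dev/urandom of=Latest/data.bin bs=1024 count=1024 2>/dev/null
sync

# Hard-link copy, as in the original commands.
cp -al Latest test

# One invocation: the data is charged to the first argument only.
one_run_first=$(du -sk Latest test | awk 'NR==1 {print $1}')
one_run_second=$(du -sk Latest test | awk 'NR==2 {print $1}')

# Separate invocations: each directory is measured on its own.
separate_latest=$(du -sk Latest | awk '{print $1}')
separate_test=$(du -sk test | awk '{print $1}')

# --count-links: every hard link is counted, even within one invocation.
counted_second=$(du -sk --count-links Latest test | awk 'NR==2 {print $1}')

echo "one run:       Latest=${one_run_first}K test=${one_run_second}K"
echo "separate runs: Latest=${separate_latest}K test=${separate_test}K"
echo "--count-links: test=${counted_second}K"
```

Swapping the argument order in the single-invocation case moves the
large figure to the other directory, which is the same effect as du
happening to walk "test" before "Latest".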

> perform an rsync as such:
> 
> rsync /share/ /backup/Latest --stats --recursive --archive --times
> --modify-window=1 --delete --ignore-errors --no-whole-file
> --files-from=/var/www/html/new/var/backup_selections.txt
> --exclude-from=/var/www/html/new/var/file-exclude --progress
> 
> I get the following results:
> 
> Number of files: 53911
> Number of files transferred: 52223
> Total file size: 21654476720 bytes
> Total transferred file size: 21654476720 bytes
> Literal data: 21651840443 bytes
> Matched data: 0 bytes
> File list size: 992872
> Total bytes sent: 21657710607
> Total bytes received: 1044480
> 
> And a du gives me:
> 
> [root@backup backup]# du --max-depth=1 -h
> 21G     ./test
> 21G     ./Latest
> 41G     .
> 
> It appears that due to the cp -al command not working right as stated
> above, the literal changes needed were everything minus the 8.3 MB, when
> in reality there were very few changes between 'Share' and 'Latest'.

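What's happened is that the files were updated, and the hard link was
lost: unless --inplace is used, rsync writes an updated file to a
temporary file and renames it over the original, so "Latest" ends up
with a new file while "test" keeps the old one. Why the files were
updated I can't say; it could be for all sorts of reasons. Perhaps the
--itemize-changes option will help.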
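
To illustrate how an update loses the hard link, here is a small
sketch that imitates rsync's default write-to-temporary-then-rename
behaviour with plain echo and mv (it does not call rsync itself). It
assumes GNU cp for -al; the file names and variables are hypothetical.

```shell
#!/bin/sh
# Sketch only: imitate an rsync-style update with a temp file and a rename.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir Latest
echo "old contents" > Latest/file.txt

# Hard-link snapshot, as with "cp -al Latest/ test/".
cp -al Latest test

inode_before=$(ls -i Latest/file.txt | awk '{print $1}')
inode_snapshot=$(ls -i test/file.txt | awk '{print $1}')

# The "update": new data goes to a temporary file, which is then
# renamed over the original name. The old inode stays with test/.
echo "new contents" > Latest/.file.txt.tmp
mv Latest/.file.txt.tmp Latest/file.txt

inode_after=$(ls -i Latest/file.txt | awk '{print $1}')
snapshot_contents=$(cat test/file.txt)

echo "before update: Latest=$inode_before test=$inode_snapshot"
echo "after update:  Latest=$inode_after test=$inode_snapshot"
echo "test/file.txt still holds: $snapshot_contents"
```

Before the update both names share one inode; afterwards they differ,
so each updated file costs its full size again on disk. That is why a
run in which rsync considers every file changed doubles the usage.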

Look into the --link-dest option; you can leave out your cp -al pass in
that case.
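
A rough sketch of a --link-dest run, using a scratch tree rather than
the real /share and /backup paths. The day1/day2 directory names and
the two sample files are assumptions for illustration only; the idea
is that each run goes into a fresh directory, and files unchanged
since the previous run are hard-linked to it instead of copied.

```shell
#!/bin/sh
# Sketch only: two backup runs into a scratch tree, the second using --link-dest.
set -eu
command -v rsync >/dev/null || { echo "rsync not installed"; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/share" "$work/backup"
echo "unchanged" > "$work/share/a.txt"
echo "version 1" > "$work/share/b.txt"

# First run: a plain full copy.
rsync --archive --delete "$work/share/" "$work/backup/day1/"

# Change one file (different size, so rsync's quick check notices it).
echo "version 2, edited" > "$work/share/b.txt"

# Second run: unchanged files become hard links into day1.
rsync --archive --delete --link-dest="$work/backup/day1" \
    "$work/share/" "$work/backup/day2/"

a_day1=$(ls -i "$work/backup/day1/a.txt" | awk '{print $1}')
a_day2=$(ls -i "$work/backup/day2/a.txt" | awk '{print $1}')
b_day1=$(ls -i "$work/backup/day1/b.txt" | awk '{print $1}')
b_day2=$(ls -i "$work/backup/day2/b.txt" | awk '{print $1}')

echo "a.txt inodes: day1=$a_day1 day2=$a_day2"
echo "b.txt inodes: day1=$b_day1 day2=$b_day2"
```

Since nothing is ever overwritten inside an existing snapshot, an
update cannot break links that an earlier cp -al pass created.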

> Can someone give any guidance on this issue? There are times when this
> will happen several times throughout the 30 day incremental routine so
> the disk requirements are very large. How can I keep all the data in
> 'Latest' consistently after using the cp -al command?

May I suggest the dirvish package, which is a sort of wrapper around
rsync to implement incremental changes? It sounds like what you're
trying to do. http://www.dirvish.org/


Paul Slootman

