rsync transfers whole content when a new hardlink is created

Martin Scharrer mailinglists at madmarty.de
Wed Dec 1 21:18:43 GMT 2004


Hi,

I noticed an odd behaviour of rsync when new hardlinks to already synced 
files are created:

Scenario:
There is a local directory and an identical remote directory, created by a 
previous run of rsync.
Create a hardlink to an already existing file (both inside the local 
directory).
If the hardlink's filename sorts before the original filename in 
alphabetical order, rsync (with option -H to preserve hardlinks) uploads the 
whole file under the new filename and then turns the original file into a 
hardlink to the new filename.
The "old" file content on the remote side is deleted by this, but not until 
the new (identical!) copy has been transferred.
In the worst case the disk runs out of space during the transfer, even 
though there is enough disk space for a correct sync.
(Of course, this happened to me. :-) )
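
To put numbers on the disk space problem, here is a rough calculation using 
the sizes from the example listing in this post. It assumes that `ls -s` 
reports 1 KiB blocks (the GNU ls default) and that the old remote copy stays 
on disk until the new copy is complete; the variable names are only for 
illustration.

```shell
#!/bin/sh
# Allocated size of large_file in 1 KiB blocks, taken from the `ls -s`
# listing in this post (assumes GNU ls defaults, i.e. 1 KiB units).
file_blocks=449980

# Correct sync: only a new directory entry is needed on the remote side,
# so the data is stored once.
expected_peak=$file_blocks

# Observed behaviour: the old copy remains until the new copy has been
# transferred completely, so the data is briefly stored twice.
observed_peak=$((2 * file_blocks))

echo "expected peak usage: $expected_peak KiB"
echo "observed peak usage: $observed_peak KiB"
```

So the remote side briefly needs about twice the file size, although a 
correct sync needs no additional data blocks at all.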

Here is an example:

The local directory with a new hardlink to a file:
$ ls -s1i rsync_test/
total 899960
3272008 449980 a_hardlink_to_large_file
3272008 449980 large_file

The remote directory without the new hardlink:
$ ls -s1i fake_remote/
total 449980
4088085 449980 large_file

Rsyncing:
$ rsync -aHPv rsync_test/ fake_remote
building file list ...
3 files to consider
./
a_hardlink_to_large_file
   460324864 100%   10.02MB/s    0:00:43
large_file => a_hardlink_to_large_file

wrote 460381234 bytes  read 40 bytes  10345646.61 bytes/sec
total size is 920649728  speedup is 2.00

But I expected:

building file list ...
4 files to consider
./
a_hardlink_to_large_file => large_file

wrote 176 bytes  read 20 bytes  392.00 bytes/sec
total size is 1380974592  speedup is 7045788.73

When the hardlink's filename sorts after the original filename in 
alphabetical order, the result is the expected one shown above.
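
Both cases can be tried at small scale with a script along the following 
lines. This is only a sketch: the directory and file names follow the example 
above, while the scratch directory, the 1 MiB file size and the optional 
first argument (the name of the new hardlink) are my own assumptions. Newer 
rsync versions may handle hardlinks differently, so the output may not match 
the transcript above.

```shell
#!/bin/sh
# Small-scale reproduction of the scenario described in this post.
# Usage (hypothetical script name): ./repro.sh [name_of_new_hardlink]
set -eu

# Name of the new hardlink; the default sorts before "large_file".
link_name=${1:-a_hardlink_to_large_file}

# Scratch directory (an assumption); remove it by hand when done.
work=$(mktemp -d)
cd "$work"
mkdir rsync_test fake_remote

# Stand-in for the large file; 1 MiB is enough to see whether data is sent.
dd if=/dev/zero of=rsync_test/large_file bs=1024 count=1024 2>/dev/null

# Simulate the earlier sync: the "remote" side already has an identical
# copy with the same modification time.
cp -p rsync_test/large_file fake_remote/large_file

# Create the new hardlink inside the local directory.
ln rsync_test/large_file "rsync_test/$link_name"

# Both local names should now show the same inode number.
ls -s1i rsync_test/
links=$(ls -l rsync_test/large_file | awk '{print $2}')
echo "link count of large_file: $links"

# Run the sync from the example, but only if rsync is installed.
if command -v rsync >/dev/null 2>&1; then
    rsync -aHPv rsync_test/ fake_remote
fi
```

Running it once without an argument and once with a name such as 
z_hardlink_to_large_file shows whether the amount of data written depends on 
the sort order of the two names.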

So it seems that rsync doesn't check for further hardlinks if the first 
hardlink in alphabetical order doesn't exist on the remote filesystem, but 
does detect the relationship when it processes the next hardlink.

best
Martin Scharrer

