rsync transfers whole content when a new hardlink is created
Martin Scharrer
mailinglists at madmarty.de
Wed Dec 1 21:18:43 GMT 2004
Hi,
I noticed some silly behaviour in rsync when new hardlinks to already synced
files are created:
Scenario:
There is a local directory and an identical remote directory, created by an
earlier run of rsync.
Create a hardlink to an already existing file (both inside the local
directory).
If this hardlink has a filename which sorts before the original filename in
alphabetical order, rsync (with option -H to preserve hardlinks) will upload
the whole file under the new hardlink's name (i.e. just another filename for
the same content) and then turn the original file into a hardlink to the new
filename.
The "old" file content is deleted by this, but not until the new (identical!)
copy has been transferred.
In the worst case the disk runs out of space during the transfer, even though
there is enough disk space for a correct sync.
(Of course, this happened to me. :-) )
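To put a number on the disk space problem, here is a rough calculation using
the size that "ls -s" reports for large_file in the example in this post. It
assumes that ls counts in 1 KiB blocks and that the old copy stays on the
remote side until the new one is complete, as described above; the variable
names are only for illustration.

```shell
#!/bin/sh
# Size of large_file as shown by "ls -s" in this post (assumed 1 KiB blocks).
size_kb=449980

# While the new name is being transferred, the old copy still exists,
# so the remote side briefly holds the content twice.
peak_kb=$((size_kb * 2))

# A sync that only creates the hardlink needs no extra data blocks.
expected_kb=$size_kb

echo "peak usage during transfer: ${peak_kb} KiB"
echo "usage with a plain hardlink: ${expected_kb} KiB"
```

So a remote filesystem with less than one extra file size of free space can
fail here, although the end result would fit.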
Here is an example:
The local directory with a new hardlink to a file:
$ ls -s1i rsync_test/
total 899960
3272008 449980 a_hardlink_to_large_file
3272008 449980 large_file
The remote directory without the new hardlink:
$ ls -s1i fake_remote/
total 449980
4088085 449980 large_file
Rsyncing:
$ rsync -aHPv rsync_test/ fake_remote
building file list ...
3 files to consider
./
a_hardlink_to_large_file
460324864 100% 10.02MB/s 0:00:43
large_file => a_hardlink_to_large_file
wrote 460381234 bytes read 40 bytes 10345646.61 bytes/sec
total size is 920649728 speedup is 2.00
But the expected output was:
building file list ...
4 files to consider
./
a_hardlink_to_large_file => large_file
wrote 176 bytes read 20 bytes 392.00 bytes/sec
total size is 1380974592 speedup is 7045788.73
When the hardlink filename sorts after the original filename in alphabetical
order, the result is the expected one shown above.
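For anyone who wants to try this, the starting state of the example can be
recreated with a sketch like the following. It uses the directory and file
names from this post, but a 1 MiB stand-in for the large file, and a plain
"cp -p" stands in for the earlier rsync run; both are assumptions made only
to keep the script small and self-contained.

```shell
#!/bin/sh
set -eu

# Scratch area so nothing outside it is touched.
work=$(mktemp -d)
mkdir "$work/rsync_test" "$work/fake_remote"

# Small stand-in for large_file (1 MiB of zeros).
dd if=/dev/zero of="$work/rsync_test/large_file" bs=1024 count=1024 2>/dev/null

# Stand-in for the earlier sync: the "remote" side gets its own copy.
cp -p "$work/rsync_test/large_file" "$work/fake_remote/large_file"

# New hardlink whose name sorts before the original ("a..." < "l...").
ln "$work/rsync_test/large_file" "$work/rsync_test/a_hardlink_to_large_file"

# Both local names should report the same inode number.
ls -i "$work/rsync_test"
src_inodes=$(ls -i "$work/rsync_test" | awk '{ print $1 }' | sort -u | wc -l | tr -d ' ')
echo "distinct inodes in rsync_test: $src_inodes"
```

Running the rsync command from the example against these two directories
should then show whether the whole file is sent again. Naming the link
z_hardlink_to_large_file instead gives the other case.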
So it seems that rsync doesn't check for further hardlinks if the first
hardlink in alphabetical order doesn't exist on the remote filesystem, but
detects the relationship when it processes the next hardlink.
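If that guess is right, the affected files can be found before syncing. The
sketch below is only a diagnostic built on this guess, not on rsync's actual
code: for every hardlinked regular file in a source directory it takes the
name that sorts first and reports it when that name is missing on the
destination. The function names first_names and at_risk are made up for this
example, it looks at one directory level only, and it assumes filenames
without whitespace.

```shell
#!/bin/sh
# For each inode with more than one link in directory $1, print the name
# that sorts first (byte order). Regular files only, no whitespace in names.
first_names() {
    LC_ALL=C ls -il "$1" |
        awk 'NF >= 10 && $2 ~ /^-/ && $3 > 1 && !seen[$1]++ { print $NF }'
}

# Print those first names that do not exist yet in destination $2.
at_risk() {
    for name in $(first_names "$1"); do
        [ -e "$2/$name" ] || echo "$name"
    done
}

# Small demo mirroring the example in this post.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dst"
echo data > "$demo/src/large_file"
cp "$demo/src/large_file" "$demo/dst/large_file"
ln "$demo/src/large_file" "$demo/src/a_hardlink_to_large_file"

result=$(at_risk "$demo/src" "$demo/dst")
echo "$result"    # prints: a_hardlink_to_large_file
```

With a link name that sorts after large_file, the first name already exists
on the destination and nothing is reported, which matches the behaviour
described above.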
best
Martin Scharrer