Test case for hard link failure

jw schultz jw at pegasys.ws
Wed Nov 26 10:46:09 EST 2003


On Tue, Nov 25, 2003 at 03:30:53PM -0800, Pete Wenzel wrote:
> The rsync 2.5.6 TODO file mentions the need for hard link test cases. 
> Here is one in which a linked file is unnecessarily transferred in full.
> 
>   # Setup initial directories
>   mkdir src dest
>   dd if=/dev/zero bs=1024 count=10000 of=src/a 2>/dev/null
>   rsync -a src/. dest/.
>   ln src/a src/b
>   # At this point, a & b exist in src; only a exists in dest.
>   rsync -aHv src/. dest/.
>   building file list ... done
>   ./
>   b => a
>   wrote 78 bytes  read 20 bytes  196.00 bytes/sec
>   total size is 20480000  speedup is 208979.59
> 
> The above is GOOD behavior; only the file metadata was transferred, and 
> the link was made in dest, as expected.
> 
> Now try the failure case:
> 
>   # Change the setup: remove a from dest
>   rm dest/a
>   # At this point, a & b exist in src; only b exists in dest.
>   rsync -aHv src/. dest/.
>   building file list ... done
>   ./
>   a
>   b => a
>   wrote 10241366 bytes  read 36 bytes  6827601.33 bytes/sec
>   total size is 20480000  speedup is 2.00
> 
> The above is BAD (nonoptimal) behavior; the entire file is transferred, 
> even though it could simply have been linked.  It seems that "a" is 
> transferred before it is determined that a suitable equivalent (linked) 
> file "b" already exists.
> 
> I suspect that this has to do with handling the file list in a sorted 
> order; when the missing filename is encountered first, it is transferred 
> in full.  Not being familiar with the rsync protocol or source code, I 
> can't say whether this should be fixed on the client or server side.

Actually, this is because hardlinks are detected as each
file is considered for transfer, so "a" is sent before its
link to "b" is noticed.

To be transfer-optimal we would have to build the hardlink
table in a separate loop after the flist sort but before
anything else, and add a status field recording whether any
of the links had been transferred.  The logic for dealing
with the hardlinks would then have to be much more complex.
I'm not sure that would be worth the cost in terms of delay.
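
For illustration only, here is a minimal shell sketch of that
two-pass idea.  It is not rsync code and does not use rsync; it
recreates the failure case with plain files.  The names inode_of,
table, have, copied and linked are hypothetical, and the sketch
assumes that a name already present in dest is up to date and that
file names contain no whitespace.

```shell
#!/bin/sh
# Hypothetical sketch of a hard link pre-pass; not rsync's code.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

# Recreate the failure case: a and b are linked in src,
# but only b exists in dest.
mkdir src dest
printf 'payload\n' > src/a
ln src/a src/b
cp src/a dest/b

# Hypothetical helper: print the inode number of a file.
inode_of() { ls -di "$1" | awk '{ print $1 }'; }

# Pass 1: build the link table before any transfer decision.
#   table rows: "inode name"  (every source file, in sorted order)
#   have rows:  "inode name"  (status: this group's data is in dest)
: > table
: > have
for p in src/*; do
    name=${p#src/}
    ino=$(inode_of "$p")
    echo "$ino $name" >> table
    # Assumption: an existing dest file is treated as up to date.
    if [ -e "dest/$name" ]; then
        echo "$ino $name" >> have
    fi
done

# Pass 2: walk the list; link when the group's data is already
# in dest, otherwise copy and record that fact in the status file.
copied=0
linked=0
while read -r ino name; do
    if [ -e "dest/$name" ]; then
        continue
    fi
    donor=$(awk -v i="$ino" '$1 == i { print $2; exit }' have)
    if [ -n "$donor" ]; then
        ln "dest/$donor" "dest/$name"
        linked=$((linked + 1))
        echo "$name => $donor"
    else
        cp "src/$name" "dest/$name"
        echo "$ino $name" >> have
        copied=$((copied + 1))
        echo "$name"
    fi
done < table

echo "copied=$copied linked=$linked"
```

Because the table is complete before "a" is considered, the sketch
links "a" to the existing "b" instead of copying it.  The price is
the extra pass over the whole list and the status bookkeeping, which
is the cost in question above.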

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt


