Moved/Renamed Files

Boris Toloknov tlknv at yandex.ru
Fri Jan 4 20:05:29 GMT 2008


Ming Zhang wrote:
> On Fri, 2008-01-04 at 14:12 -0500, Boris Toloknov wrote:
>   
>> Ming Zhang wrote: 
>>     
>>> On Thu, 2008-01-03 at 20:19 -0500, Boris Toloknov wrote:
>>>   
>>>       
>>>> Hi,
>>>> It seems that rsync retransfers files whose names were changed or which
>>>> were moved to another directory since the previous synchronization. I
>>>> think the ability to skip (large) files that are already present on
>>>> the other computer would be very helpful. Right before rsync transfers
>>>> a large file, it could check whether there are other files with
>>>> the same size (and maybe the same mtime) on the destination
>>>> computer. If the destination computer has such files, it
>>>> could be asked to find the file with the given MD5. If it is found,
>>>> there is no need to transfer that file. A local copy/rename/move can be
>>>> performed instead.
>>>>     
>>>>         
>>> Let us say you have N files in one directory and you rename the
>>> directory. Then for each of the N files, you need to check all M
>>> files on the destination side to see if one is the renamed file. So
>>> you do NxM comparisons, and this is not scalable at all...
>>>   
>>>       
>> I think that a hash could be used instead of that. The destination
>> computer (at least) must have a list of all the files in the
>> destination directory. The key = size + mtime and the value = pointer
>> to the file entry in the list. Actually, for that operation it would be
>> better to have that list and hash on the sending computer.
>>     
>
> rsync 3.0 introduces incremental scan to avoid the OOM issue, so the
> hash needs to be optional as well... Also, I think this hash can be used
> to detect hard links at the same time. For normal use, it should be OK.
>   
I agree that with incremental scan the "move/rename" feature can be
optional. Anyway, to minimize memory usage (if necessary), a sorted
list can be used instead of a hash, and the list of all files could be
stored in a temporary file with buffered access to it. In that case the
key = size + mtime and the value = offset of the entry in the file that
holds the list.
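
To make the idea a bit more concrete, here is a rough sketch in Python. It
is not rsync code and does not use any rsync internals; the names MoveIndex,
find_moved and md5_of are made up for illustration, and truncating mtime to
whole seconds is my assumption. It keeps only sorted (size, mtime, offset,
length) tuples in memory, stores the path names in a temporary file, and
confirms a candidate by MD5 before treating it as the moved file.

```python
import bisect
import hashlib
import os
import tempfile


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class MoveIndex:
    """Hypothetical index of a destination tree, keyed by (size, mtime).

    Only the sorted keys stay in memory; the relative paths are written
    to a temporary file and read back by offset when needed.
    """

    def __init__(self, root):
        self.root = root
        self.paths = tempfile.TemporaryFile()  # buffered on-disk file list
        keys = []
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                full = os.path.join(dirpath, name)
                st = os.stat(full)
                rel = os.fsencode(os.path.relpath(full, root))
                offset = self.paths.tell()
                self.paths.write(rel)
                # Assumption: mtime is compared at one-second resolution.
                keys.append((st.st_size, int(st.st_mtime), offset, len(rel)))
        keys.sort()
        self.keys = keys

    def candidates(self, size, mtime):
        """Yield relative paths whose size and mtime both match."""
        key = (size, int(mtime))
        # Binary search for the first entry with this (size, mtime).
        i = bisect.bisect_left(self.keys, key + (0, 0))
        while i < len(self.keys) and self.keys[i][:2] == key:
            _, _, offset, length = self.keys[i]
            self.paths.seek(offset)
            yield os.fsdecode(self.paths.read(length))
            i += 1


def find_moved(index, size, mtime, md5_hex):
    """Return the relative path of an identical existing file, or None."""
    for rel in index.candidates(size, mtime):
        if md5_of(os.path.join(index.root, rel)) == md5_hex:
            return rel
    return None
```

If find_moved returns a path, the receiving side could do a local
os.rename (or a copy) to the new name instead of transferring the data.
Each lookup is a binary search plus one MD5 per candidate with matching
size and mtime, so the NxM comparison is avoided.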

Boris