Renamed files and directories

N.J. van der Horn (Nico) nico at vanderhorn.nl
Sun Mar 1 17:19:24 GMT 2009



Jamie Lokier wrote:
> N.J. van der Horn (Nico) wrote:
>   
>>> But you need to verify and update the DB contents - which requires
>>> stat on all the files mentioned in the DB.  In other words you might
>>> have to scan everything :-)
>>>  
>>>       
>> This already takes place while Rsync does its job, so it does not have to be 
>> done separately.
>>     
>
> Right, but it has to be done in a separate pass if you're to compare
> all files with each other, not just one destination file.  And you
> need all the RAM, too.  It's like the worst case of "rsync -H".
>   
What I tried to point out is that when the DB is updated while Rsync is 
performing its work, it saves a pass.
Of course the new run needs to check the DB before definitely removing a 
file or directory.
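
A rough sketch of that check, in Python rather than anything resembling
Rsync's own code. The names scan() and plan_deletions() are made up for
illustration, an in-memory dict stands in for the DB, and matching on
(size, mtime) alone is an assumption that a real tool would probably want
to back up with a checksum:

```python
import os
import stat
from collections import defaultdict


def scan(root):
    """Return {relative_path: (size, mtime_ns)} for regular files under root."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if stat.S_ISREG(st.st_mode):
                entries[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return entries


def plan_deletions(src_root, dst_root):
    """Split destination files missing from the source into renames and deletions."""
    src = scan(src_root)
    dst = scan(dst_root)

    # Source files that do not exist yet at the destination, grouped by key.
    new_by_key = defaultdict(list)
    for rel, key in src.items():
        if rel not in dst:
            new_by_key[key].append(rel)

    renames, deletions = [], []
    for rel in sorted(dst):
        if rel in src:
            continue  # still present in the source, nothing to remove
        candidates = new_by_key.get(dst[rel], [])
        if len(candidates) == 1:
            # Exactly one new source file looks identical: treat it as a move.
            renames.append((rel, candidates.pop()))
        else:
            # No candidate, or an ambiguous one: fall back to a plain delete.
            deletions.append(rel)
    return renames, deletions
```

With this, a file that moved from old/a.txt to new/a.txt ends up in the
renames list instead of being deleted and transferred again.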
>> Adding a DB to Rsync would give many more advantages, like:
>> - de-duplication (eliminating copies)
>> - alternative to "locate"
>> - filesystem statistics/analysis
>> If the structure is chosen well, it can prove to be very valuable for 
>> other purposes also.
>>     
>
> I vaguely remember a conscious decision not to expand rsync that much.
> Keep the tool simple and stateless.  There are already other tools
> which do the things you've described.
>   
Yes, KISS is good, but this expansion with a DB can be made optional 
with a switch as well.
So if you do not want it, just don't select the option.
>> It must be possible to enable/disable checksumming when the
>> timestamp and size are unchanged.  That clever trick is pretty
>> reliable in normal Rsync usage as well and earns a lot of savings.
>> We only do fully checksummed Rsyncs once in a while to be sure, but
>> seldom see transfers then.
>>     
>
> You could use that trick, but it's more dangerous when you're looking
> through the _whole_ filesystem for a matching timestamp and size, not
> just looking at the corresponding paths of a single file at each end.
>   
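To make that switch concrete, here is a small sketch in Python. The
function names file_digest() and confirm_match() and the verify flag are
invented for this example, and SHA-256 from hashlib is used only because
it is at hand, not because Rsync uses it. A size and timestamp match is
treated as a candidate only, and the content comparison can be turned on
or off:

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def confirm_match(path_a, path_b, verify=True):
    """Decide whether two files can be treated as the same file.

    Size and mtime must be equal. With verify=True the contents are
    compared by checksum as well, with verify=False they are trusted.
    """
    st_a, st_b = os.stat(path_a), os.stat(path_b)
    if (st_a.st_size, st_a.st_mtime_ns) != (st_b.st_size, st_b.st_mtime_ns):
        return False
    if not verify:
        return True  # quick check only: cheap, but can be fooled
    return file_digest(path_a) == file_digest(path_b)
```

Two unrelated files with the same size and timestamp pass the quick
check but fail the verified one, which is exactly the risk when the
candidates come from the whole filesystem.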
At this moment I am using a lightweight cluster to back up 18 servers all 
over Europe.
The total amount of storage is now 4TB for about 12 million files, 
taking less than 3 hours to do.
This would be impossible with tapes lol...

Remember that the only wish I have is to avoid disastrous removal of 
files that are only moved.
I am not claiming that Rsync should cover every purpose, but for a 
backup tool my demand is reasonable.

Regards, Nico

-- 
Behandeld door / Handled by: N.J. van der Horn (Nico)
---
ICT Support Vanderhorn IT-works, www.vanderhorn.nl,
Kamer van Koophandel / Chamber of Commerce 24228233,
Voorstraat 55, 3135 HW Vlaardingen, The Netherlands,
Tel +31 10 2486060, Fax +31 10 2486061



