Feature proposal and implementation plan: --delete-delay

Jim Salter jim at jrssystems.net
Wed Apr 14 23:38:39 GMT 2004

Just a note:

I do something similar to what you describe, using a Perl script to
invoke rsync with the --backup-dir= option.  I back up system drives to
a considerably larger archive volume, using --backup-dir= to shunt old
versions of files (and deleted files) into hierarchies named after the
dates they were originally synch'ed across.  Before the daily backup
runs, though, my Perl script "preens" the backup filesystem by checking
for old archived versions of files that have reached "retirement age"
and deleting them.

For example, on /backup, there is /backup/data and /backup/archive.  If
on 2004-04-14 I run my synch script and rsync needs to delete
/backup/data/deadfile.txt, it instead moves it using --backup-dir= to
/backup/archive/2004-04-14/deadfile.txt.  When I run my synch script
again a month later, it first deletes the /backup/archive/2004-04-14
tree in its entirety before proceeding to the actual rsync'ing.
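
As a rough illustration of the layout above (in Python rather than the
Perl actually used, which is not shown here), the rsync invocation and
the age check might look like the sketch below.  The function names,
the "-a --delete --backup" flag combination and the 30-day retirement
age are assumptions made for the example, not details of the real
script.

```python
import datetime
from pathlib import Path


def build_rsync_command(source, data_dir, archive_root, today):
    """Return an rsync argv that shunts replaced/deleted files into a dated tree."""
    # e.g. /backup/archive/2004-04-14 (directory naming taken from the example above)
    dated_dir = Path(archive_root) / today.isoformat()
    return [
        "rsync", "-a", "--delete",
        "--backup", f"--backup-dir={dated_dir}",
        f"{source}/", f"{data_dir}/",
    ]


def expired_archives(archive_root, today, max_age_days):
    """List dated archive directories at least max_age_days old, oldest first."""
    expired = []
    for entry in sorted(Path(archive_root).iterdir()):
        try:
            stamp = datetime.date.fromisoformat(entry.name)
        except ValueError:
            continue  # not a date-named directory; leave it alone
        if entry.is_dir() and (today - stamp).days >= max_age_days:
            expired.append(entry)
    return expired
```

Each directory returned by expired_archives() could then be removed
with shutil.rmtree() before the command from build_rsync_command() is
run.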

In point of fact, my preening is triggered by drive capacity and usage
levels, not age of archives, and once triggered it eliminates archives
from oldest to newest until usage is back down to a desired percentage -
but you get the idea.
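
A capacity-triggered version of the preening could be sketched as
follows, again in Python and again only as an illustration.  The
trigger and target percentages, the function names, and the injectable
get_usage/remove hooks are all assumptions introduced for the example.

```python
import datetime
import shutil
from pathlib import Path


def usage_percent(path):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def preen_by_capacity(archive_root, trigger_pct, target_pct,
                      get_usage=usage_percent, remove=shutil.rmtree):
    """Delete dated archives, oldest first, once usage reaches trigger_pct.

    Stops as soon as usage is at or below target_pct.  Returns the
    names of the directories that were removed.
    """
    removed = []
    if get_usage(archive_root) < trigger_pct:
        return removed  # not full enough to bother preening
    dated = []
    for entry in Path(archive_root).iterdir():
        try:
            stamp = datetime.date.fromisoformat(entry.name)
        except ValueError:
            continue  # skip anything that is not a date-named directory
        if entry.is_dir():
            dated.append((stamp, entry))
    for _, entry in sorted(dated):  # oldest archive first
        if get_usage(archive_root) <= target_pct:
            break
        remove(entry)
        removed.append(entry.name)
    return removed
```

Something like preen_by_capacity("/backup/archive", 90.0, 75.0) would
then run before the daily rsync; the two thresholds here are made-up
values.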

Jim Salter
JRS Systems

>> Hi folks,
>> One feature I've wanted in rsync is the ability to delete files that 
>> no longer exist in the source *after some specified grace period.*
>> The functionality I'm looking for is a backup system that won't 
>> actually delete files until a week or two after the user does. This 
>> would:
>>    1. Protect against accidental file deletions; the user would have 
>> some time to realize the mistake and retrieve a backup
>>    2. Keep the backup machine tidier, and reflect more closely the 
>> source directory's structure.
>> It seems to me the difficulty is in noticing when the file was "first" 
>> deleted and then keeping track of how much time has elapsed.
>> Here is a proposed method of doing this:
>>    1. When a file is first discovered to be obsolete due to the source 
>> file having been deleted, rename the local file 
>> "originalfilename__RSYNC_YYYYMMDD".
>>    2. Whenever we are asked to delete a file (in delete_one), check 
>> the filename. If it is in the form above, check to see if the required 
>> amount of time has elapsed (via a --delete-delay=NDAYS argument). 
>> Delete as appropriate. If the file isn't in the correct form, the file 
>> only became obsolete just now; rename the file with the current date.
>>    3. These modified filenames would be exposed to the user. (Yes, a 
>> big ugly.)
>> An alternative strategy which would preserve the filenames would be to 
>> have an auxiliary database file that stored the timestamps. However, I
>> can imagine a thousand things that could go wrong here.
>> I also considered encoding the timeout information in the 
>> modified-time/access-time fields, but that seems very hackish and, 
>> again, I can imagine a thousand things that could go horribly wrong.
>> What do you think? I'm willing to do the work, but I'd like to 
>> implement the feature in the best possible way (any improvements?) and 
>> I was wondering (provided the implementation is adequately 
>> robust/clean) if you would be willing to add such a patch to CVS. (Or 
>> is it "philosophically" the incorrect behavior and thus undesirable
>> regardless of its implementation?)
>> Thanks,
>> Ed
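
For reference, the rename-and-check rule in steps 1 and 2 of the quoted
proposal could be sketched like this.  It is a Python illustration
only: rsync itself is written in C, and delete_action() and its return
values are hypothetical names, not part of rsync or of any patch.

```python
import datetime
import re

# Matches names of the proposed form "originalfilename__RSYNC_YYYYMMDD".
SUFFIX_RE = re.compile(r"^(?P<name>.+)__RSYNC_(?P<stamp>\d{8})$")


def delete_action(filename, today, delay_days):
    """Decide what to do with an obsolete destination file.

    Returns ("rename", new_name), ("delete", filename) or ("keep", filename).
    """
    match = SUFFIX_RE.match(filename)
    if match:
        try:
            marked = datetime.datetime.strptime(match.group("stamp"), "%Y%m%d").date()
        except ValueError:
            match = None  # eight digits, but not a real date: treat as unmarked
    if not match:
        # File only became obsolete just now: stamp it with today's date.
        return ("rename", f"{filename}__RSYNC_{today:%Y%m%d}")
    if (today - marked).days >= delay_days:
        return ("delete", filename)
    return ("keep", filename)
```

One limitation of this sketch is that a file whose real name happens to
end in "__RSYNC_" plus a valid date would be mistaken for one that is
already marked.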
