rsync of STDIN to a file.
mark_young at hotmail.com
Thu Nov 19 09:28:38 MST 2009
Hi Ryan & Matt,
Thank you both for your replies. I'll reply to you both in one go, starting with you, Ryan.
The decision to keep tarballs on the remote site is arguably unnecessary. It's like that for historical reasons, and it does cope with a very large hierarchy of small files. Many different machines are backed up in this manner, with daily and monthly filenames providing a year's worth of coverage.
The particular machine I'm looking at generates a 262MB compressed tarball. It's 3.6GB uncompressed and contains about 8400 files. If I changed it to not compress, I'd have disk space issues with the current infrastructure: all 19 tarballs clock in at about 5GB, versus the 68GB needed for uncompressed storage. Another machine's compressed tarball is 4.5GB in size and contains 5.5 million files.
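As a rough check of the storage figures above, here is a minimal Python sketch. It assumes all 19 tarballs are about the same size as the one measured, and it treats 1GB as 1000MB; the variable names are just for illustration.

```python
# Rough storage estimate for 19 retained tarballs (7 daily + 12 monthly).
# Assumption: every tarball is about the size of the one measured.
TARBALLS = 19
COMPRESSED_MB = 262      # size of one compressed tarball
UNCOMPRESSED_GB = 3.6    # size of the same tarball uncompressed

compressed_total_gb = TARBALLS * COMPRESSED_MB / 1000
uncompressed_total_gb = TARBALLS * UNCOMPRESSED_GB
compression_ratio = UNCOMPRESSED_GB * 1000 / COMPRESSED_MB

print(f"compressed total:   {compressed_total_gb:.1f} GB")
print(f"uncompressed total: {uncompressed_total_gb:.1f} GB")
print(f"compression ratio:  about {compression_ratio:.1f} to 1")
```

The totals come out at roughly 5GB and 68GB, which matches the numbers quoted.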
I tried an experiment to see how rsync coped with compressed tar files versus uncompressed ones. I took a .tgz from last week and rsync'd it against last night's version. rsync reported a speedup of 1.61 for the transfer. Specifically, 172MB of the 262MB file was transferred. I then tried the same thing with the uncompressed tar files. This time rsync reported a speedup of 690. Specifically, 5MB of the 3.6GB file was transferred. Clearly, if disk space were not an issue, this would be by far the superior option.
Obviously I could increase the complexity of the overall solution with compress and uncompress jobs in sync with the backup strategy, but I believe in keeping it as simple as possible when it comes to backups. Fewer mistakes that way. One particular backup suffers very badly from a slow network line, so rsync's reduced transfer size might make a good improvement there.
I have a mixed bag of machines and backups. Currently they all use the same simple approach. I'm keeping that in mind with regard to changing any one of them. It doesn't help a disaster recovery situation to have a mixed approach to restores.
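To put the two rsync runs above side by side, here is a minimal Python sketch working only from the rounded sizes quoted. As far as I know, rsync's reported speedup is the total file size divided by the bytes sent plus received, computed from exact byte counts, so the ratios below only roughly agree with the 1.61 and 690 it printed. The function name is hypothetical.

```python
def transfer_summary(total_mb, sent_mb):
    """Return (fraction of the file sent, approximate size-to-sent ratio)."""
    return sent_mb / total_mb, total_mb / sent_mb

# Rounded figures from the experiment; 3.6GB is taken as 3600MB.
compressed = transfer_summary(262, 172)
uncompressed = transfer_summary(3600, 5)

for label, (fraction, ratio) in (("compressed .tgz", compressed),
                                 ("uncompressed tar", uncompressed)):
    print(f"{label}: {fraction:.2%} of the file sent, ratio about {ratio:.1f}")
```

In other words, roughly two thirds of the compressed file still had to cross the wire, against a small fraction of one percent of the uncompressed one.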
Matt, it's very exciting to think that you can see a way in which rsync might be able to do what I was thinking of.
rdiff-backup is a new one on me. Very interesting, and something I'll experiment with. For our strategy of 7 daily backups and 12 monthly backups, I guess I'd need to use two rdiff-backup destinations so I could apply the --remove-older-than 1W and --remove-older-than 365D options respectively. It might not suit my current situation, as explained above regarding storing a tarball versus a directory hierarchy, but it's definitely worth knowing about.
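The two-destination idea above could be sketched like this in Python, assuming the tool in question is rdiff-backup and its classic --remove-older-than command-line syntax. The destination paths are hypothetical, and the script only prints the commands rather than running them.

```python
import shlex

# Hypothetical destination directories, one per retention period.
RETENTION = {
    "/backup/daily": "1W",      # keep about a week of daily increments
    "/backup/monthly": "365D",  # keep about a year of monthly increments
}

def prune_command(destination, age):
    """Build the rdiff-backup argument list that prunes one destination."""
    return ["rdiff-backup", "--remove-older-than", age, destination]

commands = [prune_command(dest, age) for dest, age in RETENTION.items()]

for command in commands:
    # Dry run: print the command instead of executing it.
    print(shlex.join(command))
```

Keeping the retention periods in one table would make it easy to see, per destination, how long increments are meant to live.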
Also, I'd not heard of the --rsyncable option for tar. It's not there on my Debian Etch system, nor in my Cygwin tar, so I guess it's not in common distributions by default. Still, it sounds useful and is something I will bear in mind.
Thank you both very much for opening my eyes to other ideas and possibilities.
I look forward to any further replies.
All the best,