george at galis.org
Fri Jun 22 19:33:31 GMT 2007
On Tue, Jun 05, 2007 at 11:11:27AM -0700, Chuck Wolber wrote:
>On Tue, 5 Jun 2007, Paul Slootman wrote:
>> > In any case, what's the general consensus behind using the
>> > --hard-links option on large (100GB and above) images? Does it still
>> > use a ton of memory? Or has that situation been alleviated?
>> The size of the filesystem isn't relevant, the number of hard-linked
>> files is. It still uses a certain amount of memory for each hard-linked
>> file, but the situation is a lot better than with earlier rsync
>> versions. (As always, make sure you use the newest version.)
>In our case, we store images as hardlinks and would like an easy way to
>migrate images from one backup server to another. We currently do it with
>a script that does a combination of rsync'ing and cp -al. Our layout is
>| -- img1
>| -- img2 (~99% hardlinked to img1)
>| -- img3 (~99% hardlinked to img2)
>` -- imgN (~99% hardlinked to img(N-1))
>Each image in image_dir is hundreds of thousands of files. It seems to me
>that even a small amount of memory for each hardlinked file is going to
>clobber even the most stout of machines (at least by 2007 standards) if I
>tried a wholesale rsync of image_dir using --hard-links. No?
>If so, then is a "hard link rich environment" an assumption that can be
>used to make an optimization of some sort?
I had a C program that would scan directory trees and, based on some
criteria (I forget exactly; size and mtime?), decide to unlink one
file and link its name to the other. I could look for it, but no
guarantees I'll find it, or find it soon... it was designed for
identical files with different names.
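A rough shell equivalent of that idea (my reconstruction, not the original C program) compares two candidate files and, if they are byte-identical, replaces one name with a hard link to the other:

```shell
#!/bin/sh
# Sketch of the unlink-and-relink idea: if two files have identical
# contents, drop one copy and hard-link its name onto the other.
# This is a reconstruction under assumptions, not the program described.
dedup() {
    a=$1 b=$2
    # Cheap filter first: only files of equal size can be identical.
    [ "$(stat -c %s "$a")" = "$(stat -c %s "$b")" ] || return 1
    # cmp -s exits 0 only when the contents are byte-for-byte identical.
    if cmp -s "$a" "$b"; then
        ln -f "$a" "$b"   # remove b and recreate it as a link to a's inode
    fi
}
```

Both files must live on the same filesystem for the hard link to succeed.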
You could do a tar transfer and then minimize with the program. Of
course, everyone on this list would prefer to use rsync; maybe the
algorithm could be integrated in? :) Maybe I can find the code.
It was written by a very senior individual...
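The tar-transfer step might be sketched like this (a hypothetical example; the `transfer` function, host name, and paths are placeholders). The point is that a single tar stream over the whole tree records the hard links, so they survive the move, unlike copying image by image:

```shell
#!/bin/sh
# Hypothetical one-shot move of the whole tree: tar records hard links
# within one archive, so img2's links back into img1 are preserved.
transfer() {
    srcroot=$1 dstroot=$2   # e.g. /backups on each side (placeholders)
    tar -C "$srcroot" -cf - image_dir | tar -C "$dstroot" -xf -
    # Over the network this would instead pipe through ssh, e.g.:
    #   tar -C "$srcroot" -cf - image_dir | ssh newserver "tar -C ... -xf -"
}
```

Afterwards, a link-minimizing scan on the destination could reclaim any duplicates that were never hard-linked in the first place.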
George Georgalis, information systems scientist <IXOYE><