rsync mechanics question
jamie at shareable.org
Thu May 10 02:27:51 GMT 2007
Tom Riley wrote:
> However, the curiosity comes in with my source data taking up 86gigs of
> data on a 100g partition, and as the copy progresses the destination
> drive is reporting 240 gigs of usage.
> So as far as I can tell, rsync is working and the data integrity seems
> good, it's simply taking up 2.5 times the space.
Do you need the -S (--sparse) option?
Omitting it when some of the source files are sparse is one common reason
files take more space after being copied on Unix in general. If
there are sparse files, -S will reduce their size at the destination
to something more reasonable, but I don't know if they'll be exactly
the same size.
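To see whether the source has any sparse files at all, something like the following might do. It is only a sketch: list_sparse is a name I made up, it assumes GNU find (for -printf), and it simply lists regular files whose allocated 512-byte blocks add up to less than their length. Some filesystems store very small files without allocating blocks, so tiny files may show up as false positives.

```shell
# Sketch, assuming GNU find. list_sparse is a hypothetical helper name.
# Prints regular files under DIR whose allocated space (512-byte blocks)
# is smaller than their apparent size, i.e. files that contain holes.
list_sparse() {
    find "${1:-.}" -type f -printf '%b\t%s\t%p\n' |
        awk -F'\t' '$1 * 512 < $2 { print $3 }'
}

# Usage (the path is an example only):
#   list_sparse /path/to/source
```

If this prints nothing, -S is unlikely to make a difference.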
Secondly, do you need the -H (--hard-links) option?
Omitting this, when some of the source files are hard linked, would
cause multiple copies of the same file to be created on the destination.
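A rough way to check for hard links on the source, again assuming GNU find. hardlink_summary is a hypothetical name. It counts the names of regular files with more than one link, counts the distinct inodes behind them, and prints the difference, which is about how many extra file copies you would get without -H. Inode numbers are only unique within one filesystem, so run it on a single partition.

```shell
# Sketch, assuming GNU find. hardlink_summary is a hypothetical helper name.
# names:        file names under DIR that share an inode with another name
# inodes:       distinct inodes behind those names
# extra_copies: additional file copies a transfer without -H would create
hardlink_summary() {
    root=${1:-.}
    names=$(find "$root" -type f -links +1 -printf '%i\n' | wc -l)
    inodes=$(find "$root" -type f -links +1 -printf '%i\n' | sort -u | wc -l)
    echo "names=$names inodes=$inodes extra_copies=$((names - inodes))"
}

# Usage (the path is an example only):
#   hardlink_summary /path/to/source
```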
To be sure of a clean copy with -S and -H, I think you need to start
with an empty destination, the first time. This will show you if
those options have helped.
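As a sketch of that first clean run: clean_copy below is a made-up wrapper, not anything rsync provides. It refuses to run if the destination already has anything in it, and otherwise calls rsync with -a, -H and -S. The -a is my assumption about what you are already using.

```shell
# Sketch only. clean_copy is a hypothetical wrapper around rsync.
# Refuses to copy into a destination directory that is not empty.
clean_copy() {
    src=$1
    dest=$2
    if [ -d "$dest" ] && [ -n "$(ls -A "$dest")" ]; then
        echo "refusing: $dest is not empty" >&2
        return 1
    fi
    # -a archive mode, -H preserve hard links, -S handle sparse files
    rsync -aHS "$src"/ "$dest"/
}

# Usage (paths are examples only):
#   clean_copy /mnt/source /mnt/destination
```

Afterwards, comparing "du -s" on both sides should show whether the two options helped.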
You can check if these options are relevant without actually copying,
using "du -s" to get the number of bytes and "df -i" to get the number
of inodes used on the source disk, "find . | wc -l" to get the number
of inodes (approximately) that will be created without -H, and "find .
-printf '.+((%s+4095)/4096*4096)\n' | bc | tail -n1" (works on Linux
anyway; use plain bc rather than bc -l, so that the division truncates
and each size is rounded up to a whole 4096-byte block) to get the
number of bytes (approximately) that will be created without both -S
and -H.
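The same estimate can be done in one pass with awk instead of bc. This is a sketch assuming GNU find. estimate_copy is a hypothetical name, and the 4096-byte block size is a guess about the destination filesystem that you can override with a second argument.

```shell
# Sketch, assuming GNU find. estimate_copy is a hypothetical helper name.
# Prints the number of directory entries under DIR and the sum of their
# sizes, each rounded up to a whole block (default 4096 bytes). This
# approximates what a copy without -S and -H would create.
estimate_copy() {
    find "${1:-.}" -printf '%s\n' |
        awk -v bs="${2:-4096}" '
            { n++; bytes += int(($1 + bs - 1) / bs) * bs }
            END { printf "inodes=%d bytes=%.0f\n", n, bytes }'
}

# Usage (the path is an example only):
#   estimate_copy /path/to/source
```

If the byte count comes out near what the destination is reporting while "du -s" on the source says much less, then sparse files or hard links are the likely cause.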
> This crosses realms of expertise that I'm a bit light on, and am fast
> coming up to speed on. I'm trying to determine if there is some mechanic
> within the rsync process that could account for the used space. James
> mentioned that rsync creates temp files which could account for double
> disk usage, and I'm following up on that.
It only creates one temp file at a time, though, and moves it into
place before starting the next one. So if the largest individual file
is 1G, you'd only expect 1G at most extra during the transfer, and
nothing by the end. It cannot possibly explain taking 2.5 times the
space.