Rsyncing a very large directory tree (over 50,000 files)
christianh at edmi.com.au
Thu Jun 15 00:17:49 GMT 2006
> -----Original Message-----
> From: rsync-bounces+christianh=edmi.com.au at lists.samba.org
> [mailto:rsync-bounces+christianh=edmi.com.au at lists.samba.org]
> On Behalf Of Matt McCutchen
> Sent: Thursday, 15 June 2006 4:24 AM
> To: Andrew Hodgson
> Cc: rsync at lists.samba.org
> Subject: Re: Rsyncing a very large directory tree (over 50,000 files)
> On Wed, 2006-06-14 at 19:07 +0100, Andrew Hodgson wrote:
> > Is there anything I need to be aware of before doing this? I started
> > the script this morning, but it was still building the file list after
> > around 15 minutes. Is it better to do it using several points, then
> > when I have the structure on the other machine I can then do the whole
> > tree in one go? Will the complete file list need to be sent across
> > each time I run the program?
> Yes, rsync will send the complete file list each time it runs. It seems
> odd to me that building the file list would take 15 minutes; when I back
> up the system partition of my computer (300,000 files) rsync takes
> perhaps 5 minutes to build the file list. I don't think using several
> points would be better or worse than doing it all at once, just more
Multiple points just make it more complicated - trust me on that one. Doing
everything at once does consume some RAM, though not massive amounts.
Budget 100 bytes per file; I see approx 70-80 megabytes for 1 million files.
I do run two separate syncs so that the more important data is done first.
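
As a rough illustration of the 100 bytes per file budget mentioned above, the arithmetic can be sketched in Python. The function name and the list of tree sizes are made up for this example; the only figure taken from this post is the per-file budget, and real usage will vary with the rsync version and options.

```python
# Rough estimate of rsync file-list memory from a per-file budget.
# BYTES_PER_FILE_BUDGET is the conservative figure quoted in this post;
# the observed figure was closer to 70-80 bytes per file.
BYTES_PER_FILE_BUDGET = 100


def file_list_memory_mb(num_files, bytes_per_file=BYTES_PER_FILE_BUDGET):
    """Return the estimated file-list size in megabytes (10**6 bytes)."""
    return num_files * bytes_per_file / 1_000_000


if __name__ == "__main__":
    # Hypothetical tree sizes, chosen to match the counts discussed here.
    for label, count in [
        ("50k-file tree", 50_000),
        ("400k-file sync", 400_000),
        ("960k-file sync", 960_000),
        ("1 million files", 1_000_000),
    ]:
        print(f"{label}: about {file_list_memory_mb(count):.0f} MB")
```

On that budget, a 50,000-file tree should need only a few megabytes for the file list, so memory is unlikely to be the limiting factor at that size.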
I have two syncs nightly over 100Mb LAN. They don't run at the same time.
Both from one Dual Xeon 3.2GHz 4GB RAM to a Dual P3 1GHz 2GB RAM (both
First sync is from hardware RAID5 with 10k SCSI disks to a RAID0 striped
array (2 disks):
960k files take ~8000 secs, or more than 2 hours, to build the file list.
Second is from hardware RAID1 with 15k SCSI disks to the same RAID0 array:
400k files take ~900 seconds, or 15 minutes, to build the file list.
So 50k files in 15 minutes isn't that flash.
It would also appear my RAID 5 is not very fast (it's the same card as the
RAID1). Not sure why that is. When the syncs run, it's 99% idle.
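
To put the timings above on a common scale, here is a small Python sketch that converts each one into files scanned per second. The helper name and the labels are invented for this example. The 50k figure is treated as an upper bound on the rate, since that run was still building the file list after 15 minutes.

```python
def files_per_second(num_files, seconds):
    """Return the average file-list build rate in files per second."""
    return num_files / seconds


# (file count, seconds spent building the file list), as quoted in this thread.
runs = {
    "RAID5 sync, 960k files": (960_000, 8000),
    "RAID1 sync, 400k files": (400_000, 900),
    "50k tree, 15 minutes (not yet finished)": (50_000, 15 * 60),
}

if __name__ == "__main__":
    for label, (count, secs) in runs.items():
        print(f"{label}: {files_per_second(count, secs):.1f} files/sec")
```

This works out to roughly 120 files per second on the RAID5 source and about 444 on the RAID1 source, against at most about 56 for the 50k tree.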