Rsyncing a very large directory tree (over 50,000 files)

Christian Hack christianh at
Thu Jun 15 00:17:49 GMT 2006

> -----Original Message-----
> On Behalf Of Matt McCutchen
> Sent: Thursday, 15 June 2006 4:24 AM
> To: Andrew Hodgson
> Cc: rsync at
> Subject: Re: Rsyncing a very large directory tree (over 50,000 files)
> On Wed, 2006-06-14 at 19:07 +0100, Andrew Hodgson wrote:
> > Is there anything I need to be aware of before doing this?  I started
> > the script this morning, but it was still building the file list after
> > around 15 minutes.  Is it better to do it using several points, then
> > when I have the structure on the other machine I can then do the whole
> > tree in one go?  Will the complete file list need to be sent across
> > each time I run the program?
>
> Yes, rsync will send the complete file list each time it runs.  It seems
> odd to me that building the file list would take 15 minutes; when I back
> up the system partition of my computer (300,000 files) rsync takes
> perhaps 5 minutes to build the file list.  I don't think using several
> points would be better or worse than doing it all at once, just more
> complicated.

Multiple sync points just make it more complicated - trust me on that one.
Doing everything at once does consume some RAM, though not massive amounts.
Budget 100 bytes per file; I see approx 70-80 megabytes for 1 million files.
I do run two separate syncs, so that the more important data is done first.
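
To put that budget into numbers, here is a rough Python sketch. It only
multiplies the 100-bytes-per-file rule of thumb from above by a file count;
the function names are made up for illustration, and the result is a budget,
not a measurement of what rsync actually allocates.

```python
# Rule-of-thumb budget from this post: about 100 bytes of RAM per file.
BYTES_PER_FILE_BUDGET = 100


def file_list_bytes(num_files, bytes_per_file=BYTES_PER_FILE_BUDGET):
    """Return the budgeted file-list size in bytes for num_files entries."""
    return num_files * bytes_per_file


def to_mb(num_bytes):
    """Convert bytes to decimal megabytes (10**6 bytes)."""
    return num_bytes / 1_000_000


# File counts mentioned in this thread.
for n in (50_000, 400_000, 960_000, 1_000_000):
    print(f"{n:>9} files -> about {to_mb(file_list_bytes(n)):.0f} MB budgeted")
```

The 70-80 megabytes I actually see for 1 million files sits a bit under the
budgeted figure, so the budget leaves some headroom.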

I have two syncs nightly over a 100Mb LAN. They don't run at the same time.
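
For what it's worth, running the syncs one after the other with the important
data first could be sketched like this in Python. The paths and the host name
are hypothetical placeholders, not my real setup, and `-a` is just the usual
archive flag; this is not the script I actually run.

```python
import subprocess

# Hypothetical source/destination pairs, most important data first.
SYNC_JOBS = [
    ("/data/important/", "backuphost:/backup/important/"),
    ("/data/other/", "backuphost:/backup/other/"),
]


def build_commands(jobs):
    """Return one rsync argument list per job, preserving the job order."""
    return [["rsync", "-a", src, dst] for src, dst in jobs]


def run_in_order(commands, runner=subprocess.run):
    """Run each command to completion before starting the next one.

    Returns the list of exit codes in the same order as the commands.
    """
    return [runner(cmd).returncode for cmd in commands]
```

Calling `run_in_order(build_commands(SYNC_JOBS))` from a nightly job would
start the second sync only after the first has finished, so the two never
overlap.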

Both from one Dual Xeon 3.2GHz 4GB RAM to a Dual P3 1GHz 2GB RAM (both
First sync is from hardware RAID5 with 10k SCSI disks to a RAID0 striped
array (2 disks).
960k files takes ~8000 secs, or more than 2 hours, to build the file list.

Second is from hardware RAID1 with 15k SCSI disks to the same RAID0 array.
400k files takes ~900 seconds, or 15 minutes, to build the file list.

So 15 minutes for 50k files isn't that flash.
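
The comparison is easier to see as files scanned per second. A quick Python
sketch using only the numbers from this thread (Andrew's figure is an upper
bound on his rate, since his list was still building after 15 minutes with
over 50,000 files):

```python
def files_per_second(num_files, seconds):
    """Average file-list build rate in files per second."""
    return num_files / seconds


# (file count, seconds spent building the file list), as quoted in this thread.
runs = {
    "RAID5 source (960k files, ~8000 s)": (960_000, 8000),
    "RAID1 source (400k files, ~900 s)": (400_000, 900),
    "Andrew's tree (50k files, 15 min)": (50_000, 15 * 60),
}

for label, (count, secs) in runs.items():
    print(f"{label}: {files_per_second(count, secs):.0f} files/s")
```

That works out to roughly 120 files/s for the RAID5 source, 444 files/s for
the RAID1 source, and at most about 56 files/s for the 50k tree.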

It would also appear my RAID 5 is not very fast (it's the same card as the
RAID1). Not sure why that is. When the syncs run, it's 99% idle.

