efficiency issue with rsync.....

Tue Jun 17 22:46:41 EST 2003

On Tue, Jun 17, 2003 at 02:49:09PM +0200, Rogier Wolff wrote:
> 
> Hi rsync team, 
> 
> I thought that rsync would try to overlap computing and IO on both
> machines. 
> 
> I'm rsyncing a large tree (18G) and am keeping an eye on that. 
> Suddenly my side completely stopped. No IO visible, no CPU
> time spent. The otherside was doing 100% CPU. Then the other
> side started to do disk IO. Then suddenly the activities moved
> over to my side, and I saw things moving again in the "-v --progress"
> output. 

Rsync tends to be mostly i/o bound, especially with fast
CPUs.  In most cases network i/o is the main limitation,
although if little has changed the disk will be the limit.
For the most part it will chug along with data in the pipe
almost all the time unless you have a fast network
connection.

> The transfer happened to hit my "spam" mailbox, 400Mb of mostly-stable
> data. I probably rewrote the mailbox yesterday, having changed some
> flags on mails somewhere in the middle. 

This would indeed produce the effect you describe.  With 400MB
(400Mb == 50MB) it would take a while to generate the block
sums and then another, larger chunk of time to match those
with rolling checksums.  Your description makes it sound
like you were sending; during the 100% CPU interval it was
probably hashing the block checksums prior to doing the
rolling checksum match (disk i/o).
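
To make the two phases concrete, here is a rough sketch in Python
of block sums followed by a rolling-checksum match.  It is for
illustration only and is not rsync's actual code: the function
names are made up, MD5 stands in for the strong hash, and the
window always slides one byte at a time rather than skipping
ahead after a match.

```python
import hashlib

M = 1 << 16  # both halves of the weak checksum are kept modulo 2**16


def weak_checksum(block):
    """Return the (a, b) pair of the weak checksum for one block."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window one byte forward without re-summing it."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def block_sums(data, block_size):
    """Phase 1: checksum every full block of the existing file."""
    table = {}
    for index in range(len(data) // block_size):
        block = data[index * block_size:(index + 1) * block_size]
        entry = (hashlib.md5(block).digest(), index)
        table.setdefault(weak_checksum(block), []).append(entry)
    return table


def find_matches(table, data, block_size):
    """Phase 2: slide over the other copy and look up each window."""
    matches = []
    if len(data) < block_size:
        return matches
    a, b = weak_checksum(data[:block_size])
    offset = 0
    while True:
        for digest, index in table.get((a, b), ()):
            window = data[offset:offset + block_size]
            # The weak sum can collide, so confirm with the strong hash.
            if hashlib.md5(window).digest() == digest:
                matches.append((offset, index))
                break
        if offset + block_size >= len(data):
            break
        a, b = roll(a, b, data[offset], data[offset + block_size],
                    block_size)
        offset += 1
    return matches
```

Phase 1 is pure hashing over the whole file, and phase 2 has to
read the other copy end to end, which fits the pattern of one
burst of CPU followed by a burst of disk i/o.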

Upgrading to the CVS version will probably perform a little
better due to larger block sizes on large files.

I find I prefer using the maildir format for large or active
mailboxes.  Mutt and procmail both work well with a mixture
of maildir and mbox.  Maildir is advantageous with dirvish's
link-dest style backups, and with unison for offline
mail reading.  The drawbacks are fragmentation and more files
to compare.

> Note: This is NOT a request: "Please go and fix". Just an "I noticed
> that it might be possible to make it more efficient". Feel free to 
> keep this in mind when performing further development. 
> 
> (Still rsync is a very good tool, transferring the 17G of data in 
> a couple of hours....)
> 
> Oh, another thing: When two files are almost the same, (e.g. I just
> added something to the end of a mailbox) the bandwidth of the link
> is not fully used while the counters are running quickly. Is this
> unavoidable "the machine simply won't generate enough work to
> keep the link busy" or is there a bug in the "limit bandwidht to XX"
> code? (I limit to 30k per second, and I see rsync doing lots of small 
> sleeps. If you try to sleep for 100 usec, you'll actually be woken up 
> by the kernel after a whopping 20 msec.)

If you are only updating 10% of the file, your receiver's
disks would have to be 20 to 29 times as fast as the network
link to keep the network link busy.

The bwlimit function is very crude.  It simply sleeps a
certain amount for every KB of data transferred, with no
regard for processing or transmission time.  Its real value
is to allow you to control the impact on other traffic.
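
A minimal sketch of that kind of limiter, in Python for
illustration (this is not rsync's actual C code; the function
name and the injectable sleep parameter are made up for the
example):

```python
import time


def throttled_write(chunks, write, kbps, sleep=time.sleep):
    """Write each chunk, then sleep in proportion to its size.

    The delay ignores how long write() itself took, so the real
    throughput can only come out below the requested limit.
    """
    total_sleep = 0.0
    for chunk in chunks:
        write(chunk)
        delay = len(chunk) / (kbps * 1024.0)  # seconds owed for this chunk
        sleep(delay)
        total_sleep += delay
    return total_sleep
```

With a 30k limit, each 1KB written asks for a sleep of about
33 msec.  Smaller writes ask for proportionally shorter sleeps,
and if the kernel rounds those up to its timer granularity the
link ends up idle for longer than the limit requires.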

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt