wmatthews at sepaton.com
Wed May 26 13:22:32 GMT 2004
Wayne replied to my original note, in which I reported that, in a special setup I am using to probe rsync and build a behavioral model, bwlimit= produced bimodal behavior around a value of 4000 kbyte/sec.
He responded with a patch that I have tested in a limited way. I have a push scenario from a local site to a remote site. I use a file that is 6.3 megabytes in size, whose checksums (when written to batch) are 60k bytes and whose deltas (when written to batch) are 40k bytes. I touch the local file (making no modifications other than changing its modification time) and then rsync the file with the command string
"time rsync -ar --rsh=rsh --bwlimit=<x> foo.tar remote://test" where foo.tar is in the local directory /test. I have recorded the "real", "user", and "sys" values to get an idea of the shape of the curve as I vary <x>. I have values for <x> of 8000, 4000, 2000, 1000, 500, 250, 175, 100, 50, 25, 12, 6, 3, 2, 1. For lower values of <x> the resulting curve is very nearly a straight line. At the upper end the "real" value is constant. There is a knee in the curve between 500 and 100 for this particular file size.
I have done nothing to validate that the peak transmission rate is accurate. It is not important to my original goal.
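For anyone who wants to repeat the sweep, here is a rough Python sketch of the procedure described above. It is only an illustration under some assumptions: the function names are made up for this example, the destination string is copied verbatim from my command line, and it records wall-clock time only (the "real" value), not "user" or "sys".

```python
import os
import subprocess
import time

# The <x> values listed in the post.
BWLIMITS = [8000, 4000, 2000, 1000, 500, 250, 175, 100, 50, 25, 12, 6, 3, 2, 1]


def build_command(bwlimit, src="foo.tar", dest="remote://test"):
    """Return the rsync argument list from the post for one bwlimit value."""
    return ["rsync", "-ar", "--rsh=rsh", f"--bwlimit={bwlimit}", src, dest]


def time_run(bwlimit, workdir="/test", src="foo.tar"):
    """Touch the file, run one transfer, and return wall-clock seconds."""
    # Same effect as touch: only the modification time changes.
    os.utime(os.path.join(workdir, src))
    start = time.perf_counter()
    subprocess.run(build_command(bwlimit, src), cwd=workdir, check=True)
    return time.perf_counter() - start


def sweep(limits=BWLIMITS, runner=time_run):
    """Time one transfer per limit; returns [(bwlimit, seconds), ...]."""
    return [(limit, runner(limit)) for limit in limits]
```

Calling sweep() runs the real transfers, so nothing happens until it is invoked. Passing a different runner makes it possible to dry-run the loop without touching the network.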
I would like to express my appreciation to Wayne for the patch. It allows me to do my work.
I have also tested the patch against a large file (9.2 gigabytes) that results in a checksum file of 3.5 Meg and a delta file of 2.3 Meg. Using bwlimit=1, I compared the measured "real" time against an expected "real" time and got a comparable result. My expected completion time was 96 seconds, based upon 5.8 Meg of total traffic, and the measured time was 114 seconds. That leaves only about 18 seconds unaccounted for, so the bandwidth throttling may not be 100% on the money, but I don't think it is off by much. My intuition is that a large part of that gap is idle time in the communications pipe caused by processing and scheduling delays, etc.
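The comparison between expected and measured time is just a subtraction and a ratio. A small sketch, using the two times quoted above (the function name is made up for this example):

```python
def timing_gap(expected_s, measured_s):
    """Return (gap in seconds, gap as a fraction of the expected time)."""
    gap = measured_s - expected_s
    return gap, gap / expected_s


# Expected and measured "real" times from the large-file test.
gap, fraction = timing_gap(96, 114)
print(f"gap = {gap} s, overshoot = {fraction:.2%}")
```

The fraction is the more useful number when comparing runs of different lengths, since a fixed startup cost shrinks in relative terms as the transfer gets longer.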
Wayne's patch is better than the previous implementation.
Wayne's patch ensures that the "average" transmission rate will not exceed the bwlimit, and my partial testing seems to validate that.
It does not enforce that the instantaneous transmission rate will not exceed the bwlimit.
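To make the average-versus-instantaneous distinction concrete, here is a minimal sketch of a limiter that bounds only the average rate. It is not the code from Wayne's patch and not how rsync itself is written; the class and its parameters are invented for illustration. Each chunk is assumed to go out at full wire speed, and the limiter then sleeps long enough that the running average comes back under the limit.

```python
import time


class AverageRateLimiter:
    """Keep the long-run average at or below rate_bytes_per_s.

    Illustrative only; not the implementation discussed in this thread.
    """

    def __init__(self, rate_bytes_per_s, clock=time.monotonic, sleep=time.sleep):
        self.rate = rate_bytes_per_s
        self.clock = clock
        self.sleep = sleep
        self.start = clock()
        self.sent = 0

    def account(self, nbytes):
        """Record nbytes as already written, then sleep off any excess.

        Returns the number of seconds slept (0.0 if none was needed).
        """
        self.sent += nbytes
        # Earliest moment at which `sent` bytes fits under the average limit.
        earliest = self.start + self.sent / self.rate
        delay = earliest - self.clock()
        if delay > 0:
            self.sleep(delay)
            return delay
        return 0.0
```

Because the sleep happens after the write, the burst inside each chunk can run well above the limit, and after an idle period the limiter lets traffic catch up without sleeping. Enforcing the instantaneous rate would need smaller chunks or pacing before each write.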
In my case, I am more interested in what the total network between the local and the remote system will allow, so Wayne's patch meets my needs. It may not be exactly accurate, but it is close enough for me to create my models.
bwlimit is not mainline functionality. It is useful for the kind of thing I am doing, but how much effort should we put into it? When rsync is doing what it was designed to do, don't we need it to run as fast as the network will allow? Or is there value in throttling it during normal operation? How accurate does the throttle have to be?
PS - If there is interest in the data I am collecting for my model, I am willing to share it when I have completed the testing. It will be targeted at my needs (and thus incomplete for some others), and it will be based upon the hardware and OS environment of my servers, so it will not necessarily be of any value in other environments.