Feature Request - Load Throttling

Marc Perkel marc at perkel.com
Wed Feb 18 14:50:07 GMT 2004



Paul Haas wrote:

>
>
>It's a real issue, and it isn't specific to rsync.  You've got a webserver
>that runs well on your hardware, assuming reasonable disk caching and/or
>disk I/O rates.  rsync comes along and reads lots of files, thus clearing
>your cache, and it walks the directory tree, doing reads spread all over
>the disk.  The reads spread over the disk mean the disk heads spend
>lots of time seeking back and forth in the famous elevator
>algorithm.  Your processes end up spending a lot of time waiting for that
>elevator.
>  
>
Thanks for understanding the problem. I have two drives in my server and 
I use rsync to keep them in sync. But when rsync runs, the I/O load climbs 
so high that it practically stops MySQL from running. And - for the 
record - updatedb does the same thing.

>I don't think it requires changes to rsync, certainly nothing significant.
>I think you want a separate process to implement your policy.  Something
>only a little more complex than this untested perl script:
>
>===============cut here================
>
>#!/usr/bin/perl
>use strict;
>use warnings;
>
>my $tooHigh   = 4;   # Max acceptable load average
>my $checkTime = 10;  # Seconds between checking load average
>my $restTime  = 60;  # Seconds to pause disk hog process when load average high
>my @rsyncPids = @ARGV;
>
># kill 0 only tests whether a pid can still be signalled, so the loop
># ends once every watched process has exited.
>while (@rsyncPids = grep { kill 0, $_ } @rsyncPids) {
>  if ( LoadAvg() > $tooHigh ) {
>    PausePids(@rsyncPids);
>    sleep($restTime);
>    ResumePids(@rsyncPids);
>  }
>  sleep($checkTime);
>}
>
># One-minute load average as reported by uptime (0 if it cannot be parsed).
>sub LoadAvg {
>  my $upString = `uptime`;
>  my ($loadAvg) = ($upString =~ m/load averages?: (\d+\.\d+)/);
>  return defined $loadAvg ? $loadAvg : 0;
>}
>
># Signal names are portable; the numbers 19 and 18 are only right on
># some systems.
>sub PausePids  { kill 'STOP', @_ }
>sub ResumePids { kill 'CONT', @_ }
>
>===============cut here================
>  
>

This is very interesting. I'm not a programmer but it's the start of 
what I'm looking for and may be something that I could apply to updatedb 
as well. Can you make a few mods to it?

First - I'd like to pass it a command line with the name of the 
program (as a regex) and have it find the pids. Then - have switches for 
the load level, check time, and pause time (-l 4 -c 10 -p 60), plus -v 
for verbose, -V for version, -h for help, and maybe -q to make existing 
throttles quit. So if it were called "throttle.pl" then the command line 
might look like this:

throttle.pl -l 4 -c 10 -p 60 "rsync|updatedb"
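
For illustration only, here is a rough sketch of how that command line
could be handled in Perl. The helper names (parse_args, match_pids,
find_pids) are made up for this example and are not part of the quoted
script. It assumes the core Getopt::Long module and a ps that accepts
the POSIX "-e -o pid=,comm=" options. What -q should actually do is left
open, so the sketch only records the flag.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# -v and -V are different switches, so option names must be case-sensitive.
Getopt::Long::Configure('no_ignore_case');

# Parse the proposed switches out of an argument list (hypothetical helper).
# Returns a hash ref of settings, or undef if the arguments are unusable.
sub parse_args {
    my @args = @_;
    my %opt = (load => 4, check => 10, pause => 60);  # values from the quoted script
    GetOptionsFromArray(
        \@args,
        'l=f' => \$opt{load},     # load average that triggers a pause
        'c=i' => \$opt{check},    # seconds between load checks
        'p=i' => \$opt{pause},    # seconds to keep the processes stopped
        'v'   => \$opt{verbose},
        'V'   => \$opt{version},
        'h'   => \$opt{help},
        'q'   => \$opt{quit},     # only recorded here; behaviour is undecided
    ) or return;
    return \%opt if $opt{help} || $opt{version} || $opt{quit};
    return unless @args == 1;     # exactly one program-name regex expected
    $opt{pattern} = $args[0];
    return \%opt;
}

# Pick the pids whose command name matches $pattern out of lines shaped
# like "  1234 rsync" (the output format of `ps -e -o pid=,comm=`).
sub match_pids {
    my ($pattern, @ps_lines) = @_;
    my $re = qr/$pattern/;
    my @pids;
    for my $line (@ps_lines) {
        my ($pid, $comm) = $line =~ /^\s*(\d+)\s+(.*\S)/ or next;
        push @pids, $pid if $comm =~ $re;
    }
    return @pids;
}

# Ask ps for the running processes and return the matching pids,
# leaving out the throttle process itself.
sub find_pids {
    my ($pattern) = @_;
    my @lines = `ps -e -o pid=,comm=`;
    return grep { $_ != $$ } match_pids($pattern, @lines);
}
```

The main loop of the quoted script could then start from something like
"my $opt = parse_args(@ARGV) or die ..." and call
find_pids($opt->{pattern}) on each pass instead of taking pids from
@ARGV, so that processes started later are picked up too.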

Also - is there a variable in /proc you can read for load averages? 
Looks to me like this is almost a product!
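
On the /proc question: on Linux, /proc/loadavg holds the 1-, 5- and
15-minute load averages as the first three fields of a single line, so
there is no need to parse the output of uptime. A minimal sketch,
assuming a Linux /proc filesystem (the function names are invented for
this example):

```perl
use strict;
use warnings;

# Pull the 1-, 5- and 15-minute load averages out of a line in the
# /proc/loadavg format, e.g. "0.20 0.18 0.12 1/80 11206".
sub parse_loadavg {
    my ($line) = @_;
    my @avg = $line =~ /^(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)/ or return;
    return @avg;
}

# Read the current load averages; returns an empty list if the file
# is missing (for example on a system without a Linux-style /proc).
sub read_loadavg {
    my $path = shift // '/proc/loadavg';
    open my $fh, '<', $path or return;
    my $line = <$fh>;
    close $fh;
    return defined $line ? parse_loadavg($line) : ();
}
```

LoadAvg in the quoted script could then be replaced by something like
"my ($one) = read_loadavg();", treating an empty result as "load
unknown" rather than as zero.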

>If the load average climbs too high, it pauses rsync, or whatever pids you
>asked to pause.
>
>I don't think rsync needs to be changed at all, provided you avoid
>anything involving timeouts.
>
>There are certainly situations where rsync would be the important task,
>and it would be the other disk hog process that should pause.
>
>It's debatable whether I count as a real developer.
>  
>

Looks like you're a real developer to me!!!


