Feature Request - Load Throttling
Marc Perkel
marc at perkel.com
Wed Feb 18 14:50:07 GMT 2004
Paul Haas wrote:
>
>
>It's a real issue, and it isn't specific to rsync. You've got a webserver
>that runs well on your hardware, assuming reasonable disk caching and/or
>disk I/O rates. rsync comes along and reads lots of files, thus clearing
>your cache, and it walks the directory tree, doing reads spread all over
>the disk. Reads spread over the disk mean the disk heads are
>spending lots of time seeking back and forth in the famous elevator
>algorithm. Your processes end up spending a lot of time waiting for that
>elevator.
>
>
Thanks for understanding the problem. I have two drives in my server and
I use rsync to keep them in sync. But when it runs, the I/O load climbs so
high that it practically stops MySQL from running. And, for the record,
updatedb does the same thing.
>I don't think it requires changes to rsync, certainly nothing significant.
>I think you want a separate process to implement your policy. Something
>only a little more complex than this untested perl script:
>
>===============cut here================
>
>#!/usr/bin/perl -w
>use strict;
>my $tooHigh   = 4;   # Max acceptable load average
>my $checkTime = 10;  # Seconds between checking load average
>my $restTime  = 60;  # Seconds to pause disk hog process when load average high
>my @rsyncPids = @ARGV;
>
># kill 0 sends no signal; it counts the pids that still exist, so the
># loop ends once every watched process has exited.
>while ( kill 0, @rsyncPids ) {
>    if ( LoadAvg() > $tooHigh ) {
>        PausePids(@rsyncPids);
>        sleep($restTime);
>        ResumePids(@rsyncPids);
>    }
>    sleep($checkTime);
>}
>
># Return the 1-minute load average reported by uptime (0 if unparsable).
>sub LoadAvg {
>    my $upString = `uptime`;
>    my ($loadAvg) = ( $upString =~ m/load average: (\d+\.\d+)/ );
>    return defined $loadAvg ? $loadAvg : 0;
>}
>
># Signal names are portable; the numbers for STOP and CONT vary by platform.
>sub PausePids {
>    kill 'STOP', @_;
>}
>sub ResumePids {
>    kill 'CONT', @_;
>}
>
>===============cut here================
>
>
This is very interesting. I'm not a programmer but it's the start of
what I'm looking for and may be something that I could apply to updatedb
as well. Can you make a few mods to it?
First, I'd like to pass it the name of the program on the command line (as
a regex) and have it find the pids. Then have switches for the load level,
check time, and pause time (-l 4 -c 10 -p 60), plus -v for verbose, -V for
version, -h for help, and maybe -q to make existing throttles quit. So if
it were called "throttle.pl" then the command line
might look like this:
throttle -l 4 -c 10 -p 60 "rsync|updatedb"
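A rough sketch of how that command line might be parsed and the pids looked up, offered as a starting point rather than a finished tool. It assumes the core Getopt::Std module and a ps that accepts the POSIX form `ps -e -o pid= -o comm=`. The names ParseArgs, MatchPids and FindPids are made up for illustration, and -V, -h and -q are left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;

# Parse -l (load level), -c (check time), -p (pause time), -v (verbose)
# and one trailing regex. Defaults match the script quoted above.
# Returns a hash reference, or nothing if the arguments are unusable.
sub ParseArgs {
    my @args = @_;
    local @ARGV = @args;               # getopts works on @ARGV
    my %opt = ( l => 4, c => 10, p => 60 );
    getopts( 'l:c:p:v', \%opt ) or return;
    my $pattern = shift @ARGV;
    return unless defined $pattern;
    return { %opt, pattern => $pattern };
}

# Given lines of "PID COMMAND" text, return the pids whose command matches
# the regex. The script's own pid ($$) is skipped so it never stops itself.
sub MatchPids {
    my ( $pattern, @psLines ) = @_;
    my @pids;
    for my $line (@psLines) {
        my ( $pid, $command ) = ( $line =~ m/^\s*(\d+)\s+(.+?)\s*$/ ) or next;
        push @pids, $pid if $command =~ m/$pattern/ && $pid != $$;
    }
    return @pids;
}

# Assumption: ps supports the POSIX -e and -o options.
sub FindPids {
    my ($pattern) = @_;
    return MatchPids( $pattern, `ps -e -o pid= -o comm=` );
}

if ( my $cfg = ParseArgs(@ARGV) ) {
    my @pids = FindPids( $cfg->{pattern} );
    print "load limit $cfg->{l}, check every $cfg->{c}s, pause $cfg->{p}s\n";
    print "matching pids: @pids\n";
}
```

The pid list from FindPids could then be handed to the pause/resume loop in the quoted script in place of @ARGV.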
Also - is there a variable in /proc you can read for load averages?
Looks to me like this is almost a product!
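On the /proc question: on Linux, /proc/loadavg holds the 1, 5 and 15 minute load averages as the first three fields of a single line (for example "0.20 0.18 0.12 1/80 11206"). A minimal sketch of reading it instead of running uptime follows. It assumes a Linux-style /proc, and ParseLoadAvg is a made-up helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Extract the 1-minute load average from one line in /proc/loadavg format.
# Returns undef if the line does not start with a decimal number.
sub ParseLoadAvg {
    my ($line) = @_;
    return unless defined $line;
    my ($oneMinute) = ( $line =~ m/^(\d+\.\d+)\s/ );
    return $oneMinute;
}

# Read the load average from /proc (Linux only). Returns undef if the
# file cannot be opened, for example on a system without /proc/loadavg.
sub LoadAvg {
    my ($path) = @_;
    $path = '/proc/loadavg' unless defined $path;
    open( my $fh, '<', $path ) or return;
    my $line = <$fh>;
    close($fh);
    return ParseLoadAvg($line);
}

my $load = LoadAvg();
print defined $load ? "load average: $load\n" : "load average unavailable\n";
```

This avoids starting a new uptime process on every check, which is a small saving on a machine that is already overloaded.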
>If the load average climbs too high, it pauses rsync, or whatever pids you
>asked to pause.
>
>I don't think rsync needs to be changed at all, provided you avoid
>anything involving timeouts.
>
>There are certainly situations where rsync would be the important task,
>and it would be the other disk hog process that should pause.
>
>It's debatable whether I count as a real developer.
>
>
Looks like you're a real developer to me!!!