Paul Haas: re 2.5.5 fork

Paul Haas paulh at hamjudo.com
Mon Sep 9 15:09:00 EST 2002


On Sun, 8 Sep 2002, Trevor Marshall wrote:

> Paul,
> Was running 2.5.4, Compiled and upgraded to 2.5.5
>
> No crash this time, but it again failed on a (possibly open) logfile. There
> might be a hint there...
> Ran it again - no crash, same error...


The problem in 2.5.4 was that if fork() failed, rsync would kill many
processes before it exited.  In 2.5.5, if fork()  fails, rsync exits
without killing extra processes.  The fix doesn't change whatever makes
fork() fail.



> Verbose output was:
>
> Trevors:~ # ./make_snapshot.sh
> umount: /dev/hdc1: not mounted
> building file list ... done
> ./
> etc/
> etc/mtab
> etc/ntp.drift
> var/log/
> var/log/httpd/
> var/log/httpd/access.combined.log
> rsync: error writing 4 unbuffered bytes - exiting: Broken pipe
> rsync error: error in rsync protocol data stream (code 12) at io.c(463)
> Trevors:~ #
>
> ---second time snippet-------
> var/log/httpd/access.combined.log
> rsync: error writing 4 unbuffered bytes - exiting: Broken pipe
> rsync error: error in rsync protocol data stream (code 12) at io.c(463)
> ----------------
>
> I am not too comfortable using CRON to kill open rsync processes with the
> patch you suggested. Oh well. I will look at it later tonight.

Don't try that.  That was a suggestion from when I had very little clue as
to what was broken.  Now I have slightly more of a clue.

Today's question: is fork() failing, and if so, why?

Is there more than one rsync running?  In 2.5.4 a failure in another rsync
process could kill your rsync.  I haven't studied the code recently, but I
don't think there are any calls to fork() after it has started transferring
files.

fork() can fail because you've reached the user limit or system
limit on running processes, or you've run out of memory.  If you run out
of memory, all sorts of things fail, so you'd probably notice.

As the user who runs rsync, type "ulimit -a" to see how many
processes that user is allowed.  For me, the answer is 256.
    bash$ ulimit -a | grep proc
    max user processes       256

Then see how many processes that user is running with "ps".  I'm running 18
processes:
    bash$ ps  U `whoami`  | wc -l
        18

Then see the total number of running processes with "ps":
    bash$ ps ax | wc -l
        68
My system is a long way from running out of processes.
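
The three checks above can be rolled into one script.  This is only a
sketch under a few assumptions: the "headroom" helper and the variable
names are made up for illustration, it uses the POSIX-style "ps -U" and
"-o pid=" options rather than the BSD-style "ps U" shown above, and
"ulimit -u" is a bash/ksh feature, so the script reports "unknown" when
the shell doesn't support it.

```shell
#!/bin/sh
# Sketch: compare the current user's process count with the per-user limit.
# Run it as the user who runs rsync, since limits are per user.

# headroom LIMIT USED: print how many more processes fit under LIMIT.
# A non-numeric LIMIT ("unlimited", "unknown") is passed through unchanged.
headroom() {
    case $1 in
        ''|*[!0-9]*) echo "$1" ;;
        *)           echo $(( $1 - $2 )) ;;
    esac
}

user=$(id -un)

# Per-user process limit; falls back to "unknown" if this shell's
# ulimit builtin has no -u option.
limit=$(ulimit -u 2>/dev/null || echo unknown)

# "-o pid=" prints one PID per line with no header, so wc -l is an exact count.
mine=$(ps -U "$user" -o pid= | wc -l | tr -d ' ')
total=$(ps -A -o pid= | wc -l | tr -d ' ')

echo "user=$user limit=$limit mine=$mine total=$total headroom=$(headroom "$limit" "$mine")"
```

If the headroom is small compared with the number of processes one run of
your scripts starts, fork() failures are a plausible suspect.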

I ran into the bug because we had a script that invoked a script that ...
invoked rsync, which invoked ssh, for a total of something like 8 processes
per client, and we ran it for 20 clients.  We had misconfigured ssh so it
hung on all the clients, leaving all 160 processes running.  Then cron came
along and started a whole new set of 160 processes, which didn't work
because the limit was 255.  Fortunately, it wasn't running as root.

The ridiculous level of script nesting was from an accumulation of
history, and easy to clean up.

> Thanks - at least it doesn't crash all processes now...

> Sincerely
> ..Trevor..
>
> At 11:04 PM 9/8/2002 -0400, you wrote:
> >Looking at http://rsync.samba.org, we see under bug fixes for 2.5.5
> >
> >      Fix situation where failure to fork (e.g. because out of process
> >      slots) would cause rsync to kill all processes owned by the
> >      current user.  Yes, really!  (Paul Haas, Martin Pool)
> >
> >If rsync is running as root when it triggers this bug it will kill all
> >processes.  This is indistinguishable from a system crash.  Processes
> >running as root are allowed to shut down the system.  So this is not a
> >Kernel bug.
> >
> >If you're still running 2.5.4, upgrade to 2.5.5.  If you're running 2.5.5,
> >try replacing the calls to kill() with an fprintf() to a logfile.  Then
> >manually kill any leftover rsync processes.
> >
> >--
> >Paul Haas
> >paulh at hamjudo.com
> >
> >
> >--
> >To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
> >Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html
> >