SIGCHLD SIG_IGN, then wait - warning messages
jw schultz
jw at pegasys.ws
Fri Jul 25 06:11:12 EST 2003
On Thu, Jul 24, 2003 at 09:09:48AM -0400, Hardy Merrill wrote:
> What is the status on this? Is it being considered, or just
> ignored? Some (any) response would be appreciated :)
> If access to the bug is an issue, I can attach it - just let
> me know.
>
> Thanks.
>
> --
> Hardy Merrill
> Red Hat, Inc.
>
> Hardy Merrill [hmerrill at redhat.com] wrote:
> > Rsync maintainers please review rsync bug
> >
> > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=98740
> >
> > The code in question is in socket.c in start_accept_loop.
> > The user is getting these warning messages:
> >
> > -----------------------------
> > Jul 8 12:56:59 ftp kernel: application bug: rsync(1051) has SIGCHLD set to
> > SIG_IGN but calls wait().
> > Jul 8 12:56:59 ftp kernel: (see the NOTES section of 'man 2 wait'). Workaround
> > activated.
> > -----------------------------
> >
> > I was able to reproduce that error with rsync server and
> > client on Red Hat OS's on a fairly consistent basis by
> > running an rsync server (rsync version 2.5.6 protocol
> > version 27 from cvs 7/14) from the command line
> > (rsync --daemon), and connecting with an rsync-2.4.6
> > client with this command:
> > rsync -avv my.server.com::test/one /tmp/rsync_test
> >
> > I also produced the same error by using the same command
> > from an rsync-2.5.6 protocol version 26 client.
> >
> > Here is my /etc/rsyncd.conf file:
> > -----------------------------
> > log file = /var/log/rsync
> > [test]
> > uid = rsync1
> > gid = rsync1
> > path = /rsync_test
> > comment = Rsync Test - Comment
> > -----------------------------
> >
> > Quoting one of the bug messages,
> > "These messages are a warning that rsync is not
> > standards compliant with respect to its handling
> > of child processes. According to POSIX (3.3.1.3)
> > it is unspecified what happens when SIGCHLD is set
> > to SIG_IGN."
My wait(2) manpage reads:
| NOTES
| The Single Unix Specification describes a flag
| SA_NOCLDWAIT (not supported under Linux) such that if
| either this flag is set, or the action for SIGCHLD is set to
| SIG_IGN (which, by the way, is not allowed by POSIX), then
| children that exit do not become zombies and a call to
| wait() or waitpid() will block until all children have
| exited, and then fail with errno set to ECHILD.
It looks like the behaviour is defined in the manpage; it is
only POSIX that has a problem with it. In fact the NOTES
indicate that the Linux kernel is not POSIX compliant on
this point.
POSIX is obsolete as a standard, if it could ever be said to
have qualified as one. Please refer to SUSv3 for any issues
regarding standards compliance.
Per SUSv3 and IEEE Std 1003.1, 2003 Edition
regarding setting SIGCHLD to SIG_IGN:
| If the action for the SIGCHLD signal is set to SIG_IGN,
| child processes of the calling processes shall not be
| transformed into zombie processes when they terminate.
and regarding calling wait under those conditions:
| If the calling process has SA_NOCLDWAIT set or has SIGCHLD
| set to SIG_IGN, and the process has no unwaited-for children
| that were transformed into zombie processes, the calling
| thread shall block until all of the children of the process
| containing the calling thread terminate, and wait() and
| waitpid() shall fail and set errno to [ECHILD]
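A minimal sketch of the behaviour quoted above, assuming a
system that implements the SUSv3 text. The helper name
wait_errno_with_sigchld_ignored is made up for illustration
and is not part of rsync:

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not rsync code.
 * Ignores SIGCHLD, forks one child that exits at once, then
 * calls wait(). Returns the errno left by wait(), 0 if wait()
 * unexpectedly succeeded, or -1 if setup or fork() failed. */
int wait_errno_with_sigchld_ignored(void)
{
    void (*old_handler)(int);
    pid_t pid;
    int result = 0;

    old_handler = signal(SIGCHLD, SIG_IGN);
    if (old_handler == SIG_ERR)
        return -1;

    pid = fork();
    if (pid < 0) {
        signal(SIGCHLD, old_handler);
        return -1;
    }
    if (pid == 0)
        _exit(0);       /* child: terminate immediately */

    /* Per the quoted text, this blocks until the child has
     * terminated and then fails, since no zombie was kept. */
    if (wait(NULL) == -1)
        result = errno;

    /* Put the previous SIGCHLD disposition back. */
    signal(SIGCHLD, old_handler);
    return result;
}
```

On a system that follows the text above, the helper should
return ECHILD; a system that keeps zombies despite SIG_IGN
would return 0 instead.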
> >
> > This code in socket.c:
> > -----------------------------
> > signal(SIGCHLD, SIG_IGN);
> >
> > /* we shouldn't have any children left hanging around
> > but I have had reports that on Digital Unix zombies
> > are produced, so this ensures that they are reaped */
> > #ifdef WNOHANG
> > while (waitpid(-1, NULL, WNOHANG) > 0);
> > #endif
> > -----------------------------
> >
> > sets SIGCHLD to SIG_IGN, and *then* waits if WNOHANG is
> > defined. It would appear that if there are still children
> > left to be cleaned up in Digital Unix after setting SIGCHLD
> > to SIG_IGN, that there might be a bug in Digital Unix that
> > rsync is coding a fix for.
This code is within the scope of compliance for current
standards. If you have a specific patch that reorders these
and still fixes the Digital UNIX bug I'll consider it, but
this is not a priority.
> >
> > One coding change could be to change the #ifdef WNOHANG to
> > a #ifdef <Digital Unix string>, so that only Digital Unix
Absolutely not. The conditional compilation is
behaviour-based, not OS-based.
> > OS's would include and execute the waitpid. This would
> > stop warning messages from being produced.
> >
> > Thanks.
> >
> > --
> > Hardy Merrill
> > Red Hat, Inc.
I'd suggest this is a kernel or library bug. What kernel
version is this? My 2.4.18 kernel doesn't do this. Does the
mainline kernel do it, or just a RH one?
I'll let you update the bug (I did look at it) to indicate
that 1. this message is not an error message, and 2. the
warning is regarding obsolete POSIX compliance and does not
accurately reflect current standards.
--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: jw at pegasys.ws
Remember Cernan and Schmitt