"Unexplained error code xxx" in rsync-2.5.5

Matthias Kurz mk at baerlap.north.de
Mon May 13 12:59:02 EST 2002


On Mon, May 13, 2002 at 12:35:34PM -0500, Dave Dykstra wrote:
> On Sun, May 12, 2002 at 04:13:51PM +0200, Matthias Kurz wrote:
> > 
> > Hi.
> > 
> > We sometimes/often get such errors. It occurs in main.c/client_run().
> > I investigated further and found that the waitpid() in
> > main.c/wait_process() returns -1 with errno = ECHILD, which means
> > "No children". *status can contain garbage in such cases - but not
> > always. It happens under Solaris-2.5.1/SPARC.
> > It (ECHILD) also happens on other Solaris versions - but there *status
> > seems to be always set to/left at 0, so the error slips through.
> 
> The fix at
>     http://lists.samba.org/pipermail/rsync/2002-February/006371.html
> might work for you.

It seems so. Thanks.
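
For the archive, the ECHILD case can be reproduced outside rsync. This is
only a minimal sketch, assuming a SIGCHLD handler reaps the child before
the later waitpid() runs. demo_sigchld_handler(), demo_echild() and the
sentinel value are made up for illustration and are not rsync's code:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t reaped;

/* Illustrative handler (not rsync's): reaps every exited child. */
static void demo_sigchld_handler(int signo)
{
        int saved_errno = errno;

        (void)signo;
        while (waitpid(-1, NULL, WNOHANG) > 0)
                reaped = 1;
        errno = saved_errno;
}

/* Returns the errno seen by the later waitpid(), or 0 if it succeeded
 * or the fork failed. */
static int demo_echild(void)
{
        struct sigaction sa, old;
        struct timespec ts = { 0, 1000000 };    /* 1 ms */
        int status = 0x5a5a;    /* sentinel: left untouched on error */
        int result = 0;
        pid_t pid;

        sa.sa_handler = demo_sigchld_handler;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGCHLD, &sa, &old);

        reaped = 0;
        pid = fork();
        if (pid == 0)
                _exit(0);
        if (pid > 0) {
                while (!reaped)
                        nanosleep(&ts, NULL);   /* handler runs meanwhile */
                /* The child is already gone, so this cannot succeed. */
                if (waitpid(pid, &status, 0) == -1)
                        result = errno;
        }
        sigaction(SIGCHLD, &old, NULL);
        return result;
}
```

Since waitpid() does not write to status when it fails, whatever the
caller finds there afterwards is just what was in the variable before,
which would explain the "garbage, but not always" behaviour.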

> Martin, can you please follow up with Tridge to ask him what problem he
> was trying to solve with the waitpid() call in the sigchld_handler()?
> It's a serious bug.

[...]
> > Another problem in main.c/wait_process(). A call to WEXITSTATUS(*status)
> > must only happen when WIFEXITED(*status) returns true.
> > Otherwise, WIFSIGNALED(*status) could be true, in which case the process
> > was terminated by a signal. One has to extract the signal number using
> > WTERMSIG(*status)... and so on. There is also WIFSTOPPED() and
> > WIFCONTINUED().
> 
> I think you're right.  This code was also put in by Tridge, in version
> 2.4.5, cvs revision 1.114 of main.c.
> 
> Does anybody know of some code from a highly ported open source project
> that does this right which we can borrow?

Don't know.
This is what I did:

void wait_process(pid_t pid, int *status)
{
        int wpres, sig;

        while ((wpres=waitpid(pid, status, WNOHANG)) == 0) {
                msleep(20);
                io_flush();
        }

        /* TODO: If the child exited on a signal, then log an
         * appropriate error message.  Perhaps we should also accept a
         * message describing the purpose of the child.  Also indicate
         * this to the caller so that they know something went
         * wrong.  */
        if (wpres == -1) {
                /* Save errno first: rprintf() may overwrite it. */
                int saved_errno = errno;

                if (saved_errno == ECHILD) {
                        rprintf(FERROR, "ECHILD ! status = %d\n", *status);
                }
                *status = -(saved_errno+10000);
                return;
        }
        if (WIFEXITED(*status)) {
                *status = WEXITSTATUS(*status);
        } else if (WIFSIGNALED(*status)) {
                sig     = WTERMSIG(*status);
                *status = -(sig+100000);
        } else {
                *status = -(*status+1000000);
        }
}


But I think it would be better to print an explanation to FERROR and to
return with "*status = -1", because the "arithmetic" could lead to
other strange errors.
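
Roughly what I mean by that, as an untested-in-rsync sketch: the helper
names decode_wait_result() and run_child() are hypothetical, and
fprintf(stderr, ...) stands in for rprintf(FERROR, ...) so that it
compiles on its own:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map a waitpid() result to the child's exit
 * code (0..255), or to -1 after explaining the failure on stderr. */
static int decode_wait_result(pid_t wpres, int status, int saved_errno)
{
        if (wpres == -1) {
                fprintf(stderr, "waitpid failed: %s\n",
                        strerror(saved_errno));
                return -1;
        }
        if (WIFEXITED(status))
                return WEXITSTATUS(status);
        if (WIFSIGNALED(status)) {
                fprintf(stderr, "child killed by signal %d\n",
                        WTERMSIG(status));
                return -1;
        }
        fprintf(stderr, "unexpected wait status 0x%x\n", (unsigned)status);
        return -1;
}

/* Small driver: the child raises `sig` if it is nonzero, otherwise
 * it exits with `code`. */
static int run_child(int code, int sig)
{
        int status = 0;
        pid_t wpres, pid;

        assert(code >= 0 && code <= 255);
        pid = fork();
        if (pid == -1)
                return decode_wait_result(-1, 0, errno);
        if (pid == 0) {
                if (sig)
                        raise(sig);
                _exit(code);
        }
        do {
                wpres = waitpid(pid, &status, 0);
        } while (wpres == -1 && errno == EINTR);
        return decode_wait_result(wpres, status, errno);
}
```

The caller then only has to distinguish "exit code" from "-1", and the
details end up in the log instead of being encoded in the number.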


   (mk)

-- 
Matthias Kurz; Fuldastr. 3; D-28199 Bremen; VOICE +49 421 53 600 47
   >> Im prämotorischen Cortex kann jeder ein Held sein. (bdw) <<



