PATCH/RFC: Another stab at the Cygwin hang problem

jw schultz jw at pegasys.ws
Mon Jun 30 21:54:22 EST 2003


Apparently this fixed the problem for Tillman, James.

Could you regenerate the patch with diff -u please?

On Fri, Jun 27, 2003 at 04:16:12PM +0900, Anthony Heading wrote:
> Hi,
> 
> In http://sources.redhat.com/ml/cygwin/2002-09/msg01155.html, I noted that
> the often-observed hangs of rsync under Cygwin were assuaged by a call to
> msleep().
> 
> After upgrading my Cygwin environment to rsync 2.5.6, I'm seeing these
> hangs again, not surprisingly given a CVS entry for main.c notes that
> this kludge was not harmless:
> 
>     Revision 1.162 / (download) - annotate - [select for diffs] ,
> 	Tue Jan 28 05:05:53 2003 UTC (4 months, 4 weeks ago) by dwd
> 
>     Remove the Cygwin msleep(100) before the generator kills the receiver,
>     because it caused the testsuite/unsafe-links test to hang.
> 
> So it seems sensible to attempt something a bit more elegant.
> 
> And the first question is why kill/signals are being used
> being used here at all.
> 
> The illustrative patch below I think effects an equivalent synchronization,
> but does so by queuing a byte into a pipe rather than sending a signal.
> 
> Of course, since it's not currently done this way, I may be overlooking
> something obvious. I can't quite see what though, since in the event
> that an error occurs then exit_cleanup is available to send SIGUSR1
> with extreme prejudice; but if the protocol in fact concludes cleanly
> then there really should be no need for an asynchronous notification?
> 
> Comments sought, meanwhile I'll test the patch a bit...
> 
> Regards
> 
> Anthony
> 
> 
> *** main.c.Orig	Fri Jun 27 15:21:22 2003
> --- main.c	Fri Jun 27 15:30:09 2003
> ***************
> *** 390,395 ****
> --- 390,396 ----
>   	int status=0;
>   	int recv_pipe[2];
>   	int error_pipe[2];
> + 	int cleanup_pipe[2];
>   	extern int preserve_hard_links;
>   	extern int delete_after;
>   	extern int recurse;
> ***************
> *** 416,426 ****
> --- 417,435 ----
>   		exit_cleanup(RERR_SOCKETIO);
>   	}
>     
> + 	if (pipe(cleanup_pipe) < 0) {
> + 		rprintf(FERROR,"cleanup pipe failed in do_recv\n");
> + 		exit_cleanup(RERR_SOCKETIO);
> + 	}
> +   
>   	io_flush();
>   
>   	if ((pid=do_fork()) == 0) {
> + 		char tmp;
> + 
>   		close(recv_pipe[0]);
>   		close(error_pipe[0]);
> + 		close(cleanup_pipe[1]);
>   		if (f_in != f_out) close(f_out);
>   
>   		/* we can't let two processes write to the socket at one time */
> ***************
> *** 436,450 ****
>   		write_int(recv_pipe[1],1);
>   		close(recv_pipe[1]);
>   		io_flush();
> ! 		/* finally we go to sleep until our parent kills us
> ! 		   with a USR2 signal. We sleep for a short time as on
> ! 		   some OSes a signal won't interrupt a sleep! */
> ! 		while (msleep(20))
> ! 			;
>   	}
>   
>   	close(recv_pipe[1]);
>   	close(error_pipe[1]);
>   	if (f_in != f_out) close(f_in);
>   
>   	io_start_buffering(f_out);
> --- 445,465 ----
>   		write_int(recv_pipe[1],1);
>   		close(recv_pipe[1]);
>   		io_flush();
> ! 		do {
> ! 			status = read(cleanup_pipe[0], &tmp, 1);
> ! 		} while (status == -1 && errno == EINTR);
> ! 		if (status != 1) {
> ! 			rprintf(FERROR,"cleanup read returned %d in do_recv\n", status);
> ! 			if (status == -1)
> ! 				rprintf(FERROR,"with errno %d (%s)\n", errno, strerror(errno));
> ! 			_exit(RERR_PARTIAL);
> ! 		}
> ! 		_exit(0);
>   	}
>   
>   	close(recv_pipe[1]);
>   	close(error_pipe[1]);
> + 	close(cleanup_pipe[0]);
>   	if (f_in != f_out) close(f_in);
>   
>   	io_start_buffering(f_out);
> ***************
> *** 462,469 ****
>   	io_flush();
>   
>   	io_set_error_fd(-1);
> ! 	kill(pid, SIGUSR2);
> ! 	wait_process(pid, &status);
>   	return status;
>   }
>   
> --- 477,487 ----
>   	io_flush();
>   
>   	io_set_error_fd(-1);
> ! 	write(cleanup_pipe[1], ".", 1);
> ! 	if (waitpid(pid, &status, 0) != pid) {
> ! 		rprintf(FERROR,"cleanup in do_recv failed\n");
> ! 		exit_cleanup(RERR_SOCKETIO);
> ! 	}
>   	return status;
>   }
>   
> ***************
> *** 867,878 ****
>   	exit_cleanup(RERR_SIGNAL);
>   }
>   
> - static RETSIGTYPE sigusr2_handler(int UNUSED(val)) {
> - 	extern int log_got_error;
> - 	if (log_got_error) _exit(RERR_PARTIAL);
> - 	_exit(0);
> - }
> - 
>   static RETSIGTYPE sigchld_handler(int UNUSED(val)) {
>   #ifdef WNOHANG
>   	int cnt, status;
> --- 885,890 ----
> ***************
> *** 964,970 ****
>   	orig_argv = argv;
>   
>   	signal(SIGUSR1, sigusr1_handler);
> - 	signal(SIGUSR2, sigusr2_handler);
>   	signal(SIGCHLD, sigchld_handler);
>   #ifdef MAINTAINER_MODE
>   	signal(SIGSEGV, rsync_panic_handler);
> --- 976,981 ----
> 
> 
> --
> Anthony Heading
> 
> This communication is for informational purposes only.  It is not intended as
> an offer or solicitation for the purchase or sale of any financial instrument
> or as an official confirmation of any transaction. All market prices, data
> and other information are not warranted as to completeness or accuracy and
> are subject to change without notice. Any comments or statements made herein
> do not necessarily reflect those of J.P. Morgan Chase & Co., its
> subsidiaries and affiliates.
> 
> -- 
> To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
> 

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt



More information about the rsync mailing list