possible bug?
David S. Ahern
dahern at avaya.com
Fri Feb 13 22:04:21 GMT 2004
A Google search on this problem turned up no matches, so I'll take
the chance that someone on this list might consider it an rsync problem.
In a nutshell, if rsync forks a child process to handle the transport
(rsh in this case), it can hang in wait_process() forever waiting for
that child process to die. Normally this would not be a problem.
However, if the wrong packet is dropped (along with all of its
retransmissions), the child never exits and rsync never returns.
In the particular case that I have been debugging, the TCP FIN packet
from the rsh server is never received by the rsh client that rsync
started to send files to a peer. This means that rsh blocks
indefinitely, which in turn keeps rsync in the wait_process() loop --
even though all files have been transferred.
Yes, I realize the process that kicks off rsync should ensure that it
terminates in a timely manner. However, I would like to propose a change
to rsync that would let invocations of rsync with a timeout die based on
that timeout: wait_process() could call check_timeout() before calling
msleep(). When the timeout has been exceeded, rsync would call
_exit_cleanup() and kill_all() would take care of the child. You could
make this check optional so that only calls from client_run() do the
check.
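
As a rough illustration of the idea (not a patch against rsync itself),
the sketch below polls for the child with waitpid() and gives up once a
deadline passes. Everything in it is an assumption made for the example:
wait_child_with_timeout() and now_secs() are hypothetical names, the
deadline test merely stands in for check_timeout(), the kill() call
stands in for what kill_all() would do, and treating a timeout of 0 as
"no timeout" is a choice made here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: monotonic clock reading in seconds. */
static double now_secs(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical stand-in for a timeout-aware wait loop; this is NOT
 * rsync's wait_process(). Returns 0 and fills *status if the child
 * exited in time, 1 if the deadline passed and the child was killed,
 * and -1 on a waitpid() error. A timeout_secs of 0 means "wait
 * forever" (an assumption made for this sketch). */
static int wait_child_with_timeout(pid_t pid, int timeout_secs, int *status)
{
    struct timespec nap = { 0, 20 * 1000 * 1000 };   /* 20 ms between polls */
    double deadline = now_secs() + timeout_secs;
    pid_t got;

    assert(pid > 0);

    for (;;) {
        got = waitpid(pid, status, WNOHANG);
        if (got == pid)
            return 0;                     /* child exited on its own */
        if (got < 0 && errno != EINTR)
            return -1;                    /* unexpected waitpid() failure */

        /* Deadline check before sleeping, as proposed above. */
        if (timeout_secs > 0 && now_secs() >= deadline) {
            kill(pid, SIGKILL);           /* stand-in for kill_all() */
            while (waitpid(pid, status, 0) < 0 && errno == EINTR)
                ;                         /* reap the killed child */
            return 1;
        }
        nanosleep(&nap, NULL);            /* stand-in for msleep() */
    }
}
```

A caller would fork the transport, run the transfer, and then call
wait_child_with_timeout(pid, timeout, &status) in place of an unbounded
wait; a return value of 1 signals that the child had to be killed.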
Just a thought.
--
david ahern