rsync hang w/ linux, openssh, latest rsync

Dave Dykstra dwd at bell-labs.com
Wed Oct 31 01:29:33 EST 2001


On Mon, Oct 29, 2001 at 06:22:07PM -0500, Dave Wreski wrote:
> Subject: Re: 2.4.7p1 protocol differences?
> 
> > I'm not aware of any protocol differences.  Can you be more specific?
> > Try to give us some simple steps to reproduce it if you can.
> 
> Actually, I screwed up. The remote side had quite an old version of rsync.
> Upgrading to 2.4.6 fixed the protocol problems but hasn't fixed the hang
> problem.
> 
> It's really strange because it hangs in the same place every time, and
> it's the only instance of rsync from the 30 or so in my shell script that
> causes a problem, with other instances transferring much more data. Other
> instances that work fine are on the same server too.

We have had situations where -v appeared to cause a hang, although I thought
those had been fixed.  Try without that.


> I was almost thinking it was getting stuck on a socket or symlink or
> something. While there aren't any fifos in that directory, there are about
> 50 symlinks..

Try eliminating particular files by trial and error.  I once found a
single 80-byte file that hung rcp every time; we assumed the cause was
some buggy network element.
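
With about 50 symlinks in the directory, halving the candidate list on each
attempt is quicker than removing files one at a time.  The following is only
a sketch of that idea, not something from rsync itself.  It assumes exactly
one file triggers the hang and that the hang is reproducible.  The names
find_culprit and hangs are hypothetical; hangs is a check you supply, for
example one that runs the transfer on just those files under a timeout.

```python
def find_culprit(files, hangs):
    """Bisect `files` down to the single entry that triggers the hang.

    `hangs(subset)` is a caller-supplied (hypothetical) check that returns
    True when transferring only `subset` hangs.  Assumes one culprit and a
    reproducible hang.  Returns None if no hang can be reproduced.
    """
    candidates = list(files)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first, second = candidates[:mid], candidates[mid:]
        # Keep whichever half still reproduces the hang.
        candidates = first if hangs(first) else second
    # Confirm the last remaining file really hangs on its own.
    if candidates and hangs(candidates):
        return candidates[0]
    return None


if __name__ == "__main__":
    # Stand-in check for illustration: pretend "link37" is the bad file.
    names = ["link%d" % i for i in range(50)]
    print(find_culprit(names, lambda subset: "link37" in subset))
```

For 50 files this needs about seven test transfers.  If the hang depends on
a combination of files rather than a single one, the bisection can end on
the wrong file, which is why the final confirmation step is there.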


> > I'm not sure how much data it takes to trigger the bug.  What kinds of
> > hosts, and what transfer method are you using?  Perhaps you'd like to
> > try applying Wayne Davison's no-hang patch on 2.4.6 rather than using
> > 2.4.7p1, from
> 
> They're all Linux boxes with OpenSSH_2.3.0p1 as the transport on both
> sides, using ssh1. I just tried the patch and it hangs in the same place
> as 2.4.6 stock and 2.4.7p1.

OpenSSH 2.3 is rather old, although I'm not aware of any hanging problems
in it.  Which Linux kernel?  There have been TCP bugs in Linux fixed in
more recent releases.

When debugging hangs, it often helps to run netstat on both sides and
compare the queues for the connection.  If data sits in the send queue on
one side while the receive queue on the other side stays empty, the data is
stuck below the application, which points to an operating system bug.
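
If you want to automate that comparison, here is a rough sketch.  It assumes
the Linux net-tools column layout of `netstat -tn` (Proto, Recv-Q, Send-Q,
Local Address, Foreign Address, State) and that both hosts see the same
addresses (no NAT in between).  The function names and the sample addresses
are made up for illustration.

```python
def parse_netstat(text):
    """Parse `netstat -tn` output (assumed Linux net-tools layout)."""
    conns = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip banner and header lines
        conns.append({
            "recv_q": int(fields[1]),
            "send_q": int(fields[2]),
            "local": fields[3],
            "remote": fields[4],
            "state": fields[5],
        })
    return conns


def stuck_connections(side_a, side_b):
    """Find connections with data queued to send on side A while the
    matching connection on side B has an empty receive queue."""
    peers = {(c["local"], c["remote"]): c for c in side_b}
    hits = []
    for c in side_a:
        # The peer sees the same connection with the endpoints swapped.
        peer = peers.get((c["remote"], c["local"]))
        if peer and c["send_q"] > 0 and peer["recv_q"] == 0:
            hits.append((c["local"], c["remote"], c["send_q"]))
    return hits


if __name__ == "__main__":
    # Made-up sample output, one capture from each host.
    a = ("Proto Recv-Q Send-Q Local Address    Foreign Address  State\n"
         "tcp        0  34752 10.0.0.1:1022    10.0.0.2:22      ESTABLISHED\n")
    b = ("Proto Recv-Q Send-Q Local Address    Foreign Address  State\n"
         "tcp        0      0 10.0.0.2:22      10.0.0.1:1022    ESTABLISHED\n")
    print(stuck_connections(parse_netstat(a), parse_netstat(b)))
```

A single snapshot can catch data that is merely in flight, so take several
captures a few seconds apart and only treat the result as meaningful if the
send queue stays non-zero the whole time.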

- Dave Dykstra



