rsync hang w/ linux, openssh, latest rsync

Dave Dykstra dwd at bell-labs.com
Sat Nov 3 08:19:24 EST 2001


On Thu, Nov 01, 2001 at 11:47:15AM -0500, Dave Wreski wrote:
> 
> > We have had situations where -v appeared to cause a hang, although I thought
> > those had been fixed.  Try without that.
> 
> No difference.
> 
> > Try with trial & error to eliminate particular files.  I once found a
> > single 80 byte file that could hang rcp every time, we assumed because
> > of some buggy network element.
> 
> I'm not sure how to do that since there are so many files in the
> directory. 

Make a copy of your data on both sides, then cut out half of the data
on the sending side at a time to find the smallest set that still
causes it to fail.
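
The halving procedure above can be sketched roughly as follows. This is
only an illustration, not anything that ships with rsync: the function
name and the `transfer_fails` callback are hypothetical, and the callback
is assumed to run the transfer on the given subset of the copied data and
return True when it hangs or errors out.

```python
from typing import Callable, List, Sequence


def smallest_failing_set(
    files: Sequence[str],
    transfer_fails: Callable[[List[str]], bool],
) -> List[str]:
    """Repeatedly halve the file list, keeping a half that still fails.

    transfer_fails is a hypothetical callback (an assumption of this
    sketch): it should attempt the transfer with only the given files
    and return True if the transfer hangs or fails.
    """
    current = list(files)
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if transfer_fails(first):
            current = first
        elif transfer_fails(second):
            current = second
        else:
            # Neither half fails on its own, so the failure seems to
            # depend on a combination of files; stop narrowing here.
            break
    return current
```

If a single file is responsible, this narrows the set down to that file
in about log2(N) test transfers. If neither half fails by itself, it
returns the last set that did fail, which then has to be examined by hand.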

> strace shows me both sides are stuck in a select() and the
> sendq and recvq are seemingly empty according to netstat.

Ok, that eliminates one kind of operating system failure.


> I did find this when using --dry-run:
> 
> # rsync --dry-run -avve 'ssh ... -i ..' remote:/path/ /mnt/backup
> ...
> bits/eng/html/cells/cell_220_article.txt
> Invalid file index 1541696587 in recv_files (count=2062)
> unexpected EOF in read_timeout
> 
> What is an invalid file index? I looked at the file that appeared to
> logically come next in the list, even added it to an exclude list, and no
> change.

A file index is the number of a file in the file list, and it should never
be larger than the total number of files being transferred.  It sounds
like some data is getting lost or garbled somewhere.
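
As a rough illustration of why that message points to corruption, here is
the kind of range check involved. This is a sketch and not rsync's actual
code: the function name is made up, and it assumes indexes are zero-based
and ignores any special marker values the real protocol may use.

```python
def is_valid_file_index(index: int, count: int) -> bool:
    """Return True if index could refer to an entry in a list of count files.

    Assumes zero-based indexes (an assumption of this sketch), so the
    valid range is 0 through count - 1.
    """
    return 0 <= index < count


# Values taken from the error message quoted above.
reported_index = 1541696587
reported_count = 2062
print(is_valid_file_index(reported_index, reported_count))
```

With only 2062 files in the list, an index of 1541696587 cannot refer to
any of them, so the receiver was most likely reading bytes that were never
meant to be an index at all.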


> > OpenSSH 2.3 is rather old, although I'm not aware of any hanging
> > problems in it.  Which Linux kernel?  There have been TCP bugs in
> > Linux fixed in more recent releases.
> 
> It's 2.2.19-ac7 or so and has worked for months without incident, until
> now.

That sounds recent enough, but I do recommend upgrading your OpenSSH.

- Dave Dykstra



