rsync hang, more details [LONG]

Ed Santiago santiago at ascend.com
Tue Dec 18 03:22:14 EST 2001


rsync 2.5.0 still has a bug that makes it hang under some circumstances.

I haven't been able to track the hang down.  I'll keep trying, but
here are the details in case they're of use to anyone else:

  - Code configured & built on Solaris 2.5.1.
  - Same binary run on Solaris 2.5.1 (client) and 2.8 (server).
  - Using rsh transport, but also fails with ssh
  - Does not fail with local-local rsync

  - Source directory (on server) is NFS-mounted, from NetApp filer
  - Destination directory (on client) is local (tested NFS, also hangs)

  - Consistently hangs with -vv, never (so far) with -vvv

Included below are three stack traces, one on the server and two
on the client.  This is a pretty consistent feature: The client
and server appear to be deadlocked waiting for each other.

Also attached below are a script for populating a sample hierarchy,
and the rsync invocation.

Backtrace on server:

  #0  0xff218224 in _poll ()
  #1  0xff1cb808 in _select ()
  #2  0x24bec in writefd_unbuffered (fd=1, buf=0xffbe5ed0 ">", len=66)
      at io.c:406
  #3  0x24eac in mplex_write (fd=1, code=62, buf=0x591d8 "\a\020", len=62)
      at io.c:498
  #4  0x24f24 in io_flush () at io.c:518
  #5  0x24940 in readfd (fd=0, buffer=0xffbe7020 "ï\002r\215©/\201R", N=4)
      at io.c:314
  #6  0x24998 in read_int (f=0) at io.c:329
  #7  0x199a4 in send_files (flist=0x574e8, f_out=1, f_in=0) at sender.c:110
  #8  0x1d1e8 in do_server_sender (f_in=0, f_out=1, argc=1, argv=0x56f74)
      at main.c:300
  #9  0x1d708 in start_server (f_in=0, f_out=1, argc=2, argv=0x56f70)
      at main.c:476
  #10 0x1e08c in main (argc=2, argv=0x56f70) at main.c:838

Backtrace #1 on client (the parent):

  #0  0xef5b7904 in _poll ()
  #1  0xef5d3d40 in _select ()
  #2  0x24644 in read_timeout (fd=6, buf=0xeffff348 "ÿÿÿÿ", len=4) at io.c:191
  #3  0x247dc in read_unbuffered (fd=6, buf=0xeffff348 "ÿÿÿÿ", len=4) at io.c:263
  #4  0x24950 in readfd (fd=6, buffer=0xeffff348 "ÿÿÿÿ", N=4) at io.c:316
  #5  0x24998 in read_int (f=6) at io.c:329
  #6  0x184e8 in generate_files (f=5, flist=0x57520, local_name=0x0, f_recv=6)
      at generator.c:471
  #7  0x1d3fc in do_recv (f_in=4, f_out=5, flist=0x57520, local_name=0x0)
      at main.c:379
  #8  0x1d958 in client_run (f_in=4, f_out=5, pid=22226, argc=1, argv=0x56f74)
      at main.c:558
  #9  0x1ddc0 in start_client (argc=1, argv=0x56f74) at main.c:731
  #10 0x1e098 in main (argc=2, argv=0x56f70) at main.c:841

Backtrace #2 on client (child):

  #0  0xef5b7904 in _poll ()
  #1  0xef5d3d40 in _select ()
  #2  0x24644 in read_timeout (fd=4, buf=0xefffe680 "", len=4) at io.c:191
  #3  0x24788 in read_loop (fd=4, buf=0xefffe680 "", len=4) at io.c:242
  #4  0x24824 in read_unbuffered (fd=4, buf=0xefffe680 "", len=4) at io.c:268
  #5  0x24950 in readfd (fd=4, buffer=0xefffe680 "", N=4) at io.c:316
  #6  0x24998 in read_int (f=4) at io.c:329
  #7  0x18eec in recv_files (f_in=4, flist=0x57520, local_name=0x0, f_gen=8)
      at receiver.c:328
  #8  0x1d374 in do_recv (f_in=4, f_out=5, flist=0x57520, local_name=0x0)
      at main.c:357
  #9  0x1d958 in client_run (f_in=4, f_out=5, pid=22226, argc=1, argv=0x56f74)
      at main.c:558
  #10 0x1ddc0 in start_client (argc=1, argv=0x56f74) at main.c:731
  #11 0x1e098 in main (argc=2, argv=0x56f70) at main.c:841

--------

The rsync above was compiled from the CVS repository on Thursday 13 Dec,
early AM.  I've just now (Mon 17 Dec 08:19 Mountain Time) updated,
rebuilt, and rerun the tests.  Same hang.


The script below can be used to populate a directory hierarchy.  It
creates subdirectories 00 through 99 under "src-test/CVSROOT"
(up to you to mkdir that), then 99 small files in each subdir:

-------------- next part --------------
#!/sw/tools/bin/Perl -w

use strict;

# up to caller to do:   mkdir -p src-test/CVSROOT
my $sub = 'src-test/CVSROOT';
-d $sub
  or die "You're cd'ed to the wrong directory\n";

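foreach my $i (0..99) {
  my $d = sprintf("%02d", $i);
  # create the subdirectory unless a previous run already did
  -d "$sub/$d" or mkdir "$sub/$d", 02775
    or die "cannot mkdir $sub/$d: $!\n";

  foreach my $j (1..99) {
    my $f = "$sub/$d/$j$d";
    # each file contains just its own path
    open(OUT, '>', $f) or die "cannot open $f: $!\n";
    print OUT $f, "\n";
    close OUT or die "error writing $f: $!\n";
  }
}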
-------------- next part --------------
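The script above should leave 100 subdirectories and 100 x 99 = 9900
files under src-test/CVSROOT.  A quick way to confirm that before
running rsync might look like the sketch below.  The name check_tree
is made up for illustration, and the sketch assumes a POSIX shell
(not the old Solaris /bin/sh) and a tree with no nested subdirectories:

```shell
#!/bin/sh
# check_tree ROOT [WANT_DIRS] [WANT_FILES]
# Hypothetical helper, not part of rsync: counts the subdirectories and
# regular files under ROOT and compares them with the expected totals
# (defaults: 100 subdirectories, 9900 files).
check_tree() {
    root=$1
    want_dirs=${2:-100}
    want_files=${3:-9900}

    # find lists ROOT itself as a directory, so subtract one
    dirs=$(( $(find "$root" -type d | wc -l) - 1 ))
    files=$(( $(find "$root" -type f | wc -l) ))

    echo "dirs=$dirs files=$files"
    [ "$dirs" -eq "$want_dirs" ] && [ "$files" -eq "$want_files" ]
}
```

Usage would be something like "check_tree src-test/CVSROOT", which
returns nonzero if either count is off.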
This is the rsync invocation:

-------------- next part --------------
#!/bin/sh

CMD=/home/santiago/src/rsync/rsync/rsync.solaris

$CMD	-z -avv --stats --delete \
	--rsync-path=$CMD.nopur					\
	--timeout=600						\
	"cvsroot.eng.ascend.com:/home/santiago/tmp/rsync-test/src-test/CVSROOT" ./results
-------------- next part --------------
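For anyone trying to reproduce this unattended, a watchdog wrapper
along the following lines may help tell a hung run from a slow one.
This is only a sketch: run_with_watchdog and the exit code 124 are my
own conventions, it assumes a POSIX shell, and it kills only the
process it started (any child rsync or remote rsh/ssh process may need
separate cleanup):

```shell
#!/bin/sh
# run_with_watchdog SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND in the background and polls it once
# per second.  If it is still alive after SECONDS, it is reported as hung,
# killed, and 124 is returned; otherwise COMMAND's own exit status is
# returned.
run_with_watchdog() {
    limit=$1
    shift

    "$@" &
    pid=$!
    waited=0

    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "HUNG after ${limit}s: pid $pid" >&2
            kill "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null || :
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # the command finished on its own; pass its exit status through
    wait "$pid"
}
```

With the invocation above saved as, say, ./run-rsync.sh (a made-up
name), "run_with_watchdog 900 ./run-rsync.sh" would flag any run still
going after 15 minutes.  The 900 is arbitrary; it just needs to exceed
the longest plausible good run.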
Thanks in advance for any help,
^E
-- 
Ed Santiago                 Toolsmith                 santiago at ascend.com


