rsync 2.5.6 still hangs

Steve Greenland steveg at moregruel.net
Thu Mar 20 08:12:49 EST 2003


While syncing a directory with about 40 files of various sizes (4KB -
20MB), rsync will hang, in a repeatable manner. (Where "repeatable"
means that I get identical logs with -vvv set, modulo the temp file
names.)

I am using rsync in "server" mode - started by xinetd. Both sides are
rsync 2.5.6. The server is RH Linux 7.0 (yeah, I know...), clients are
Debian unstable (kernel 2.4.19) and AIX 4.3.3. In all cases, both sides
seem to be waiting in select(). If I set a timeout, then both sides log
it and exit. If I just Ctrl-C the client side, then the server never
quits. (Weird, eh? Shouldn't it get an EOF from the socket?).

The command line I'm using is 

rsync -vvv --password-file ../pwfile rsync://user@host.mydom.tld/modname/* . 

If I write a loop to rsync the files one-at-a-time by name, it works
just fine.
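
A minimal sketch of that one-at-a-time loop, for anyone who wants to
reproduce the workaround. The file names are placeholders, and the RSYNC
variable is an assumption added here so the loop defaults to a dry run
that only prints the commands:

```shell
#!/bin/sh
# Sketch of the one-file-at-a-time workaround.
# file_1 .. file_3 are placeholder names; the real module has about 40 files.
# RSYNC defaults to "echo rsync" (dry run); set RSYNC=rsync to transfer.
RSYNC=${RSYNC:-echo rsync}
URL=rsync://user@host.mydom.tld/modname

sync_one_by_one() {
    for f in "$@"; do
        # Stop at the first failure so the failing file is obvious.
        $RSYNC -vvv --password-file ../pwfile "$URL/$f" . || return 1
    done
}

sync_one_by_one file_1 file_2 file_3
```

Each file gets its own connection here, so the generator never queues
checksums for a later file while an earlier one is still being received.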

If I do the full sync from a client on the same 100Mb LAN as the server,
it works just fine. I get the error when syncing over the internet:
we've got a 1Mb SDSL connection, the AIX client is T1, the other a 384Kb
ADSL.

Once they are synced, I can re-run the all-at-once rsync and it doesn't
time out.

The last few lines from the -vvv output are:

recv_generator(file_29,28)
generating and sending sums for 28
recv_generator(file_30,29)
generating and sending sums for 29
recv_generator(file_31,30)
generating and sending sums for 30
recv_generator(file_32,31)
generating and sending sums for 31
recv_files(file_18)
recv mapped file_18 of size 12851565
file_18
recv_generator(file_33,32)
generating and sending sums for 32
io timeout after 60 seconds - exiting
rsync error: timeout in data send/receive (code 30) at io.c(103)
_exit_cleanup(code=30, file=io.c, line=103): about to call exit(30)
rsync: connection unexpectedly closed (406 bytes read so far)
rsync error: error in rsync protocol data stream (code 12) at io.c(165)
_exit_cleanup(code=12, file=io.c, line=165): about to call exit(12)

This was with timeout=60, but increasing it or not using it doesn't make
any difference (except the final error message changes, of course).

The 'file_33' is the largest file, about 20M (but only about 200K
difference). I tried doing the same with -vvvv (which shows the individual
checksums), and it had this:

recv_generator(file_33,32)
gen mapped file_33 of size 21308244
generating and sending sums for 32
count=10014 rem=580 n=2128 flength=21308244
chunk[0] offset=0 len=2128 sum1=77f93b94
chunk[1] offset=2128 len=2128 sum1=3ec13ce7
chunk[2] offset=4256 len=2128 sum1=99573c59
chunk[3] offset=6384 len=2128 sum1=6a7735a9
<snip>
chunk[7800] offset=16598400 len=2128 sum1=cc0995f1
chunk[7801] offset=16600528 len=2128 sum1=6c2a8f1b
chunk[7802] offset=16602656 len=2128 sum1=5dd19cb4
chunk[7803] offset=16604784 len=2128 sum1=bf069986
io timeout after 60 seconds - exiting
_exit_cleanup(code=30, file=io.c, line=103): entered
rsync error: timeout in data send/receive (code 30) at io.c(103)
_exit_cleanup(code=30, file=io.c, line=103): about to call exit(30)
rsync: connection unexpectedly closed (406 bytes read so far)
_exit_cleanup(code=12, file=io.c, line=165): entered
rsync error: error in rsync protocol data stream (code 12) at io.c(165)
_exit_cleanup(code=12, file=io.c, line=165): about to call exit(12)
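
The numbers in the count= line are consistent with the file size, and the
last chunk logged shows how far the generator got in this particular run.
A quick check of that arithmetic, with all values copied from the log
above (the variable names are only for illustration, and this is not
meant to reflect how rsync computes them internally):

```shell
#!/bin/sh
# Values copied from the -vvvv log for file_33.
flength=21308244   # file size in bytes
n=2128             # block length
last_chunk=7803    # index of the last chunk logged before the hang

count=$(( (flength + n - 1) / n ))   # number of blocks, rounded up
rem=$(( flength % n ))               # length of the final short block
last_offset=$(( last_chunk * n ))    # offset of the last chunk logged
sent=$(( last_chunk + 1 ))           # chunks are numbered from 0

echo "count=$count rem=$rem"
echo "last offset=$last_offset, checksums logged=$sent of $count"
```

So in this run the output stops roughly three quarters of the way through
the checksums for file_33.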

If I make multiple runs with '-vvvv', the logs *do* vary. Some of that is
probably buffering (the way the sending of checksums interleaves with
the receiving of data), but the number of checksums sent for the last
file also varies.


Ideas? Since I seem to be able to duplicate the problem at will, I'd be
happy to try out any fixes.

Regards,
Steve


-- 
Steve Greenland
    The irony is that Bill Gates claims to be making a stable operating
    system and Linus Torvalds claims to be trying to take over the
    world.       -- seen on the net


