help me understand keepalive..

Steve Sether steve at vellmont.com
Fri May 13 11:22:56 GMT 2005


> Maybe.  You must be using at least 2.6.4 on both ends of the connection
> and not overriding the protocol version below 29.  (You can see what
> protocol was negotiated by specifying at least four -v options.)  If the
> remote system has multiple versions of rsync installed, perhaps it is
> running an older one unbeknownst to you.

I double-checked, and it's definitely 2.6.5pre1 on both sides.  You can
also see this in the provided log file (below).

> 
> Even with keep-alive, if you run something that takes up too much time,
> rsync could conceivably still timeout.  For instance, using --checksum
> with really large files might not get to the keep-alive check often
> enough to make a difference.  Or using --fuzzy with a lot of missing
> files into a large directory could be pretty slow too.  (You don't
> mention what options you're using, so I'm going to stop guessing at
> what might be slowing things down.)
> 

I'm not using any flags like that, only -vvvv on the sender side (previously
I was using only -a, but I removed it to keep the test simple).  The
receiver side isn't the fastest machine in the world (it's a Celeron 566),
but there's also nothing really running on it except rsync.

> In any case, it would be good to know at what point in the transfer it
> was timing out.  You might try setting larger levels of verbosity, and
> if it still times out, let me know what was going on at the time of
> the failure (perhaps attach strace to the generator process too -- it's
> the first (which usually means lowest) PID of the two processes on the
> receiving side).
> 
Here's my output from rsync on the sender side (with most of the chunk[n]
lines trimmed out for brevity).  I kept an strace log of the receiver side
like you asked, and of the sender side for completeness.  They're both fairly
large (137 KB and 1.7 MB), so I put them on my webserver.  They should be
available at:

http://www.vellmont.com/~steve/rsync-strace-receiver.log
http://www.vellmont.com/~steve/rsync-strace-sender.log

The full output from rsync is also available if you need it for some reason at:
http://www.vellmont.com/~steve/rsync-sender.log

opening tcp connection to 192.168.0.4 port 873
opening connection using --server -vvvv . turing
(Client) Protocol versions: remote=29, negotiated=29
[sender] make_file(counter-strike source shared.gcf,*,2)
[sender] i=0 /home/steve counter-strike source shared.gcf mode=0100700 len=1063117944 flags=0
send_file_list done
file list sent
send_files starting
send_files(0, /home/steve/counter-strike source shared.gcf)
count=32611 n=32600 rem=31946
chunk[0] len=32600 offset=0 sum1=77cf1b2f
chunk[1] len=32600 offset=32600 sum1=03fb25a9
chunk[2] len=32600 offset=65200 sum1=bf2738ce
chunk[3] len=32600 offset=97800 sum1=5cff6268
chunk[4] len=32600 offset=130400 sum1=94090fa6
.
.
.
.
chunk[32609] len=32600 offset=1063053400 sum1=a777a26e
chunk[32610] len=31946 offset=1063086000 sum1=ec25d882
send_files mapped /home/steve/counter-strike source shared.gcf of size 1063117944
rsync: writefd_unbuffered failed to write 2 bytes: phase "unknown" [sender]: Broken pipe (32)
rsync: connection unexpectedly closed (125 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(420)
rsync: connection unexpectedly closed (228303 bytes received so far) [sender]
_exit_cleanup(code=12, file=io.c, line=420): entered
rsync error: error in rsync protocol data stream (code 12) at io.c(420)
_exit_cleanup(code=12, file=io.c, line=420): about to call exit(12)
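
As a sanity check on the numbers in the log above, here is a small Python
sketch that recomputes the block layout from the "count=32611 n=32600
rem=31946" line.  The function name block_layout is made up for this note
and is not part of rsync; it only does the arithmetic the log line implies.

```python
def block_layout(count, block_len, remainder):
    """Return (basis_length, last_offset) implied by a 'count= n= rem=' line.

    Hypothetical helper for illustration only; not rsync code.
    """
    # When there is a remainder, the last block is the short one.
    full_blocks = count - 1 if remainder else count
    basis_length = full_blocks * block_len + remainder
    # Blocks are numbered from 0, so the last one starts here.
    last_offset = (count - 1) * block_len
    return basis_length, last_offset


basis_length, last_offset = block_layout(32611, 32600, 31946)
print(basis_length, last_offset)
```

The last offset works out to 1063086000, which matches chunk[32610].  The
implied length is 1063117946, two bytes more than the len=1063117944 reported
for the sender's file.  That would be consistent with the checksums describing
the copy already on the receiving side rather than the sender's file, though
that is an inference from the log and not something I have confirmed.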




> One potential "fix" you could try is to change the initialization of
> the "lull_mod" value in generator.c to 1.  That would make it call
> maybe_send_keepalive() after every file.
> 
> ..wayne..

I'm assuming that wouldn't help this bug, since my test is only sending one
file.
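
To spell out the reasoning, here is a toy Python model.  It assumes the
generator reaches the keep-alive check once every lull_mod files (when the
file index is a multiple of lull_mod).  That is my reading of the suggestion,
not a copy of the code in generator.c, and keepalive_checks is a made-up name.

```python
def keepalive_checks(num_files, lull_mod):
    """Count how often a per-file loop would reach the keep-alive check.

    Toy model: assumes the check runs when (file index % lull_mod) == 0.
    """
    return sum(1 for i in range(num_files) if i % lull_mod == 0)


# One file: the check is reached the same number of times either way.
print(keepalive_checks(1, 1), keepalive_checks(1, 50))
# Many files: lull_mod=1 reaches the check far more often.
print(keepalive_checks(1000, 1), keepalive_checks(1000, 50))
```

Under that assumption, a single-file transfer reaches the check the same
number of times whatever lull_mod is, so the change would only matter for
transfers with many files.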

