Compression error (Re: rsync hangs during transfer)

Matt McCutchen matt at mattmccutchen.net
Sun Dec 13 21:47:23 MST 2009


On Fri, 2009-12-11 at 08:58 +0100, Dimitar Dimitrov wrote:
> I am running the following rsync command to synchronize directories
> between two servers: 
> rsync -axvz --delete-after -e ssh /SRCDIR/ blabla at DEST:/DESTDIR
> 
> The transfer starts and after a short while it appears to hang after
> some files have been transferred. The process establishes connection
> on both sides so I did an strace from the remote machine (using the
> rsync-debug script as described in the troubleshooting procedures).
> The strace output is here
> http://www.adrive.com/public/142cbf351c4b73a47c6e54ec3302b856041957d612ca68f440d16eccf950225c.html 
> 
> rsync --version on both machines shows 'rsync  version 2.6.9  protocol
> version 29' and both machines are running Debian.
> 
> I would appreciate some ideas on where to look next.

I don't know about the hang.  I see rsync working for a few minutes and
then exiting when the receiver encounters an error:

write(1, "\34\0\0\10inflate (token) returned -5\n", 32) = 32

This is the "compression error" that has occasionally come up before:

http://lists.samba.org/archive/rsync/2004-December/011119.html
http://lists.samba.org/archive/rsync/2006-May/015621.html
http://lists.samba.org/archive/rsync/2008-September/021668.html
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=528730

And after some study, I think I figured out why it is happening.  The
error message comes from "see_deflate_token" in token.c, and the code -5
is Z_BUF_ERROR, meaning that the inflate call didn't make progress.  But
a CHUNK_SIZE output buffer is always provided, and input is provided
unless len == 0 at the start of the loop.  If len == 0, the loop would
have exited unless the previous "inflate" filled the output buffer, in
which case we want to call it again to obtain any remaining output.  But
if the data block was exactly CHUNK_SIZE (32816), it would fill the
output buffer with nothing remaining, and the next call to "inflate"
would return Z_BUF_ERROR.

This case is in fact mentioned in the zlib FAQ
(http://www.zlib.net/zlib_faq.html#faq05): "A Z_BUF_ERROR may in fact be
unavoidable depending on how the functions are used, since it is not
possible to tell whether or not there is more output pending when
strm.avail_out returns with zero."

The block size is indeed 32816, as one can see in the second 32-bit
field of the sum head.  The sum head is shown in the following line
after two 6-byte itemizations:

[pid  5213] write(1, "\252\26\0\0\10\0\262\26\0\0\f\2002\200\0\0000\200\0\0\3"..., 4092) = 4092

Here's a simple script to reproduce the problem:

#!/bin/bash
head -c 32816 /dev/zero >srcfile
cp srcfile destfile
rsync -I -z --no-whole-file --block-size=32816 srcfile destfile

So, we need to do something to "see_deflate_token".  It would probably
work to ignore the Z_BUF_ERROR and let the loop exit because the output
buffer wasn't filled.

It seems inconsistent that the sender uses Z_INSERT_ONLY while the
receiver uses this hack of synthesizing part of the compressed stream.
(Previously, I had imagined Z_INSERT_ONLY worked on both sides.)

-- 
Matt



More information about the rsync mailing list