"Invalid file index" failures, suspicious index numbers

Mailing list subscriptions ushootoo at unixhosts.net
Wed Oct 11 11:01:33 GMT 2006


I've trying to do a backup from a remote host to a local machine via  
rsync using the following configuration:

Remote host (Red Hat Enterprise Linux ES3):

$ uname -srmpi
Linux 2.4.21-47.EL i686 athlon i386
$ cat /etc/redhat-release
Red Hat Enterprise Linux ES release 3 (Taroon Update 8)
$ rsync --version
rsync  version 2.6.8  protocol version 29

Local machine (Mac OS X 10.4.8, Intel):

$ uname -v
Darwin Kernel Version 8.8.1: Mon Sep 25 19:42:00 PDT 2006;  
root:xnu-792.13.8.obj~1/RELEASE_I38
$ rsync --version
rsync  version 2.6.8  protocol version 29

The backup is started from the local machine with root privs (using  
sudo), encrypted transport provided by SSH with public-key  
authentication; the invocation looks like this:

sudo rsync -e "ssh -i full_path_to_key_file" -avzx \
            --numeric-ids --delete --progress \
            root at example.com:/ /var/root/backups/example.com/

On the remote host, rsync logs in with root privileges and a forced  
command appears in the authorized keys file; logging shows that this  
is the actual command being sent to the server:

rsync --server --sender -vlogDtprxz --numeric-ids . /

Now, I think the SSH side of things is working fine because dry runs  
(with the -n switch) work fine but as soon as I try to do a real  
backup I always bail with an error like this:

Invalid file index: -1610612736 (count=9406) [sender]
rsync error: protocol incompatibility (code 2) at sender.c(169)  
[sender=2.6.8]
rsync: writefd_unbuffered failed to write 4092 bytes [generator]:  
Broken pipe (32)rsync: connection unexpectedly closed (195901 bytes  
received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c 
(463) [receiver=2.6.8]
rsync error: error in rsync protocol data stream (code 12) at io.c 
(1119) [generator=2.6.8]

Sometimes the numbers are different (-1610612736, 1610612736 or  
524288) but the error is always the same: an "Invalid file index"  
error and rsync subsequently dies.

In a previous post to the list (http://lists.samba.org/archive/rsync/ 
2006-June/015828.html) Wayne wrote this:

> This number is exactly 0x60000000, so another possibility that  
> comes to mind is that the byte-order messed up somehow. Look in  
> byteorder.h and make sure that CAREFUL_ALIGNMENT is being defined  
> on any system that uses most-significant-byte-first ordering.

I'm getting the same number (1610612736), and I also note that 524288  
is a suspicious-looking number as well (0x80000). I don't think there  
is a byte-ordering issue at play here. The configure script correctly  
identifies the build machine as little-endian (i386) and even if I  
override things and set CAREFUL_ALIGNMENT rsync still bails with  
those invalid file index errors.

Some warnings were produced during the build but they all look  
harmless to me. Things like:

rsync.c: In function ‘sig_int’:
rsync.c:231: warning: unused parameter ‘val’

This is because __attribute__((__unused__)) is omitted on the Mac OS  
X builds because of a bug in GCC; incidentally, that bug is no longer  
present in the current version:

$ gcc --version
i686-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build  
5363)

And some other warnings like this:

log.c: In function ‘rwrite’:
log.c:307: warning: passing argument 2 of ‘libiconv’ from  
incompatible pointer type

But that's just complaining because it expects a "const char*" and  
it's being passed a "char *", and it looks harmless enough to me.

"make test" succeeds with no failures. I also tried all this out with  
a copy of rsync built from yesterday's CVS and still the same  
problems...

Any ideas for how I could troubleshoot this further? I've tried  
running the backup with super verbosity (-vvv) at both ends; not sure  
if this will be of any use, but here are some relevant bits of output:

* It transfers the files list like so; this continues on for many  
thousands of lines:

receiving file list ...
server_sender starting pid=8651
[sender] make_file(.,*,2)
recv_file_name(.)
[sender] make_file(lost+found,*,2)
recv_file_name(lost+found)
[sender] make_file(boot,*,2)
recv_file_name(boot)
[sender] make_file(dev,*,2)
recv_file_name(dev)
[sender] make_file(proc,*,2)
recv_file_name(proc)
[sender] make_file(tmp,*,2)
recv_file_name(tmp)
[sender] make_file(var,*,2)
recv_file_name(var)
[sender] make_file(etc,*,2)
recv_file_name(etc)
[sender] make_file(root,*,2)
recv_file_name(root)
[sender] make_file(usr,*,2)
recv_file_name(usr)
[sender] make_file(lib,*,2)
recv_file_name(lib)
[sender] make_file(bin,*,2)
recv_file_name(bin)
[sender] make_file(home,*,2)
recv_file_name(home)
[sender] make_file(initrd,*,2)
recv_file_name(initrd)
[sender] make_file(mnt,*,2)
recv_file_name(mnt)
[sender] make_file(opt,*,2)
recv_file_name(opt)
[sender] make_file(sbin,*,2)
recv_file_name(sbin)
[sender] make_file(misc,*,2)
recv_file_name(misc)
[sender] make_file(nsr,*,2)
recv_file_name(nsr)
[sender] make_file(.autofsck,*,2)
recv_file_name(.autofsck)
[sender] make_file(dev/pts,*,2)
recv_file_name(dev/pts)

* After about 320,000 lines this phase comes to an end and the next  
one starts:

[sender] make_file(nsr/cores/nsrexecd,*,2)
recv_file_name(nsr/cores/nsrexecd)
[sender] make_file(nsr/cores/nsrexecd/.nsr,*,2)
recv_file_name(nsr/cores/nsrexecd/.nsr)
received 161416 names
161416 files to consider
recv_file_list done
get_local_name count=161416 /var/root/backups/example.com/
generator starting pid=7608 count=161416
deleting in .
recv_files(161416) starting
delete_in_dir(.)
send_file_list done
send_files starting
[generator] make_file(..autofsck.RLTEQ5,*,2)
[generator] make_file(bin,*,2)
[generator] make_file(boot,*,2)
[generator] make_file(dev,*,2)
[generator] make_file(etc,*,2)
[generator] make_file(home,*,2)
[generator] make_file(lib,*,2)
deleting ..autofsck.RLTEQ5
delete_in_dir(bin)
[generator] make_file(bin/awk,*,2)
[generator] make_file(bin/bash2,*,2)
[generator] make_file(bin/bsh,*,2)
[generator] make_file(bin/csh,*,2)
[generator] make_file(bin/dnsdomainname,*,2)
[generator] make_file(bin/domainname,*,2)
[generator] make_file(bin/egrep,*,2)
[generator] make_file(bin/ex,*,2)
[generator] make_file(bin/fgrep,*,2)
[generator] make_file(bin/gtar,*,2)
[generator] make_file(bin/nisdomainname,*,2)
[generator] make_file(bin/rbash,*,2)

* This continues up to about the 348,000-line mark and we see "delta- 
transmission enabled":

delete_in_dir(lib)
delta-transmission enabled
recv_generator(.,0)
set modtime of . to (1153412202) Thu Jul 20 18:16:42 2006
recv_generator(.autofsck,1)
recv_generator(bin,2)
recv_generator(bin/arch,3)
recv_generator(bin/ash,4)
recv_generator(bin/ash.static,5)
recv_generator(bin/awk,6)
bin/awk is uptodate
recv_generator(bin/basename,7)
recv_generator(bin/bash,8)
recv_generator(bin/bash2,9)
bin/bash2 is uptodate
recv_generator(bin/bsh,10)
bin/bsh is uptodate

* After a while there are a whole bunch of "mknod" calls; up to this  
point we produce about 350,000 lines of log statements:

recv_generator(dev/hdo19,974)
mknod(dev/hdo19,060660,0x59000013)
set modtime of dev/hdo19 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo19 from 0 to 6
recv_generator(dev/hdo2,975)
mknod(dev/hdo2,060660,0x59000002)
set modtime of dev/hdo2 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo2 from 0 to 6
recv_generator(dev/hdo20,976)

* At this point it appears that the directories and device nodes have  
all been created, so it starts to actually send/receive regular files  
and very soon runs into the "Invalid file index" failure:

send_files(1, /.autofsck)
recv_files(.autofsck)
mknod(dev/hdo20,060660,0x59000014)
.autofsck
set modtime of dev/hdo20 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo20 from 0 to 6
recv_generator(dev/hdo21,977)
send_files(0, /.)
send_files(0, /.)
mknod(dev/hdo21,060660,0x59000015)
send_files(0, /.)
set modtime of dev/hdo21 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo21 from 0 to 6
recv_generator(dev/hdo22,978)
            0 100%    0.00kB/s    0:00:00 (xfer#1, to- 
check=161414/161416)
send_files(0, /.)
mknod(dev/hdo22,060660,0x59000016)
Invalid file index: -1610612736 (count=161416) [sender]
set modtime of dev/hdo22 to (1088100278) Thu Jun 24 20:04:38 2004
rsync error: protocol incompatibility (code 2) at sender.c(169)  
[sender=2.6.8]
_exit_cleanup(code=2, file=sender.c, line=169): about to call exit(2)
set gid of dev/hdo22 from 0 to 6
recv_generator(dev/hdo23,979)
mknod(dev/hdo23,060660,0x59000017)

* Finally we see a bunch more mknod calls (about 3,000 lines' worth):

set modtime of dev/hdo23 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo23 from 0 to 6
recv_generator(dev/hdo24,980)
mknod(dev/hdo24,060660,0x59000018)
set modtime of dev/hdo24 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo24 from 0 to 6
recv_generator(dev/hdo25,981)
mknod(dev/hdo25,060660,0x59000019)

* And we ultimately die like this:

set modtime of dev/ptydd to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/ptydd from 0 to 5
recv_generator(dev/ptyde,1822)
mknod(dev/ptyde,020666,0x20000ee)
set modtime of dev/ptyde to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/ptyde from 0 to 5
rsync: connection unexpectedly closed (3208995 bytes received so far)  
[generator]
rsync error: error in rsync protocol data stream (code 12) at io.c 
(463) [generator=2.6.8]
_exit_cleanup(code=12, file=io.c, line=463): about to call exit(12)

So, not sure if there's anything in there which may explain the  
failure. Please let me know if there's any more information that  
would be useful.



More information about the rsync mailing list