"Invalid file index" failures, suspicious index numbers
Mailing list subscriptions
ushootoo at unixhosts.net
Wed Oct 11 11:01:33 GMT 2006
I've trying to do a backup from a remote host to a local machine via
rsync using the following configuration:
Remote host (Red Hat Enterprise Linux ES3):
$ uname -srmpi
Linux 2.4.21-47.EL i686 athlon i386
$ cat /etc/redhat-release
Red Hat Enterprise Linux ES release 3 (Taroon Update 8)
$ rsync --version
rsync version 2.6.8 protocol version 29
Local machine (Mac OS X 10.4.8, Intel):
$ uname -v
Darwin Kernel Version 8.8.1: Mon Sep 25 19:42:00 PDT 2006;
root:xnu-792.13.8.obj~1/RELEASE_I38
$ rsync --version
rsync version 2.6.8 protocol version 29
The backup is started from the local machine with root privs (using
sudo), encrypted transport provided by SSH with public-key
authentication; the invocation looks like this:
sudo rsync -e "ssh -i full_path_to_key_file" -avzx \
--numeric-ids --delete --progress \
root at example.com:/ /var/root/backups/example.com/
On the remote host, rsync logs in with root privileges and a forced
command appears in the authorized keys file; logging shows that this
is the actual command being sent to the server:
rsync --server --sender -vlogDtprxz --numeric-ids . /
Now, I think the SSH side of things is working fine because dry runs
(with the -n switch) work fine but as soon as I try to do a real
backup I always bail with an error like this:
Invalid file index: -1610612736 (count=9406) [sender]
rsync error: protocol incompatibility (code 2) at sender.c(169)
[sender=2.6.8]
rsync: writefd_unbuffered failed to write 4092 bytes [generator]:
Broken pipe (32)rsync: connection unexpectedly closed (195901 bytes
received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c
(463) [receiver=2.6.8]
rsync error: error in rsync protocol data stream (code 12) at io.c
(1119) [generator=2.6.8]
Sometimes the numbers are different (-1610612736, 1610612736 or
524288) but the error is always the same: an "Invalid file index"
error and rsync subsequently dies.
In a previous post to the list (http://lists.samba.org/archive/rsync/
2006-June/015828.html) Wayne wrote this:
> This number is exactly 0x60000000, so another possibility that
> comes to mind is that the byte-order messed up somehow. Look in
> byteorder.h and make sure that CAREFUL_ALIGNMENT is being defined
> on any system that uses most-significant-byte-first ordering.
I'm getting the same number (1610612736), and I also note that 524288
is a suspicious-looking number as well (0x80000). I don't think there
is a byte-ordering issue at play here. The configure script correctly
identifies the build machine as little-endian (i386) and even if I
override things and set CAREFUL_ALIGNMENT rsync still bails with
those invalid file index errors.
Some warnings were produced during the build but they all look
harmless to me. Things like:
rsync.c: In function ‘sig_int’:
rsync.c:231: warning: unused parameter ‘val’
This is because __attribute__((__unused__)) is omitted on the Mac OS
X builds because of a bug in GCC; incidentally, that bug is no longer
present in the current version:
$ gcc --version
i686-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build
5363)
And some other warnings like this:
log.c: In function ‘rwrite’:
log.c:307: warning: passing argument 2 of ‘libiconv’ from
incompatible pointer type
But that's just complaining because it expects a "const char*" and
it's being passed a "char *", and it looks harmless enough to me.
"make test" succeeds with no failures. I also tried all this out with
a copy of rsync built from yesterday's CVS and still the same
problems...
Any ideas for how I could troubleshoot this further? I've tried
running the backup with super verbosity (-vvv) at both ends; not sure
if this will be of any use, but here are some relevant bits of output:
* It transfers the files list like so; this continues on for many
thousands of lines:
receiving file list ...
server_sender starting pid=8651
[sender] make_file(.,*,2)
recv_file_name(.)
[sender] make_file(lost+found,*,2)
recv_file_name(lost+found)
[sender] make_file(boot,*,2)
recv_file_name(boot)
[sender] make_file(dev,*,2)
recv_file_name(dev)
[sender] make_file(proc,*,2)
recv_file_name(proc)
[sender] make_file(tmp,*,2)
recv_file_name(tmp)
[sender] make_file(var,*,2)
recv_file_name(var)
[sender] make_file(etc,*,2)
recv_file_name(etc)
[sender] make_file(root,*,2)
recv_file_name(root)
[sender] make_file(usr,*,2)
recv_file_name(usr)
[sender] make_file(lib,*,2)
recv_file_name(lib)
[sender] make_file(bin,*,2)
recv_file_name(bin)
[sender] make_file(home,*,2)
recv_file_name(home)
[sender] make_file(initrd,*,2)
recv_file_name(initrd)
[sender] make_file(mnt,*,2)
recv_file_name(mnt)
[sender] make_file(opt,*,2)
recv_file_name(opt)
[sender] make_file(sbin,*,2)
recv_file_name(sbin)
[sender] make_file(misc,*,2)
recv_file_name(misc)
[sender] make_file(nsr,*,2)
recv_file_name(nsr)
[sender] make_file(.autofsck,*,2)
recv_file_name(.autofsck)
[sender] make_file(dev/pts,*,2)
recv_file_name(dev/pts)
* After about 320,000 lines this phase comes to an end and the next
one starts:
[sender] make_file(nsr/cores/nsrexecd,*,2)
recv_file_name(nsr/cores/nsrexecd)
[sender] make_file(nsr/cores/nsrexecd/.nsr,*,2)
recv_file_name(nsr/cores/nsrexecd/.nsr)
received 161416 names
161416 files to consider
recv_file_list done
get_local_name count=161416 /var/root/backups/example.com/
generator starting pid=7608 count=161416
deleting in .
recv_files(161416) starting
delete_in_dir(.)
send_file_list done
send_files starting
[generator] make_file(..autofsck.RLTEQ5,*,2)
[generator] make_file(bin,*,2)
[generator] make_file(boot,*,2)
[generator] make_file(dev,*,2)
[generator] make_file(etc,*,2)
[generator] make_file(home,*,2)
[generator] make_file(lib,*,2)
deleting ..autofsck.RLTEQ5
delete_in_dir(bin)
[generator] make_file(bin/awk,*,2)
[generator] make_file(bin/bash2,*,2)
[generator] make_file(bin/bsh,*,2)
[generator] make_file(bin/csh,*,2)
[generator] make_file(bin/dnsdomainname,*,2)
[generator] make_file(bin/domainname,*,2)
[generator] make_file(bin/egrep,*,2)
[generator] make_file(bin/ex,*,2)
[generator] make_file(bin/fgrep,*,2)
[generator] make_file(bin/gtar,*,2)
[generator] make_file(bin/nisdomainname,*,2)
[generator] make_file(bin/rbash,*,2)
* This continues up to about the 348,000-line mark and we see "delta-
transmission enabled":
delete_in_dir(lib)
delta-transmission enabled
recv_generator(.,0)
set modtime of . to (1153412202) Thu Jul 20 18:16:42 2006
recv_generator(.autofsck,1)
recv_generator(bin,2)
recv_generator(bin/arch,3)
recv_generator(bin/ash,4)
recv_generator(bin/ash.static,5)
recv_generator(bin/awk,6)
bin/awk is uptodate
recv_generator(bin/basename,7)
recv_generator(bin/bash,8)
recv_generator(bin/bash2,9)
bin/bash2 is uptodate
recv_generator(bin/bsh,10)
bin/bsh is uptodate
* After a while there are a whole bunch of "mknod" calls; up to this
point we produce about 350,000 lines of log statements:
recv_generator(dev/hdo19,974)
mknod(dev/hdo19,060660,0x59000013)
set modtime of dev/hdo19 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo19 from 0 to 6
recv_generator(dev/hdo2,975)
mknod(dev/hdo2,060660,0x59000002)
set modtime of dev/hdo2 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo2 from 0 to 6
recv_generator(dev/hdo20,976)
* At this point it appears that the directories and device nodes have
all been created, so it starts to actually send/receive regular files
and very soon runs into the "Invalid file index" failure:
send_files(1, /.autofsck)
recv_files(.autofsck)
mknod(dev/hdo20,060660,0x59000014)
.autofsck
set modtime of dev/hdo20 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo20 from 0 to 6
recv_generator(dev/hdo21,977)
send_files(0, /.)
send_files(0, /.)
mknod(dev/hdo21,060660,0x59000015)
send_files(0, /.)
set modtime of dev/hdo21 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo21 from 0 to 6
recv_generator(dev/hdo22,978)
0 100% 0.00kB/s 0:00:00 (xfer#1, to-
check=161414/161416)
send_files(0, /.)
mknod(dev/hdo22,060660,0x59000016)
Invalid file index: -1610612736 (count=161416) [sender]
set modtime of dev/hdo22 to (1088100278) Thu Jun 24 20:04:38 2004
rsync error: protocol incompatibility (code 2) at sender.c(169)
[sender=2.6.8]
_exit_cleanup(code=2, file=sender.c, line=169): about to call exit(2)
set gid of dev/hdo22 from 0 to 6
recv_generator(dev/hdo23,979)
mknod(dev/hdo23,060660,0x59000017)
* Finally we see a bunch more mknod calls (about 3,000 lines' worth):
set modtime of dev/hdo23 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo23 from 0 to 6
recv_generator(dev/hdo24,980)
mknod(dev/hdo24,060660,0x59000018)
set modtime of dev/hdo24 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo24 from 0 to 6
recv_generator(dev/hdo25,981)
mknod(dev/hdo25,060660,0x59000019)
* And we ultimately die like this:
set modtime of dev/ptydd to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/ptydd from 0 to 5
recv_generator(dev/ptyde,1822)
mknod(dev/ptyde,020666,0x20000ee)
set modtime of dev/ptyde to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/ptyde from 0 to 5
rsync: connection unexpectedly closed (3208995 bytes received so far)
[generator]
rsync error: error in rsync protocol data stream (code 12) at io.c
(463) [generator=2.6.8]
_exit_cleanup(code=12, file=io.c, line=463): about to call exit(12)
So, not sure if there's anything in there which may explain the
failure. Please let me know if there's any more information that
would be useful.
More information about the rsync
mailing list