rsync stalls -- sleeps indefinitely

Yaroslav Halchenko yoh at psychology.rutgers.edu
Tue Oct 3 14:25:45 GMT 2006


Dear Developers,

First of all, I would like to thank you for a nice tool.

I have been using rsync within the BackupPC project to back up my remote
hosts. It worked fine until (I think) I moved the RAID to another box,
so that the source directory (/raid/research) is now actually NFS-mounted.
Now the backup (the rsync process) stalls and sleeps indefinitely.

On the backup server:

backuppc 28878  0.0  0.3   8160  3724 ?        S    Sep12   0:13   /usr/bin/perl /usr/share/backuppc/bin/BackupPC -d
backuppc 28879  0.0  0.7  11388  7736 ?        S    Sep12   9:31     /usr/bin/perl /usr/share/backuppc/bin/BackupPC_trashClean
backuppc 25861  0.0  6.1 1212496 63864 ?       S    Sep20   3:35     /usr/bin/perl /usr/share/backuppc/bin/BackupPC_dump ravana
backuppc 25908  0.0  0.0   4260   488 ?        S    Sep20   0:16       /usr/bin/ssh -q -x -l backuppc ravana nice -n 10 sudo /usr/bin/rsync
    --server --sender --numeric-ids --perms --owner --group --devices --links --times --block-size=2048 --recursive -D -v
    --one-file-system --exclude /amd --exclude /dev --exclude /tmp --exclude /mnt --exclude /proc --exclude /sys --exclude \*.avi
    --exclude \*.deb --exclude \*.mp3 --exclude \*subj\*/epi/svm.\* --exclude \*/\(neurosci\|chemistry\|campus\)/\*/\*data
    --exclude \*/\(#\|\[nN\]obackup_\)\* --ignore-times . /raid/research/

and on the data source:

root      9933  0.0  0.0   7748  2304 ?        Ss   Sep20   0:00     sshd: backuppc [priv]
backuppc  9936  0.0  0.0   7912  1600 ?        S    Sep20   0:19       sshd: backuppc@notty
root      9939  0.0 10.0 853464 825536 ?       SNs  Sep20  13:55         /usr/bin/rsync --server --sender --numeric-ids
    --perms --owner --group --devices --links --times --block-size=2048 --recursive -D -v --one-file-system --exclude /amd
    --exclude /dev --exclude /tmp --exclude /mnt --exclude /proc --exclude /sys --exclude *.avi --exclude *.deb --exclude *.mp3
    --exclude *subj*/epi/svm.* --exclude */(neurosci|chemistry|campus)/*/*data --exclude */(#|[nN]obackup_)*
    --ignore-times . /raid/research/

The rsync process has no open files, only sockets:

> ls -l /proc/9939/fd/
total 3
1 lrwx------ 1 root root 64 Oct  3 08:00 0 -> socket:[207797376]
1 lrwx------ 1 root root 64 Oct  3 08:00 1 -> socket:[207797376]
1 lrwx------ 1 root root 64 Oct  3 08:00 2 -> socket:[207797378]
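
For repeated checks, the same inspection could be scripted. The following is a minimal sketch, not part of the original session: the function name classify_fds is hypothetical, and it assumes a Linux /proc layout in which every entry of /proc/PID/fd is a symlink named by descriptor number. Reading another user's fd directory normally requires root.

```python
import os


def classify_fds(fd_dir):
    """Map each descriptor number in fd_dir to a (kind, target) pair.

    fd_dir is expected to look like /proc/<pid>/fd: every entry is a
    symlink whose name is the descriptor number.
    """
    result = {}
    for name in sorted(os.listdir(fd_dir), key=int):
        target = os.readlink(os.path.join(fd_dir, name))
        if target.startswith("socket:"):
            kind = "socket"
        elif target.startswith("pipe:"):
            kind = "pipe"
        elif target.startswith("/"):
            kind = "file"   # regular file, directory or device node
        else:
            kind = "other"  # e.g. anon_inode entries
        result[int(name)] = (kind, target)
    return result
```

Calling classify_fds("/proc/9939/fd") on the data source would be expected to report three "socket" entries for the listing above and no "file" entries.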

> strace -fF -p 9939
Process 9939 attached - interrupt to quit
select(1, [0], [], NULL, {10, 236000})  = 0 (Timeout)
select(1, [0], [], NULL, {60, 0}
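
To help read this trace: select(1, [0], [], NULL, {60, 0}) waits up to 60 seconds for descriptor 0 to become readable, and "= 0 (Timeout)" means nothing arrived. The sketch below is only an illustration under stated assumptions; it uses a local socket pair rather than rsync or ssh, and the helper names wait_readable and demo are hypothetical. It shows the three cases: an idle peer gives a timeout, pending data makes the socket readable, and a cleanly closed peer also makes it readable (the following read returns end-of-file).

```python
import select
import socket


def wait_readable(sock, timeout):
    """Return True if sock becomes readable within timeout seconds."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


def demo():
    a, b = socket.socketpair()
    states = []
    states.append(wait_readable(a, 0.1))  # idle peer: select times out
    b.sendall(b"x")
    states.append(wait_readable(a, 0.1))  # data pending: readable
    a.recv(1)                             # consume the byte
    b.close()
    states.append(wait_readable(a, 0.1))  # peer closed: readable (EOF)
    states.append(a.recv(1))              # read at EOF returns b""
    a.close()
    return states
```

Repeated timeouts like the ones in the trace would therefore be more consistent with a connection that is still open but silent than with one the peer has closed cleanly, although a link that died without notification could look the same.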

The backtrace of the rsync process is uninformative:

(gdb) bt
#0  0xb7eae2f8 in select () from /lib/tls/libc.so.6
#1  0x080660e8 in ?? ()
#2  0x00000001 in ?? ()
#3  0xbfe59724 in ?? ()
#4  0xbfe596a4 in ?? ()
#5  0x00000000 in ?? ()

Does this mean that the backup server side has actually closed or dropped
the connection, and that this is why rsync doesn't proceed? Or might there
be other issues?

I am running Debian unstable with rsync 2.6.8-2.

Please advise which way to look. Maybe it would be worthwhile for me to
recompile rsync with debug symbols to get more details on where the
process stalls.

N.B. I have already asked on the BackupPC mailing list but unfortunately
got no answer:
http://article.gmane.org/gmane.comp.sysutils.backup.backuppc.general/8050

-- 
Yaroslav Halchenko
Research Assistant, Psychology Department, Rutgers-Newark
Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
        101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
Student  Ph.D. @ CS Dept. NJIT

