RSync hangs up on file transfer

Andre Alexander Bell andre.bell at gmx.de
Fri Aug 29 22:56:47 EST 2003


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello!

I'm using RSync to mirror from a server (gandalf) to a backup server (dwalin).
RSync fits my needs exactly, and I would like to thank you for your work on
it. Nevertheless, I have a problem that comes up from time to time.
The syncing of each partition is done by rsync calls from cron.daily.
This is one of these calls:

rsync -a -v -P --delete --numeric-ids --rsh="ssh" root@gandalf:/images/ \
	/backup/images &> /var/log/rsync/backup_gandalf_images

The other calls are similar and differ only in their paths. This last call
sometimes hangs: RSync terminates neither with nor without an error. The
RSync process is still 'running', but no files are transferred anymore. The
last lines in the logfile look like this:

cytometer/C12883-98/fisher/C12883-98_FEU_0073_0000_063x_MASK.tif
     1247804 100%    1.29MB/s    0:00:00

The file and directory at which this happens differ from time to time.
Depending on the number of files that need to be transferred (about 300k files
on that partition and about 90k files to be synced) I can kill, restart, kill
and restart again and again until RSync is finished, but this is not nice :(.
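
The kill-and-restart routine described above could be scripted roughly as
follows. This is only a sketch of a workaround, not a fix for the hang: the
helper name retry and the attempt count are made up for illustration, and
whether rsync's --timeout option actually fires for this particular hang is
an untested assumption.

```shell
#!/bin/sh
# retry MAX CMD [ARGS...]
# Hypothetical helper: run CMD up to MAX times, stopping at the first success.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        # Run the command exactly as given; success ends the loop.
        "$@" && return 0
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}
```

It might be called as

    retry 5 rsync -a --timeout=600 --delete --numeric-ids --rsh="ssh" \
        root@gandalf:/images/ /backup/images

where --timeout=600 asks rsync to give up after 600 seconds without I/O, so
that a stalled run would exit with an error instead of needing a manual kill.
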
While RSync was hung I got this output from free:

dwalin (Debian GNU/Linux, RSync 2.5.6 pv 26):
             total       used       free     shared    buffers     cached
Mem:        127496     123364       4132          0      16772      44104
- -/+ buffers/cache:      62488      65008
Swap:       497972     107600     390372

gandalf (SuSE GNU/Linux, RSync 2.5.6 pv 26):
             total       used       free     shared    buffers     cached
Mem:        256292     251052       5240          0      58360     113760
- -/+ buffers/cache:      78932     177360
Swap:       995988      44560     951428

So both systems still have swap space left.
I've read through the RSync man pages and the FAQ on the net. They suggest
experimenting with options such as blocking and non-blocking I/O and
bwlimit. This didn't change anything.
Furthermore, I left RSync alone for a whole weekend, but no further files
were processed.
Finally I did an strace run of RSync with:

strace -f -ff -o strace.out rsync -a -v -P --delete --numeric-ids \
	--rsh="ssh" root@gandalf:/images/ /backup/images/ \
	&> /var/log/rsync/backup_gandalf_images

The output files total about 300MB and contain the system calls from startup
to my final kill of RSync.
At the end of this output the following is repeated again and again:

select(8, [7], [4], NULL, {60, 0})     = 0 (Timeout)
select(8, [7], [4], NULL, {60, 0})     = 0 (Timeout)
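
With -ff and -o, strace writes one file per process (strace.out.PID), so it
can help to see which process was stuck in the select() loop shown above.
A minimal sketch, assuming the per-process files contain lines in exactly
the format quoted above; the function name longest_timeout_run is made up
for illustration:

```shell
#!/bin/sh
# longest_timeout_run FILE
# Print the length of the longest run of consecutive select() calls in FILE
# that returned "0 (Timeout)". Any other line ends the current run, so
# signal or exit markers written after the final kill do not hide the run.
longest_timeout_run() {
    awk '
        /^select\(.*= 0 \(Timeout\)[ \t]*$/ {
            n++
            if (n > max) max = n
            next
        }
        { n = 0 }
        END { print max + 0 }
    ' "$1"
}
```

Looping over the files, for example

    for f in strace.out.*; do echo "$f: $(longest_timeout_run "$f")"; done

prints one count per process. With the 60 second select() timeout seen
above, a count of N means that process waited roughly N minutes without any
activity on the descriptors it was watching.
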

I have no idea what causes this timeout.
If it would be of any value, I can offer the strace output files for
download or add any other information you may find useful here.
Do you have any suggestions for what I can do to find the problem?
Is there any other internet source I've missed and should read to get
rid of this problem?

Thanks in advance

- -- 
Andre Bell <andre.bell at gmx.de>
PGP-Public-Key: http://www.andre-bell.de/public_key.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/T02RnuHMhboRh6QRAlhoAJ4t2wD4RnWCcY4MTYYPY1+ZNyEUVwCeLJ1b
7luF+vxxwim0CbFaBl4F9/s=
=/qyu
-----END PGP SIGNATURE-----
