vanilla rsync 3.0.9 hangs after transferring ~2000 files

Christian Iversen ci at meebox.net
Fri Nov 2 07:33:13 MDT 2012


Hello rsync folks

I'm trying to use rsync for backing up our servers. This mostly works extremely well, with no problems.

However, one server is giving me a lot of trouble. It has a directory with (currently) 734088 files in it, and every time I try to back up this directory, rsync hangs after transferring roughly 2000 files. Sometimes it's around 1800, sometimes it's over 2100 (I think), but it's in that ballpark.

If I exclude the large directory, rsync completes the backup successfully (albeit incompletely, of course).

I'm running Debian Stable on both the client and server, fully updated. I thought maybe a Debian patch could be interfering, so I've tried vanilla 3.0.9 rsync straight from the tgz, but that gives the same problem.

Things I've already tried:

 - Different MTU
 - Disabling/enabling compression in rsync (-z)
 - Using --protocol=29
 - Other variations on arguments to rsync
 - Simply waiting for it to finish (it will sit there for literally days).

This is what it looks like with -vvv:

...
false_alarms=0 hash_hits=0 matches=0
sender finished data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f272336db751064bc808f3c5131f84f.jpg
recv_generator(data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f2820d5a1c9d1d2b01988138523fe9c.jpg,54940)
send_files(54940, data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f2820d5a1c9d1d2b01988138523fe9c.jpg)
send_files mapped data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f2820d5a1c9d1d2b01988138523fe9c.jpg of size 4866
calling match_sums data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f2820d5a1c9d1d2b01988138523fe9c.jpg
data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f2820d5a1c9d1d2b01988138523fe9c.jpg
sending file_sum
false_alarms=0 hash_hits=0 matches=0
sender finished data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f2820d5a1c9d1d2b01988138523fe9c.jpg
recv_generator(data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f299dc6bd996d789b3fc7e73b6e86dd.jpg,54941)
send_files(54941, data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f299dc6bd996d789b3fc7e73b6e86dd.jpg)
send_files mapped data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f299dc6bd996d789b3fc7e73b6e86dd.jpg of size 4488
calling match_sums data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f299dc6bd996d789b3fc7e73b6e86dd.jpg
data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f299dc6bd996d789b3fc7e73b6e86dd.jpg
sending file_sum
false_alarms=0 hash_hits=0 matches=0
sender finished data/www/virtual/_www.ageforce.dk/wwwroot/usrPictures/110x140_9f299dc6bd996d789b3fc7e73b6e86dd.jpg


After this it Simply Just Hangs.

With strace on the client and server, I can see that they are both stuck in a select() loop. I also tried running the client with ltrace, and after a GOOD long while, I got this output:

http://i.imgur.com/wYRDO.png

(I couldn't make copy-paste work from that terminal).
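
To record exactly where the output stops without watching the terminal, a small watchdog could wrap the rsync run. This is only a sketch, assuming a Unix host with Python 3. The name run_with_stall_detection and the idea of treating "no output for N seconds" as a hang are my own assumptions, not anything rsync provides.

```python
import os
import select
import subprocess
import sys


def run_with_stall_detection(cmd, stall_seconds):
    """Run cmd and kill it if it prints nothing for stall_seconds.

    Returns (stalled, line_count, last_line). Hypothetical helper,
    not part of rsync; Unix only, since it uses select() on a pipe.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    fd = proc.stdout.fileno()
    buf = b""
    line_count = 0
    last_line = ""
    stalled = False
    while True:
        # Wait until the child writes something, or the timeout expires.
        ready, _, _ = select.select([fd], [], [], stall_seconds)
        if not ready:
            stalled = True
            proc.kill()
            break
        chunk = os.read(fd, 65536)
        if not chunk:  # EOF: the child closed its output and is exiting
            break
        buf += chunk
        # Keep the trailing partial line in buf until it is completed.
        *lines, buf = buf.split(b"\n")
        for line in lines:
            line_count += 1
            last_line = line.decode(errors="replace")
    if buf:  # count an unterminated final line, if any
        line_count += 1
        last_line = buf.decode(errors="replace")
    proc.stdout.close()
    proc.wait()
    return stalled, line_count, last_line


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: watchdog.py SECONDS COMMAND [ARGS...]
    stalled, count, last = run_with_stall_detection(
        sys.argv[2:], float(sys.argv[1]))
    print(f"stalled={stalled} lines={count} last={last!r}")
```

It would be run as something like "python3 watchdog.py 600 rsync -a -vvv SRC DEST", where watchdog.py, the 600-second limit, SRC and DEST are all placeholders. The last line it reports is the last thing rsync printed before going quiet.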



Do you have any ideas what the problem might be? Or how I can help debug it? Right now we have a customer we simply cannot do a backup for, which is pretty bad :-)

Thanks in advance for any input.

-- 
Best regards,

Christian Iversen
Systemadministrator, Meebox.net

-------
This e-mail may contain confidential
information. If you are not the intended recipient,
please return and delete this e-mail.
------- 

