vanilla rsync 3.0.9 hangs after transferring ~2000 files

Justin T Pryzby justinp at norchemlab.com
Thu Nov 8 16:14:16 MST 2012


On Thu, Nov 08, 2012 at 11:54:52PM +0100, Christian Iversen wrote:
> On 2012-11-02 19:32, Justin T Pryzby wrote:
> >On Fri, Nov 02, 2012 at 02:33:13PM +0100, Christian Iversen wrote:
> >>However, 1 server is giving me a lot of trouble. It has a directory
> >>with (currently) 734088 files in it, and every time I try to backup
> >>this dir, rsync hangs after transferring roughly 2000 files.
> >>Sometimes it's around 1800, sometimes it's over 2100 (I think), but
> >>it's in that ballpark.

> That was a bit tricky, since the whole process hangs shortly after
> being started (the files are not that big). But I managed to strace
> -f -p $PID1 -p $PID2 the whole thing.
Do you know you can strace something while starting it?
strace -f rsync -avz /path/dir $dest 2>/strace/rsync
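
A variant that may be easier to read afterwards, as a rough sketch only:
the function name trace_rsync and the log path are made up for
illustration.  -f, -tt and -o are ordinary strace options; -o keeps the
trace out of rsync's own stderr, and -tt timestamps each call so the
point where things stall is easier to spot.

```shell
# Hypothetical helper: start rsync under strace instead of attaching later.
# Usage: trace_rsync /path/dir "$dest"
trace_rsync() {
    # Assumed log location; change to taste.
    log=/tmp/rsync.strace

    # -f   follow child processes (rsync forks)
    # -tt  prefix every line with a wall-clock timestamp (microseconds)
    # -o   write the trace to a file rather than stderr
    strace -f -tt -o "$log" rsync -avz "$1" "$2"
}
```

When it hangs, the last lines of the log show which call each PID is
sitting in.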

> [pid 27310] select(6, [5], [], NULL, {60, 0}) = 1 (in [5], left {59,
> 999997})
> [pid 27310] read(5,
> "\4\0\0kA*\0\0\4\0\0kB*\0\0\4\0\0kC*\0\0\4\0\0kD*\0\0"..., 8184) =
> 208
> [pid 27310] select(6, [5], [1], [1], {60, 0}
> 
> 
> That {59, 99999x} looks highly suspicious. rsync wouldn't happen to
> hold open 100000 file handles, would it? Or something to that
> effect?

No, that means FD#5 has data waiting; and, of the 60 second timeout
given to "select", 59.999997 seconds remain (it returned almost
immediately).
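
To spell out the arithmetic: both pairs in the strace line are
{seconds, microseconds}.  A quick sketch with the numbers copied from
your trace (the variable names are just for illustration):

```shell
# Values from the strace line: timeout {60, 0}, left {59, 999997}
timeout_sec=60
timeout_usec=0
left_sec=59
left_usec=999997

# Time select() actually spent waiting, in microseconds
waited_usec=$(( (timeout_sec * 1000000 + timeout_usec) - (left_sec * 1000000 + left_usec) ))
echo "select() waited ${waited_usec} microseconds"
```

So that select waited about 3 microseconds before FD#5 became readable,
which is nothing unusual.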

BTW, you can check file descriptors in /proc/27310/fd/ (and, in recent
kernels, fdinfo/ has even more).  Also, symlinks in /proc are
"special".  E.g. if a process opens /file as FD#3, and then the user
"unlinks" that file, and creates a new file with the same name (but
different inode), referencing the pathname will access the new file.
But, /proc/pid/fd/3 will access the old file.
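
A small demonstration of that behaviour, as a sketch that assumes Linux
with /proc mounted; the scratch directory and file names are invented
for the example:

```shell
# Scratch directory, created just for this demonstration
tmpdir=$(mktemp -d)
f="$tmpdir/file"

echo "old contents" > "$f"
exec 3< "$f"                   # open the file as FD 3 in this shell
rm "$f"                        # unlink the original inode
echo "new contents" > "$f"     # same name, different inode

via_path=$(cat "$f")           # resolves the pathname: the new file
via_fd=$(cat "/proc/$$/fd/3")  # resolves the open descriptor: the old file

echo "path: $via_path"
echo "fd 3: $via_fd"

exec 3<&-                      # close FD 3
rm -rf "$tmpdir"
```

The pathname gives "new contents" while /proc/$$/fd/3 still gives "old
contents", even though ls -l on that symlink shows the target as
"(deleted)".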

Could you strace the rsync client, too?

> >What filesystem is it?  Can you create a tarball of the directory?
> 
> Tarball could be problematic. It's 8.6GB of data spread over 734088
> files in a single dir. Yes, that many. Yes, I know it's insane. No,
> I didn't choose it :)
> 
> >time c /path/to/dir |wc
> 
> What's "c"?
Oops, I meant:

time tar c /path/to/dir |wc

That creates a tarball piped to wc.  If a tarball is problematic, then
rsync may be a problem, too.  But it seems you were able to run "find"
on that dir just fine.

What filesystem is that?  I believe "huge dirs" is something *some*
filesystems specifically intend to handle gracefully, but some
filesystems handle poorly.  Some programs may handle it poorly, too :)
Could you also send the output of df -i for that partition?

Justin

