vanilla rsync 3.0.9 hangs after transferring ~2000 files
Justin T Pryzby
justinp at norchemlab.com
Thu Nov 8 16:14:16 MST 2012
On Thu, Nov 08, 2012 at 11:54:52PM +0100, Christian Iversen wrote:
> On 2012-11-02 19:32, Justin T Pryzby wrote:
> >On Fri, Nov 02, 2012 at 02:33:13PM +0100, Christian Iversen wrote:
> >>However, 1 server is giving me a lot of trouble. It has a directory
> >>with (currently) 734088 files in it, and every time I try to backup
> >>this dir, rsync hangs after transferring roughly 2000 files.
> >>Sometimes it's around 1800, sometimes it's over 2100 (I think), but
> >>it's in that ballpark.
> That was a bit tricky, since the whole process hangs shortly after
> being started (the files are not that big). But I managed to strace
> -f -p $PID1 -p $PID2 the whole thing.
Do you know you can strace something while starting it?
strace -f rsync -avz /path/dir $dest 2>/strace/rsync
> [pid 27310] select(6, [5], [], NULL, {60, 0}) = 1 (in [5], left {59, 999997})
> [pid 27310] read(5, "\4\0\0kA*\0\0\4\0\0kB*\0\0\4\0\0kC*\0\0\4\0\0kD*\0\0"..., 8184) = 208
> [pid 27310] select(6, [5], [1], [1], {60, 0}
>
>
> That {59, 99999x} looks highly suspicious. rsync wouldn't happen to
> hold open 100000 file handles, would it? Or something to that
> effect?
No, that means FD#5 has data waiting; and, of the 60 second timeout
given to "select", 59.999997 seconds remain (it returned almost
immediately).
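
To make the arithmetic explicit, here is a small sketch that takes the
two timeval pairs from the strace line above (the timeout passed in and
the "left" value) and computes how long select actually waited. The
variable names are only for illustration.

```shell
#!/bin/sh
# Values copied from the strace line: timeout {60, 0}, left {59, 999997}.
# Each pair is {seconds, microseconds}.
timeout_s=60; timeout_us=0
left_s=59;    left_us=999997

# Elapsed time = timeout given - time remaining, in microseconds.
elapsed_us=$(( (timeout_s * 1000000 + timeout_us) - (left_s * 1000000 + left_us) ))
echo "select returned after ${elapsed_us} microseconds"
```

So the call came back after about 3 microseconds, which is what you
would expect when data is already waiting on the descriptor.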
BTW, you can check file descriptors in /proc/27310/fd/ (and, in recent
kernels, fdinfo/ has even more). Also, the symlinks in /proc/PID/fd/ are
"special". E.g., if a process opens /file as FD#3, and then the user
unlinks that file and creates a new file with the same name (but a
different inode), referencing the pathname will access the new file,
but /proc/PID/fd/3 will still access the old file.
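
If you want to see that behaviour for yourself, something like the
following should show it on Linux. It is only a sketch: it uses a
throwaway temp directory and the shell's own FD 3 rather than anything
belonging to rsync.

```shell
#!/bin/sh
# Scratch directory so nothing real gets touched.
tmp=$(mktemp -d)

echo old > "$tmp/file"
exec 3< "$tmp/file"       # this shell now holds the file open as FD 3
rm "$tmp/file"            # unlink the name; the inode lives on while FD 3 is open
echo new > "$tmp/file"    # same name, different inode

by_path=$(cat "$tmp/file")        # follows the pathname: the new file
by_fd=$(cat "/proc/$$/fd/3")      # follows the /proc link: the old file

echo "via pathname:   $by_path"
echo "via /proc fd/3: $by_fd"

exec 3<&-                 # close FD 3
rm -rf "$tmp"
```

Running "ls -l /proc/PID/fd" on such a process would show the link
target with "(deleted)" appended.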
Could you strace the rsync client, too?
> >What filesystem is it? Can you create a tarball of the directory?
>
> Tarball could be problematic. It's 8.6GB of data spread over 734088
> files in a single dir. Yes, that many. Yes, I know it's insane. No,
> I didn't choose it :)
>
> >time c /path/to/dir |wc
>
> What's "c"?
Oops, I meant:
time tar c /path/to/dir |wc
That creates a tarball piped to wc. If a tarball is problematic, then
rsync may be a problem, too. But it seems you were able to run "find"
on that dir just fine.
What filesystem is that? I believe "huge dirs" is something *some*
filesystems specifically intend to handle gracefully, but others
handle poorly. Some programs may handle it poorly, too :)
Could you also send the output of df -i for that partition?
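
In case it helps, a rough sketch of collecting that information in one
go. The temp directory and its five files are just a stand-in so the
script runs anywhere; point "dir" at the real directory instead. The
"stat -f -c %T" form assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical stand-in directory; replace with the real path.
dir=$(mktemp -d)
for i in 1 2 3 4 5; do : > "$dir/file$i"; done

# Inode usage for the filesystem that holds the directory.
df -i "$dir"

# Filesystem type (GNU stat assumed; falls back to "unknown").
fstype=$(stat -f -c %T "$dir" 2>/dev/null) || fstype=unknown
echo "filesystem type: $fstype"

# Count entries without sorting them; ls -f also lists "." and "..",
# so subtract 2.
count=$(ls -f "$dir" | wc -l)
count=$((count - 2))
echo "entries: $count"

rm -rf "$dir"
```

Using "ls -f" avoids the sort, which matters when a directory holds
several hundred thousand entries.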
Justin