Rsync lock-up

Michael Kohne mhkohne at discordia.org
Wed Jun 18 23:40:22 EST 2003


jw schultz said:
> On Tue, Jun 17, 2003 at 05:16:11PM -0400, Michael Kohne wrote:
>> I'm getting some odd behaviour from rsync - a lockup when doing local
>> copies. I tried to search the list archives, but I only came up with a
>> couple of hits from 2001 indicating folks thought this (or a similar
>> issue) was fixed.
>>
>> Situation:
>> (OS is RedHat 7.2, Rsync rpm 2.4.6-5 and 2.5.5-1 exhibit the same
>> behaviour) rsync is copying on the same machine (local copy). The
>> destination is NOT a sub-directory of anything that's being copied. I
>> used xargs to give rsync a moderately large number of files (603). It
>> copies all the files, then locks up after the last copy. I can kill it
>> with control-c. Rsync is being run from a perl script I wrote which
>> first identifies the files to backup, then uses rsync to back them up
>> (hence the large file name list).
>>
>> The interesting part is that when I run my backup script from the
>> command line, it works fine. The problem comes when I have our program
>> run it (via fork & exec, with stdin/out to my program via pipes.)
>> rsync locks up. This does NOT happen when I do rsync to a remote
>> server via ssh (making me think this is related to the bugs I found in
>> the archive).
>>
>> Note that our program is based on pthreads, and uses two separate
>> pipes to talk to the backup script - one for data going TO the backup
>> script, one for data coming FROM the backup script. (Our program is a
>> daemon that allows our users to connect to it via a telnet-like
>> program. It then presents them with a very cheesy shell they can use
>> to run commands.)
>>
>> If anyone can give me a good idea as to what it is I'm doing to screw
>> up rsync, I'd appreciate hearing it. I assume it either has something
>> to do with the large number of files or with the interesting stdin/out
>> games I play.
>
> The large file count on the command-line is curious but I'm
> more inclined to think it is the stdin/out/err games.
>
>> My next step will be to play with trying to reduce the number of files
>> I'm passing to rsync, and if that doesn't work, I'll try writing some
>> code to play with stdin/out until I can get a failure in a smaller
>> environment. Other ideas are welcome.
>
> When you discover the cause or at least get a good test case
> let us know.
>
> It would be best if you worked with CVS and not an
> out-of-date, vendor-patched binary.
>

Finally figured the problem out. It turns out that our daemon wasn't
clearing the signal mask before exec'ing the child, so the child (and
rsync under it) inherited the daemon's mask. Rsync seems to use some
signals for the various processes to communicate with each other, and the
non-default signal mask was blocking one or more signals that rsync needed
to complete its shutdown. Who knows what other things wouldn't have
worked right for the same reason...

Thanks!

-- 
Michael Kohne        mhkohne at discordia.org
"You should be smarter than the equipment you are trying to operate." --
Matt Osborne




