system hanging

Jon Cooper CooperJ at westmancom.com
Mon May 27 09:35:02 EST 2002


I'm having a problem with one of my Red Hat boxes, and I'm not sure whether
it's an rsync problem or not.

The box is a mail server.  Every three hours a backup server pulls the mail
spools, and some other directories, from it with rsync over ssh.  The backup
server initiates the connection:

    rsync -avre /usr/bin/ssh --delete xxx.xxx.xxx.xxx:/var/spool/mail /var/spool

The rsync runs as root on both ends.

The system has hung about 6 times over the last 5 or 6 weeks.  Every time,
the hang has fallen within the 30 minutes that the rsync would be running.
When it hangs, I get no log entries of any failure, and nothing dumped to
the screen.  I can switch between ttys and type away to my heart's content,
but get no response.  The box is totally dead, and the only way to get it
back is the power switch.

I saw a post from Paul on Mar 19th that referenced a fork failure killing
all user processes.  It also referenced a bug that could lead to hundreds
of hung processes.
It looks like the system is restricted to 2047 user processes (ulimit -a).
I tried to reproduce the hang on a non-production box, but to no avail.

Is there a way that I can verify if one or both of these bugs is causing my
problems?
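
One thing I could try is logging the process count every minute, so that
the last lines written before the next hang show whether the count was
climbing toward the limit.  Below is only a rough sketch of that idea, not
something I have run on the mail server.  The log path, the 80% warning
threshold and the function names are all assumptions made up for
illustration; the 2047 default is just the figure from ulimit -a above.

```shell
#!/bin/sh
# Rough sketch, not a tested diagnostic: log process counts so that the
# last lines written before a hang show whether the count was climbing
# toward the limit. LOGFILE, LIMIT and WARN_PCT are assumed values.

LOGFILE="${LOGFILE:-/var/tmp/proc-count.log}"
LIMIT="${LIMIT:-2047}"        # max user processes reported by ulimit -a
WARN_PCT="${WARN_PCT:-80}"    # flag anything above this share of LIMIT

# Print "WARN" when count exceeds pct percent of limit, otherwise "ok".
check_threshold() {
    count=$1; limit=$2; pct=$3
    if [ $(( count * 100 )) -gt $(( limit * pct )) ]; then
        echo WARN
    else
        echo ok
    fi
}

# Build one log line from a timestamp, a total count and an rsync count.
format_line() {
    echo "$1 total=$2 rsync=$3 $(check_threshold "$2" "$LIMIT" "$WARN_PCT")"
}

# Take one sample and append it to LOGFILE.
log_sample() {
    total=$(ps -A -o pid= | wc -l | tr -d ' ')
    rsyncs=$(ps -A -o comm= | grep -c '^rsync$')
    format_line "$(date '+%Y-%m-%d %H:%M:%S')" "$total" "$rsyncs" >> "$LOGFILE"
    sync    # push the line to disk in case the box dies right after
}
```

To use it, the file would need a final line calling log_sample and a cron
entry that runs it once a minute.  If the box is already unable to fork,
the script itself will fail to run, so a gap in the log just before a hang
would also be a hint.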

At first I thought this might be a hardware issue, but after about 8 hours
on the phone with IBM we've pretty much ruled that out.  The server is
running Red Hat 7.2 with the latest rsync RPM from Red Hat
(rsync-2.4.6-13).

Thanks,
Jon Cooper
Network Analyst
Westman Business
a division of Westman Communications Group
204.726.8839
cooperj at westmancom.com




