A weird solution to the rsync zombie (hanging) processes on
win32/cygwin
Tevfik Karagülle
tevfik at itefix.no
Fri Nov 21 10:20:15 EST 2003
Hi,
I've been following the discussions about rsync hanging problems on cygwin for
a while. Since the problem occurs at the end of the pipeline, after the files
have been copied successfully, I thought we could use a kind of workaround:
kill those processes in a controlled way.
My solution is a perl script: zombrsync.pl. The script does the following:
<Endless loop start>
- Get a before-snapshot of all rsync processes
- Wait (60 secs by default, can be adjusted from the command line)
- Get an after-snapshot of all rsync processes
- Skip rsync daemons (no tty, and different Windows and posix pids,
something from cygrunsrv I guess!!)
- Check if the process used processor time (user+system) during our
snapshot interval
- If the answer is no, then this process is a zombie according to our
criteria - KILL IT.
<Endless loop end>
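The zombie criterion in the loop above can be tried without any Win32 modules. What follows is only a minimal sketch: the helper name idle_pids and the sample numbers are made up for illustration and are not part of zombrsync.pl. It assumes each snapshot has already been reduced to a hash mapping a Windows pid to its accumulated processor time (kernel + user).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of zombrsync.pl: return the pids that are
# present in both snapshots and whose processor time did not change.
sub idle_pids {
    my ($before, $after) = @_;
    my @idle;
    foreach my $pid (sort { $a <=> $b } keys %$after) {
        next unless exists $before->{$pid};    # started during the interval
        push @idle, $pid if $after->{$pid} == $before->{$pid};
    }
    return @idle;
}

# Made-up sample snapshots: windows pid => kernel + user time
my %before = (1204 => 5000, 1310 => 7200);
my %after  = (1204 => 5000, 1310 => 9100, 1422 => 300);

print "zombie candidates: ", join (', ', idle_pids (\%before, \%after)), "\n";
```

A process that shows up only in the after-snapshot is left alone here, since there is nothing to compare its processor time against.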
The script requires that you start rsync via cygrunsrv and that the cygwin ps
command is available. You also need Win32::Process::Info and all related perl
modules (Win32::API and so on !!)
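To show what the script takes from the ps command, here is a small sketch of picking the fields out of one line of "ps -a" output. The column order (PID PPID PGID WINPID TTY ...) is the one the script relies on. The helper names parse_ps_line and looks_like_daemon and the sample line are hypothetical, and the real output on your machine may look different.

```perl
use strict;
use warnings;

# Hypothetical helper: extract pid, winpid and tty from one "ps -a" line.
# Returns nothing for lines that do not end with "rsync".
sub parse_ps_line {
    my ($line) = @_;
    return unless $line =~ /rsync$/;
    my @field = split ' ', $line;    # PID PPID PGID WINPID TTY ...
    return { pid => $field[0], winpid => $field[3], tty => $field[4] };
}

# Hypothetical helper: the daemon test described above, that is no tty
# and different windows and posix pids.
sub looks_like_daemon {
    my ($p) = @_;
    return ($p->{winpid} != $p->{pid}) && ($p->{tty} eq '?');
}

# Made-up sample line, for illustration only
my $sample = "     1234       1    1234       5678  ?  1005 10:20:15 /usr/bin/rsync\n";
my $p = parse_ps_line ($sample);
printf "pid=%s winpid=%s tty=%s daemon=%s\n",
    $p->{pid}, $p->{winpid}, $p->{tty}, looks_like_daemon ($p) ? 'yes' : 'no';
```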
I tested it on a Windows 2000 machine where I had problems. It works quite
well. A compiled version (EXE file) can be made available on request. I also
have plans to enhance my cwRsync solution (http://itefix.no/cwrsync) with
this feature. Daemonizing, some tests and packaging must be done first.
I would appreciate it if some of you could test the script on your platforms
and give me feedback. Here's the script:
------------------------- zombrsync.pl start -------------------------
#######################################
#
# zombrsync.pl - Kills rsync zombie processes on Win32/cygwin platform
#
# v0.5 - Beta version (last one hopefully!), tevfik at itefix.no,
# http://itefix.no/itefix-en
#
# Usage : zombrsync.pl [-i | --interval <checkpoint interval in seconds>]
# (default 60 secs)
use strict;
use Win32::Process;
use Win32::Process::Info;
use Getopt::Long;
our $interval = 60;
our @winpid = ();
our %pid = ();
our %tty = ();
our ($pi, @apsinfo, @bpsinfo);
GetOptions ('interval=i' => \$interval)
or die "Usage : $0 [-i | --interval <seconds>]\n";
$pi = Win32::Process::Info->new ();
# Behave yourself as a stupid daemon !!
while (1) {
    # Get BEFORE-snapshot
    GetPslist ();
    @apsinfo = $pi->GetProcInfo (@winpid);
    sleep ($interval); # Wait
    # Get AFTER-snapshot
    GetPslist ();
    @bpsinfo = $pi->GetProcInfo (@winpid);
    # GetProcInfo returns one hash per process, so index both snapshots
    # by windows process id before comparing them
    my %aps = map { ($_->{'ProcessId'}, $_) } @apsinfo;
    my %bps = map { ($_->{'ProcessId'}, $_) } @bpsinfo;
    # Check zombies
    foreach my $process (@winpid) {
        # Do nothing if rsync is not attached to a tty AND
        # windows process id is not equal to posix process id (daemon?)
        next if (($process != $pid{$process}) && ($tty{$process} eq '?'));
        # Do nothing if the process is missing from one of the snapshots
        # (it started or exited during the interval)
        next unless $aps{$process} && $bps{$process};
        # Well, here we have a candidate, let's check if no processor time
        # is used during our checkpoint interval
        my $atime = $aps{$process}{'KernelModeTime'} + $aps{$process}{'UserModeTime'};
        my $btime = $bps{$process}{'KernelModeTime'} + $bps{$process}{'UserModeTime'};
        if ($atime == $btime) { # Got it, a zombie according to our criteria
            Win32::Process::KillProcess ($process, -1);
            print scalar (localtime ()) . ": rsync zombie process $process killed.\n";
        }
    }
}
#############################
#
# GetPslist : uses cygwin ps command to get a picture of process status
#
sub GetPslist {
    open PSFILE, "..\\ps -a |" or
        die "Problems during opening pipe : $!\n";
    # Initialize global structures
    @winpid = ();
    %pid = ();
    %tty = ();
    while (<PSFILE>) {
        next if ! (/rsync$/); # only interested in lines ending with rsync
        # ps -a output : PID PPID PGID WINPID TTY ....
        my @field = split; # a bare split no longer fills @_ in newer perls
        push @winpid, $field[3];
        $pid{$field[3]} = $field[0]; # winpid --> pid
        $tty{$field[3]} = $field[4]; # winpid --> tty
    }
    close PSFILE;
}
-------------------------- zombrsync.pl end --------------------------
Enjoy!
Rgrds Tev