SIGUSR1 or SIGINT error

tim.conway at philips.com tim.conway at philips.com
Sat Feb 9 06:19:59 EST 2002


Well, it ran to completion this way, in about 7.5h, but I'm not certain I
believe it.  While I left --delete --force off (I have been horribly
burned testing those on really big chunks before), I would expect the
destination to then end up with at least as much as the source.  big
and big1 are subdirectories of the same volume, and they are its only
contents aside from a very small directory holding between 10 and 20 KB of
scripts, so I don't see how the destination could end up 6-1/2 GB short.
When my current operations complete, I'll try one with all the options
turned on, and run the filesystem map generator from my project to see
what differences it left.
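
As a rough cross-check of the figures quoted in this message, the arithmetic
can be redone from the numbers in the --stats output and the df -k listing.
This is only a sketch: the variable names are made up for illustration, and
it assumes the reported rate and speedup are both computed over bytes
written plus bytes read.

```python
# Figures copied from the rsync --stats output and the df -k listing
# in this message. Variable names are illustrative only.
bytes_read = 68790044        # "Total bytes read"
bytes_written = 16           # "Total bytes written"
rate = 2531.79               # bytes/sec reported on the summary line
total_size = 114067347318    # "total size is"

used_src_kb = 121588653      # df -k "used" for lon-tools1:big
used_dst_kb = 115027617      # df -k "used" for lon-tools2:big

# Assumption: rate and speedup are both based on written + read bytes.
traffic = bytes_read + bytes_written
elapsed_hours = traffic / rate / 3600
speedup = total_size / traffic
shortfall_kb = used_src_kb - used_dst_kb

print(f"elapsed  : {elapsed_hours:.2f} h")
print(f"speedup  : {speedup:.2f}")
print(f"shortfall: {shortfall_kb} KB")
```

Under that assumption the elapsed time comes out close to the 7.5h mentioned
above, and the difference in used space is 6561036 KB, which is where the
6-1/2 GB figure comes from.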


I have an idea of a mod to make the hard links check more efficient, but I
don't understand C well enough.  What I was thinking of was to keep the
st_nlink part of the stat, and if it's not a directory and nlink > 1, save
the path and inode in a separate list, and leave them out of the main
flist.  That way, there's no processing of the items for which there's no
possibility of a need to track hard links.  Then fix only one copy of each
linked file, delete all the others, and link them back to it.

I'm guessing that's a complete redo of the protocol, though.
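
To make the idea concrete, here is a minimal sketch in Python rather than C.
It is not rsync code and says nothing about the wire protocol; the function
name split_hardlink_candidates is hypothetical. It only shows the
bookkeeping: non-directories with st_nlink > 1 go into a separate table
keyed by inode, and everything else stays in the plain list. Keying on
(st_dev, st_ino) instead of the inode alone is an addition here, since inode
numbers are only unique within one filesystem.

```python
import os
import stat
from collections import defaultdict


def split_hardlink_candidates(root):
    """Walk root and return (plain_paths, link_groups).

    link_groups maps (st_dev, st_ino) to the list of paths of
    non-directories whose st_nlink is greater than 1. Every other
    entry goes into plain_paths and needs no hard-link tracking.
    """
    plain_paths = []
    link_groups = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if not stat.S_ISDIR(st.st_mode) and st.st_nlink > 1:
                link_groups[(st.st_dev, st.st_ino)].append(path)
            else:
                plain_paths.append(path)
    return plain_paths, dict(link_groups)
```

For each group, one path would be synced as a normal file and the remaining
paths recreated as links to it on the destination. Note that a file can have
nlink > 1 while only one of its names lies under root, so a group may end up
with a single path.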

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Tools at lonnetsvr
/users/Tools>cat doit
#!/bin/sh

/cadappl/encap/packages/rsync-cvs/bin/rsync \
  --rsync-path=/cadappl/encap/packages/rsync-cvs/bin/rsync -WHav --stats \
  --progress alta:/wan/lon-tools1/lon-tools1/big* /wan/lon-tools2/lon-tools2 \
  >doit.log 2>&1 </dev/null &

Tools at lonnetsvr
/users/Tools>grep ' ' doit.log 
receiving file list ... done
big/tools/Tools/.microsoft/Favorites/Channels/Arcadia Bay Demo Channel/
big/tools/Tools/.microsoft/Favorites/Channels/The Microsoft Channel/
big1/cadappl1/hpux/iclibs/CMOS18/EXTERNALS/PcCMOS18sfliolib_nlm_ex/2.1/tools/adf/vital/sfliolib_nlm -> ../../vha/sfliolib_nlm
big1/cadappl1/hpux/iclibs/CMOS18/EXTERNALS/PcCMOS18shliolib_nlm_ex/2.1/tools/adf/vital/shliolib_nlm -> ../../vha/shliolib_nlm
big1/cadappl1/hpux/iclibs/CMOS18/PcCMOS18flviolib_spm/2.1.1/lib/flviolib_spm.src -> ../tools/vital/timing/flviolib_spm.src
big1/cadappl1/hpux/iclibs/CMOS18/PcCMOS18flviolib_spm/2.1.1/tools/adf/vital/flviolib_spm -> ../../vha/flviolib_spm
big1/cadappl1/hpux/latest -> /cadappl/perl/5.6.1
Number of files: 2727469
Number of files transferred: 0
Total file size: 114067347318 bytes
Total transferred file size: 0 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 68790028
Total bytes written: 16
Total bytes read: 68790044
wrote 16 bytes  read 68790044 bytes  2531.79 bytes/sec
total size is 114067347318  speedup is 1658.20
Tools at lonnetsvr
/users/Tools>df -k /wan/lon-tools*/big/tools
Filesystem            kbytes    used   avail capacity  Mounted on
lon-tools1:big       150147795 121588653 28559142    81%    /wan/lon-tools1/big
lon-tools2:big       150147795 115027617 35120178    77%    /wan/lon-tools2/big
Tools at lonnetsvr
/users/Tools>
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Tim Conway
tim.conway at philips.com
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack(nnnnnnnnnnnn, 
19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), 
".\n" '
"There are some who call me.... Tim?"




Dave Dykstra <dwd at bell-labs.com>
Sent by: rsync-admin at lists.samba.org
02/07/2002 10:28 AM

 
        To:     David Birnbaum <davidb at chelsea.net>
        cc:     Tim Conway/LMT/SC/PHILIPS at AMEC
Eric Whiting <ewhiting at amis.com>
rsync at lists.samba.org
        Subject:        Re: SIGUSR1 or SIGINT error
        Classification: 




The fix that went into 2.5.0 was for timeouts that were happening even when
--timeout=0 (the default).  Can any of you say for sure that it makes a
difference with a new version when you go from --timeout=0 to a very large
timeout?  I want to see if Tim's experience with timeouts defaulting to 60
seconds is still happening, or if that was only something earlier.  Of
course, it's also entirely possible that the "SIGUSR1 or SIGINT error"
message is being caused by a different problem.
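
For readers following the thread, the behaviour under discussion can be
sketched as follows. This is a Python illustration and not the code in
rsync's io.c; the function name wait_readable is hypothetical. The only
details taken from the thread are the 60-second select interval and the
meaning of a timeout of 0. The point is that select() returning with nothing
to read should count as an error only when a nonzero I/O timeout was
requested and has actually elapsed.

```python
import select
import time

SELECT_TIMEOUT = 60  # seconds between wake-ups, as mentioned in the thread


def wait_readable(fd, io_timeout=0, select_timeout=SELECT_TIMEOUT):
    """Block until fd is readable.

    io_timeout == 0 means "never time out": select() waking up with
    nothing to read is then just a reason to loop again. A nonzero
    io_timeout raises TimeoutError once that many seconds have passed
    without the descriptor becoming readable.
    """
    started = time.monotonic()
    while True:
        readable, _, _ = select.select([fd], [], [], select_timeout)
        if readable:
            return
        # select() timed out. This is fatal only if the caller asked for
        # an I/O timeout and that much time has really gone by.
        if io_timeout and time.monotonic() - started >= io_timeout:
            raise TimeoutError(f"no I/O for {io_timeout} seconds")
```

The failure described in the thread corresponds to treating the select
wake-up itself as the timeout, which would abort after 60 seconds even with
--timeout=0. The sketch assumes a POSIX system, where select() works on pipes
as well as sockets.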

- Dave Dykstra

On Thu, Feb 07, 2002 at 10:22:23AM -0500, David Birnbaum wrote:
> I'm running 2.5.2.  However, we had the same type of problem with 2.4.6,
> which is what we were running before.  If I had to guess, I would say
> that we're seeing this error a little more often in 2.5.2.
> 
> David.
> 
> -----
> 
> On Thu, 7 Feb 2002 tim.conway at philips.com wrote:
> 
> > Currently 2.5.1pre3.  I haven't tested that problem lately, though.  I'll
> > get the newest up and try a full sync.  It's worth a try.  I'll feel
> > really stupid, though, if i've put all this work into newsync (perl
> > driving find|diff|tar|lzop) and it's fixed in rsync.  I think our case
> > will always create problems, though, with the broken nfs unlink in the
> > nfs3 interface on the NAS, and the broken nfs2 client on the solaris
> > machines (mtime bug).  I won't let this influence my test, though ;-).
> >
> > Tim Conway
> > tim.conway at philips.com
> > 303.682.4917
> > Philips Semiconductor - Longmont TC
> > 1880 Industrial Circle, Suite D
> > Longmont, CO 80501
> > Available via SameTime Connect within Philips, n9hmg on AIM
> > perl -e 'print pack(nnnnnnnnnnnn,
> > 19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970),
> > ".\n" '
> > "There are some who call me.... Tim?"
> >
> >
> >
> >
> > Dave Dykstra <dwd at bell-labs.com>
> > Sent by: rsync-admin at lists.samba.org
> > 02/06/2002 03:41 PM
> >
> >
> >         To:     Eric Whiting <ewhiting at amis.com>
> >         cc:     Tim Conway/LMT/SC/PHILIPS at AMEC
> > David Birnbaum <davidb at chelsea.net>
> > rsync at lists.samba.org
> >         Subject:        Re: SIGUSR1 or SIGINT error
> >         Classification:
> >
> >
> >
> > Looks like a fix for that went into 2.5.0.  See revision 1.87 at
> >     http://cvs.samba.org/cgi-bin/cvsweb/rsync/io.c
> >
> > Tim & David, what version are you running?
> >
> > 2.5.2 has some serious problems, Eric.  Try the latest development
> > snapshot at
> >     rsync://rsync.samba.org/ftp/unpacked/rsync/
> > or
> >     ftp://rsync.samba.org/pub/unpacked/rsync/
> >
> > - Dave Dykstra
> >
> >
> > On Wed, Feb 06, 2002 at 11:33:43AM -0700, Eric Whiting wrote:
> > > Make that 2 of us who need to specify a large timeout.
> > >
> > > I have found that I have to set the timeout to a large value (10000) to
> > > get the rsyncs to run successfully. Leaving it at the default seemed to
> > > cause timeout/hang problems.  Of course I'm still running a 2.4.6dev
> > > version. I had troubles with 2.5.[01]. (solaris/linux mix of rsync
> > > clients/servers)
> > >
> > > I need to try 2.5.2 as soon as I get a chance. Looks like some good
> > > fixes are happening in 2.5.2.
> > >
> > > eric
> > >
> > >
> > >
> > > On Wed, 2002-02-06 at 10:39, tim.conway at philips.com wrote:
> > > > When I was getting these, I traced the process and its children
> > > > (solaris: truss -f).  I found that one of the spawned threads was
> > > > experiencing an io timeout while the filelist was building.  I had
> > > > set no timeout, but it did it at 60 seconds every time.  I found that
> > > > this corresponded to a SELECT_TIMEOUT parameter, which was set to 60
> > > > if IO_TIMEOUT was 0.  By setting my timeout to 86400 (1 day), I
> > > > stopped those.  Of course, then, it choked farther along, but that's
> > > > another story.
> > > > Try setting a timeout, even if you don't want one.  Make it the
> > > > longest the process should ever take.
> > > >
> > > > Tim Conway
> > > > tim.conway at philips.com
> > > > 303.682.4917
> > > > Philips Semiconductor - Longmont TC
> > > > 1880 Industrial Circle, Suite D
> > > > Longmont, CO 80501
> > > > Available via SameTime Connect within Philips, n9hmg on AIM
> > > > perl -e 'print pack(nnnnnnnnnnnn,
> > > > 19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970),
> > > > ".\n" '
> > > > "There are some who call me.... Tim?"
> > > >
> > > >
> > > >
> > > >
> > > > Dave Dykstra <dwd at bell-labs.com>
> > > > Sent by: rsync-admin at lists.samba.org
> > > > 02/06/2002 10:16 AM
> > > >
> > > >
> > > >         To:     David Birnbaum <davidb at chelsea.net>
> > > >         cc:     rsync at lists.samba.org
> > > > (bcc: Tim Conway/LMT/SC/PHILIPS)
> > > >         Subject:        Re: SIGUSR1 or SIGINT error
> > > >         Classification:
> > > >
> > > >
> > > >
> > > > On Tue, Feb 05, 2002 at 11:28:54AM -0500, David Birnbaum wrote:
> > > > > I suspected that might be the case...now...how to determine the
> > > > > "real" problem?  Does rsync log it somewhere?  lsof shows that
> > > > > STDERR/STDOUT are going to /dev/null, so I hope it's not writing
> > > > > it there. Nothing informative in syslog, just the message about
> > > > > the SIG:
> > > > >
> > > > >   Feb  5 09:49:41 hite rsyncd[9279]: [ID 702911 daemon.warning] rsync error: received SIGUSR1 or SIGINT (code 20) at rsync.c(229)
> > > > >
> > > > > Any clues?
> > > >
> > > >
> > > > I'm sorry, but I don't have any more suggestions.
> > > >
> > > > - Dave Dykstra
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> >
> >







