fixes for bugs in error handling in rsync-2.5.2; and updates for rsync3.txt

Greg A. Woods woods at weird.com
Mon Feb 18 11:41:13 EST 2002


Rsync-2.5.2 does not always report connection and transfer errors
gracefully, nor does it always return a non-zero exit code when they
occur, despite many assurances to the contrary in the code and commit
logs.  It seems a kludge to handle a special case of lost connections to
older servers was FAR too aggressive!  The patch below restricts that
kludge to list-only requests.

With '-vvv' I also print the source (file and line) of the
exit_cleanup() call together with the original and final exit codes, so
any adjustment made within _exit_cleanup() is visible; '-vvvv'
additionally logs entry into _exit_cleanup().
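
As a rough illustration of how the call site can be reported, here is a
minimal standalone sketch.  It assumes exit_cleanup() is a macro wrapper
that passes __FILE__ and __LINE__ on to the three-argument function, as
the _exit_cleanup() signature in the patch suggests.  The names
trace_exit() and TRACE_EXIT() are hypothetical and exist only for this
example, and the local 'verbose' merely stands in for rsync's own
verbosity counter.

```c
#include <stdio.h>

/* Hypothetical stand-in for rsync's verbosity counter (one per -v). */
static int verbose = 3;

/*
 * Format the trace line into buf.  Returns the number of characters
 * written, or 0 when the verbosity level is too low to print anything.
 */
static int trace_exit(char *buf, size_t len, int ocode, int code,
		      const char *file, int line)
{
	if (verbose <= 2)
		return 0;
	return snprintf(buf, len,
			"_exit_cleanup(code=%d, file=%s, line=%d): about to call exit(%d)",
			ocode, file, line, code);
}

/* The macro captures the caller's location so the callee can report it. */
#define TRACE_EXIT(buf, len, ocode, code) \
	trace_exit((buf), (len), (ocode), (code), __FILE__, __LINE__)
```

Keeping the original code (ocode) next to the final one is what makes an
adjusted exit status visible in a single line of output.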

A somewhat related minor style botch in main.c is fixed too.

Also included here are my updated notes on possible improvements for
handling 'moved files'.
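
The core of the renamed-but-unmodified case in those notes might be
sketched roughly as follows.  This is only an illustration under stated
assumptions, not code from rsync: struct target_file, whole_file_hash()
and find_moved() are hypothetical names, the table of target files is
assumed to have been built in advance, and FNV-1a is used purely as a
compact stand-in for the MD5 signature suggested in the notes.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per file already present in the target hierarchy. */
struct target_file {
	const char *path;
	uint64_t    hash;	/* whole-file hash, computed up front */
	size_t      size;
};

/* FNV-1a over the whole file; only a compact stand-in for MD5 here. */
static uint64_t whole_file_hash(const unsigned char *data, size_t len)
{
	uint64_t h = 14695981039346656037ULL;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= data[i];
		h *= 1099511628211ULL;
	}
	return h;
}

/*
 * Before copying a whole new file, look for an existing target file
 * with the same size and hash.  Returns its path, or NULL if none.
 */
static const char *find_moved(const struct target_file *tab, size_t n,
			      uint64_t hash, size_t size)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tab[i].hash == hash && tab[i].size == size)
			return tab[i].path;
	return NULL;
}
```

A hit would then be hard-linked or copied locally on the target instead
of being transferred.  With a weak hash such as this stand-in, a real
implementation would presumably want to confirm the match some other way
before trusting it.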

Index: cleanup.c
===================================================================
RCS file: /cvsroot/rsync/cleanup.c,v
retrieving revision 1.11
diff -c -r1.11 cleanup.c
*** cleanup.c	23 Mar 2001 01:26:04 -0000	1.11
--- cleanup.c	18 Feb 2002 00:39:34 -0000
***************
*** 40,51 ****
--- 40,56 ----
   */
  void _exit_cleanup(int code, const char *file, int line)
  {
+ 	int ocode = code;
  	extern int keep_partial;
  	extern int log_got_error;
  
  	signal(SIGUSR1, SIG_IGN);
  	signal(SIGUSR2, SIG_IGN);
  
+ 	if (verbose > 3)
+ 		rprintf(FINFO,"_exit_cleanup(code=%d, file=%s, line=%d): entered\n", 
+ 			code, file, line);
+ 
  	if (cleanup_child_pid != -1) {
  		int status;
  		if (waitpid(cleanup_child_pid, &status, WNOHANG) == cleanup_child_pid) {
***************
*** 80,85 ****
--- 85,94 ----
  	}
  
  	if (code) log_exit(code, file, line);
+ 
+ 	if (verbose > 2)
+ 		rprintf(FINFO,"_exit_cleanup(code=%d, file=%s, line=%d): about to call exit(%d)\n", 
+ 			ocode, file, line, code);
  
  	exit(code);
  }
Index: clientserver.c
===================================================================
RCS file: /cvsroot/rsync/clientserver.c,v
retrieving revision 1.84
diff -c -r1.84 clientserver.c
*** clientserver.c	9 Feb 2002 03:30:22 -0000	1.84
--- clientserver.c	18 Feb 2002 00:39:34 -0000
***************
*** 43,48 ****
--- 43,49 ----
  	extern int remote_version;
  	extern int am_sender;
  	extern char *shell_cmd;
+ 	extern int list_only;
  	extern int kludge_around_eof;
  	extern char *bind_address;
  	extern int default_af_hint;
***************
*** 113,119 ****
  
  	/* Old servers may just drop the connection here,
  	 rather than sending a proper EXIT command.  Yuck. */
! 	kludge_around_eof = remote_version < 25;
  
  	while (1) {
  		if (!read_line(fd, line, sizeof(line)-1)) {
--- 114,120 ----
  
  	/* Old servers may just drop the connection here,
  	 rather than sending a proper EXIT command.  Yuck. */
! 	kludge_around_eof = list_only && (remote_version < 25);
  
  	while (1) {
  		if (!read_line(fd, line, sizeof(line)-1)) {
Index: main.c
===================================================================
RCS file: /cvsroot/rsync/main.c,v
retrieving revision 1.140
diff -c -r1.140 main.c
*** main.c	6 Feb 2002 21:20:49 -0000	1.140
--- main.c	18 Feb 2002 00:39:34 -0000
***************
*** 880,887 ****
  
  	ret = start_client(argc, argv);
  	if (ret == -1) 
! 	    exit_cleanup(RERR_STARTCLIENT);
  	else
! 	    exit_cleanup(ret);
! 	return ret;
  }
--- 880,889 ----
  
  	ret = start_client(argc, argv);
  	if (ret == -1) 
! 		exit_cleanup(RERR_STARTCLIENT);
  	else
! 		exit_cleanup(ret);
! 
! 	exit(ret);
! 	/* NOTREACHED */
  }
Index: rsync3.txt
===================================================================
RCS file: /cvsroot/rsync/rsync3.txt,v
retrieving revision 1.3
diff -c -r1.3 rsync3.txt
*** rsync3.txt	12 Sep 2001 14:35:39 -0000	1.3
--- rsync3.txt	18 Feb 2002 00:39:34 -0000
***************
*** 192,199 ****
  
  Scripting issues:
  
!   - Perhaps support multiple scripting languages: candidates include
!     Perl, Python, Tcl, Scheme (guile?), sh, ...
  
    - Simply running a subprocess and looking at its stdout/exit code
      might be sufficient, though it could also be pretty slow if it's
--- 192,200 ----
  
  Scripting issues:
  
!   - Perhaps support multiple scripting languages:  candidates include
!     Perl, Python, Tcl, Lisp (librep?), Scheme (siod, guile, elk,
!     minischeme, Kali, STk?), sh, ICI, Lua, Ruby, Pike, Smalltalk...
  
    - Simply running a subprocess and looking at its stdout/exit code
      might be sufficient, though it could also be pretty slow if it's
***************
*** 208,220 ****
  
    - Tcl is broken Lisp.
  
    - Lots of sysadmins know Perl, though Perl can give some bizarre or
      confusing errors.  The built in stat operators and regexps might
      be useful.
  
!   - Sadly probably not enough people know Scheme.
  
!   - sh is hard to embed.
  
  
  Scripting hooks:
--- 209,238 ----
  
    - Tcl is broken Lisp.
  
+   - librep is designed for embedding.
+ 
    - Lots of sysadmins know Perl, though Perl can give some bizarre or
      confusing errors.  The built in stat operators and regexps might
      be useful.
  
!   - Sadly probably not enough people know Scheme, but with the number of
!     scheme-based application scripting languages they're going to have
!     to learn it anyway!
! 
!     - siod is designed for embedding and is very small.
! 
!     - kali is designed for handling distributed executable content.
! 
!     - elk & guile are both designed for embedding.
! 
!   - sh is hard to embed and even a full POSIX shell leaves a lot to be
!     desired as a useful programming language.
  
!   - Ruby is truly object-oriented.
! 
!   - ICI or Pike will keep C programmers happy.
! 
!   - Lua is simple to learn and small and designed for embedding.
  
  
  Scripting hooks:
***************
*** 396,413 ****
      would be useful.
  
  
! Moved files: <http://rsync.samba.org/cgi-bin/rsync.fom?file=44>
! 
!   - There's no trivial way to detect renamed files, especially if they
!     move between directories.
! 
!   - If we had a picture of the remote directory from last time on
!     either machine, then the inode numbers might give us a hint about
!     files which may have been renamed.
  
    - Files that are renamed and not modified can be detected by
!     examining the directory listing, looking for files with the same
!     size/date as the origin.
  
  
  Filesystem migration:
--- 414,477 ----
      would be useful.
  
  
! Moved files:
  
    - Files that are renamed and not modified can be detected by
!     pre-calculating whole-file hash (MD5?) signatures for all files in
!     the target hierarchy (source files need only have their whole-file
!     hash calculated just before they would be transferred).
! 
!     - whenever you're about to copy a whole file to the target hierarchy
!       (there's no matching filename in the target directory) first
!       search for a matching file already in the target hierarchy and if
!       one is found:
! 
!       - if the matching file is missing in the source directory then
!         first try to create the new target file with a hard link
!         (presumably the source file will be deleted, if deletions in the
!         target hierarchy are permitted by the command-line/config options)
! 
!       - if the source file and target directory are on different
!         machines then simply make the copy locally within the target
!         hierarchy on the target machine
! 
!       - if the source file and target directory are on the same machine
!         then make the copy from whichever file is on a different
!         filesystem (st_dev) from the target directory [it is possible
!         the target hierarchy spans two filesystems and thus the existing
!         copy in the target might be in a different filesystem from the
!         target directory]
! 
!     - whenever updating a target file with the normal rsync algorithm
!       first search for duplicates of the current target's whole-file
!       hash value and then update all identical targets simultaneously
!       with the same data blocks from the source file.  Remember the
!       source file's whole-file hash value so that when each of the
!       updated targets is encountered in the source hierarchy the
!       matching source file can be checked to be sure it too is still
!       identical to the initially encountered source file that the update
!       was done from.  [if the source file in the matching location for
!       an already updated duplicate turns out to be different from the
!       source file used to update the duplicate then perhaps it would be
!       good, at least when on different machines, to have a saved copy of
!       the un-touched target so that the previous updates to it can be
!       quickly undone, but this complicates cleanup quite a bit]
!       
!     - all deleted files are handled normally.
! 
!     - all file meta-data are handled normally.
! 
!   - There's no trivial way to detect renamed and modified files, though
!     by also pre-calculating the hash signatures for each block of each
!     file in the target hierarchy then fuzzy matching heuristics (eg. if
!     more than some percentage of blocks are identical) could identify
!     new files which have many blocks in common and thus which could
!     first be copied locally on the target and then updated with the
!     normal rsync algorithm.  Keeping all this data for very large
!     hierarchies might still be too expensive though so perhaps it should
!     only be done if some noticeable percentage of large files (savings
!     are only possible if the files are multiple blocks in length) in the
!     target hierarchy are apparently missing and would need copying.
  
  
  Filesystem migration:
***************
*** 466,469 ****
    - http://freshmeat.net/search/?site=Freshmeat&q=mirror&section=projects
  
    - BitTorrent -- p2p mirroring
!     http://bitconjurer.org/BitTorrent/ 
\ No newline at end of file
--- 530,533 ----
    - http://freshmeat.net/search/?site=Freshmeat&q=mirror&section=projects
  
    - BitTorrent -- p2p mirroring
!     http://bitconjurer.org/BitTorrent/ 

-- 
								Greg A. Woods

+1 416 218-0098;  <gwoods at acm.org>;  <g.a.woods at ieee.org>;  <woods at robohack.ca>
Planix, Inc. <woods at planix.com>; VE3TCP; Secrets of the Weird <woods at weird.com>



