[PATCH] make write_batch local

Chris Shoemaker c.shoemaker at cox.net
Wed Jun 16 23:09:46 GMT 2004


Wayne,
	It's taken a little while for me to get more familiar with the
	code, but I think I've reached a good breakpoint in improving
	batch-mode.  Let me highlight some of the changes in the
	attached patch:

	* --write-batch and --read-batch arguments are no longer passed
	 	from client to server.  This fixes the current problem
		that causes the server threads to die when the client
		used --write-batch and the server attempts to create the
		batch file on the server.  This means that new clients
		will actually be able to use batch-mode, even with old
		servers.

	* --write-batch and --read-batch arguments are ignored when
		paired with --server.  This means that new servers will
		begin to work correctly with older clients that still
		send batch arguments to the server.

	* The biggest change is that there are now write-batch code
		paths for both sender and receiver.  (It was only for
		sender before.)  This means that a client can now write
		a local batch file, whether that client is a sender or a
		receiver.  The intent is that both sides would produce
		identical batch files.  I tested this four ways.  1)
		writing a batch file during a local-local sync with an
		old (2.5.7) version, which wrote from the sender side.
		2) writing a batch file during a local-local sync with
		my patched version, which (arbitrarily) writes from the
		receiver side.  3) writing a batch file during a
		remote-to-local sync, with my patched version, which,
		obviously, writes from the receiver.  4) writing a batch
		file during a local-to-remote sync, with my patched
		version, which, obviously, writes from the sender.  All
		four methods produce identical batch files.

	* delta mode (aka --no-whole-file) is now independent of
		batch-mode.  There was some subtle forcing of delta mode
		happening in disable_delta_p(), which I never could
		understand.  Now, delta mode and batch mode are
		compatible in any combination, and if you want delta
		mode to differ from whatever the default is, you must
		explicitly set it so.

	* minor stuff like a few comments, factor one line "argc--"
	 	outside of if-then-else, two spelling fixes.


I hope you have the time to review this patch and comment.  Regarding
next steps, I've come to believe that the entire batch mode code is too
invasive into too many parts of the rest of the code.  If you're willing
to break backward compatibility of existing batch files, I think the
batch mode code can be significantly simplified and made more
maintainable and flexible.  Basically, I think batch mode should just 
record whatever hits the socket.  I don't see much benefit to splitting
up the data into 3 files, essentially creating a batch-mode specific
protocol on the disk, with special reading and writing functions.
Simplify, simplify.

What do you think?

 -chris

-------------- next part --------------
? patches/make-write-batch-local.diff
? patches/patch_fix_read_batch_bug
Index: batch.c
===================================================================
RCS file: /cvsroot/rsync/batch.c,v
retrieving revision 1.32
diff -c -b -d -r1.32 batch.c
*** a/batch.c	15 May 2004 19:31:10 -0000	1.32
--- b/batch.c	17 Jun 2004 04:01:54 -0000
***************
*** 172,177 ****
--- 172,178 ----
  
  /**
   * Write csum info to batch file
+  * If flist_entry < 0, just open the file
   *
   * @todo This will break if s->count is ever larger than maxint.  The
   * batch code should probably be changed to consistently use the
***************
*** 198,203 ****
--- 199,206 ----
  		}
  	}
  
+ 	if (*flist_entry < 0)
+ 		return;
  	write_batch_csums_file(flist_entry, sizeof (int));
  	int_count = s ? (int) s->count : 0;
  	write_batch_csums_file(&int_count, sizeof int_count);
***************
*** 285,290 ****
--- 288,296 ----
  		}
  	}
  
+ 	if (buff == NULL)
+ 		return;
+ 
  	if (write(f_delta, buff, bytes_to_write) < 0) {
  		rsyserr(FERROR, errno, "Batch file %s write error", filename);
  		close(f_delta);
Index: compat.c
===================================================================
RCS file: /cvsroot/rsync/compat.c,v
retrieving revision 1.22
diff -c -b -d -r1.22 compat.c
Index: flist.c
===================================================================
RCS file: /cvsroot/rsync/flist.c,v
retrieving revision 1.230
diff -c -b -d -r1.230 flist.c
*** a/flist.c	11 Jun 2004 07:40:57 -0000	1.230
--- b/flist.c	17 Jun 2004 04:01:55 -0000
***************
*** 950,956 ****
  
  	flist_expand(flist);
  
! 	if (write_batch)
  		file->flags |= FLAG_TOP_DIR;
  
  	if (file->basename[0]) {
--- 950,956 ----
  
  	flist_expand(flist);
  
! 	if (write_batch)  /* uh, why?  TODO: remove and test */
  		file->flags |= FLAG_TOP_DIR;
  
  	if (file->basename[0]) {
***************
*** 1039,1044 ****
--- 1039,1046 ----
  
  
  /**
+  * Despite its name, send_file_list() is called by the receiver.
+  *
   * The delete_files() function in receiver.c sets f to -1 so that we just
   * construct the file list in memory without sending it over the wire.  It
   * also has the side-effect of ignoring user-excludes if delete_excluded
***************
*** 1306,1311 ****
--- 1308,1325 ----
  			else
  				io_error |= read_int(f);
  		}
+ 		if (write_batch) {
+ 		  /* TODO: remove this FLAG_TOP_DIR stuff and test
+ 		   * CAS: I'm not sure what the FLAG_TOP_DIR does,
+ 		   * but we set it here, just so it matches what 
+ 		   * happens for write_batch on the sender paths.
+ 		   */
+ 		  int i;
+ 		  for (i = 0; i < flist->count; i++) 
+ 		    flist->files[i]->flags |= FLAG_TOP_DIR;
+ 		  
+ 		  write_batch_flist_info(flist->count, flist->files);
+ 		}
  	}
  
  	if (verbose > 3)
Index: generator.c
===================================================================
RCS file: /cvsroot/rsync/generator.c,v
retrieving revision 1.87
diff -c -b -d -r1.87 generator.c
*** a/generator.c	11 Jun 2004 07:40:51 -0000	1.87
--- b/generator.c	17 Jun 2004 04:01:55 -0000
***************
*** 48,53 ****
--- 48,54 ----
  extern int local_server;
  extern int read_batch;
  extern int write_batch;
+ extern int read_batch;
  extern int list_only;
  extern int only_existing;
  extern int orig_umask;
***************
*** 124,132 ****
  
  /* 
   * set (initialize) the size entries in the per-file sum_struct
!  * calulating dynamic block ans checksum sizes.
   *
!  * This is only called from generate_and_send_sums() but is a seperate
   * function to encapsulate the logic.
   *
   * The block size is a rounded square root of file length.
--- 125,133 ----
  
  /* 
   * set (initialize) the size entries in the per-file sum_struct
!  * calculating dynamic block and checksum sizes.
   *
!  * This is only called from generate_and_send_sums() but is a separate
   * function to encapsulate the logic.
   *
   * The block size is a rounded square root of file length.
***************
*** 210,223 ****
   *
   * When do we do this?  If the user's explicitly said they
   * want the whole thing, or if { they haven't explicitly
!  * requested a delta, and it's local but not batch mode.}
   *
   * Whew. */
  static BOOL disable_deltas_p(void)
  {
  	if (whole_file > 0)
  		return True;
! 	if (whole_file == 0 || write_batch || read_batch)
  		return False;
  	return local_server;
  }
--- 211,224 ----
   *
   * When do we do this?  If the user's explicitly said they
   * want the whole thing, or if { they haven't explicitly
!  * requested a delta, and it's local. }
   *
   * Whew. */
  static BOOL disable_deltas_p(void)
  {
  	if (whole_file > 0)
  		return True;
! 	if (whole_file == 0)
  		return False;
  	return local_server;
  }
***************
*** 227,242 ****
   * Generate and send a stream of signatures/checksums that describe a buffer
   *
   * Generate approximately one checksum every block_len bytes.
   */
! static void generate_and_send_sums(struct map_struct *buf, size_t len, int f_out)
  {
! 	size_t i;
  	struct sum_struct sum;
  	OFF_T offset = 0;
  
  	sum_sizes_sqroot(&sum, len);
  
  	write_sum_head(f_out, &sum);
  
  	for (i = 0; i < sum.count; i++) {
  		unsigned int n1 = MIN(len, sum.blength);
--- 228,253 ----
   * Generate and send a stream of signatures/checksums that describe a buffer
   *
   * Generate approximately one checksum every block_len bytes.
+  * 
+  * The flist_idx arg is only for the purpose of being able to call
+  * write_batch_csum_info().  Another (better?) way would be to instead
+  * return a pointer to a sum_struct, and then call write_batch_csum_info()
+  * outside.
   */
! static void generate_and_send_sums(struct map_struct *buf, size_t len, int f_out,
! 				   int flist_idx)
  {
! 	size_t i, j;
  	struct sum_struct sum;
  	OFF_T offset = 0;
  
  	sum_sizes_sqroot(&sum, len);
  
  	write_sum_head(f_out, &sum);
+ 	if (write_batch) { /* save all the sums to write out */
+ 		sum.sums = new_array(struct sum_buf, sum.count);
+ 		if (!sum.sums) out_of_memory("generate_and_send_sums");
+ 	}
  
  	for (i = 0; i < sum.count; i++) {
  		unsigned int n1 = MIN(len, sum.blength);
***************
*** 256,265 ****
--- 267,295 ----
  		write_buf(f_out, sum2, sum.s2length);
  		len -= n1;
  		offset += n1;
+ 		
+ 		if (write_batch) {  /* we're receiver */
+ 			/* we only have to save the sums
+ 			 * if we want to write them out */
+ 			sum.sums[i].sum1 = sum1;
+ 			for (j = 0; j < SUM_LENGTH ; j++)
+ 				sum.sums[i].sum2[j] = sum2[j];
+ 		}
+ 	}
+ 	if (write_batch) {
+ 		write_batch_csum_info(&flist_idx, &sum);
+ 		free(sum.sums);
  	}
  }
  
  
+ void write_null_sum(int f_out, int i) {
+ 	write_int(f_out, i);
+ 	if (!dry_run) write_sum_head(f_out, NULL);
+ 	if (write_batch) {
+ 		write_batch_csum_info(&i, NULL);
+ 	}
+ }
  
  /**
   * Acts on file number @p i from @p flist, whose name is @p fname.
***************
*** 453,460 ****
  		if (preserve_hard_links && hard_link_check(file, HL_SKIP))
  			return;
  		if (errno == ENOENT) {
! 			write_int(f_out,i);
! 			if (!dry_run) write_sum_head(f_out, NULL);
  		} else if (verbose > 1) {
  			rsyserr(FERROR, errno,
  				"recv_generator: failed to open %s",
--- 483,489 ----
  		if (preserve_hard_links && hard_link_check(file, HL_SKIP))
  			return;
  		if (errno == ENOENT) {
! 			write_null_sum(f_out, i);
  		} else if (verbose > 1) {
  			rsyserr(FERROR, errno,
  				"recv_generator: failed to open %s",
***************
*** 471,478 ****
  		/* now pretend the file didn't exist */
  		if (preserve_hard_links && hard_link_check(file, HL_SKIP))
  			return;
! 		write_int(f_out,i);
! 		if (!dry_run) write_sum_head(f_out, NULL);
  		return;
  	}
  
--- 500,506 ----
  		/* now pretend the file didn't exist */
  		if (preserve_hard_links && hard_link_check(file, HL_SKIP))
  			return;
! 		write_null_sum(f_out, i);
  		return;
  	}
  
***************
*** 500,507 ****
  	}
  
  	if (disable_deltas_p()) {
! 		write_int(f_out,i);
! 		write_sum_head(f_out, NULL);
  		return;
  	}
  
--- 528,534 ----
  	}
  
  	if (disable_deltas_p()) {
! 		write_null_sum(f_out, i);
  		return;
  	}
  
***************
*** 514,521 ****
  		/* pretend the file didn't exist */
  		if (preserve_hard_links && hard_link_check(file, HL_SKIP))
  			return;
! 		write_int(f_out,i);
! 		write_sum_head(f_out, NULL);
  		return;
  	}
  
--- 541,547 ----
  		/* pretend the file didn't exist */
  		if (preserve_hard_links && hard_link_check(file, HL_SKIP))
  			return;
! 		write_null_sum(f_out, i);
  		return;
  	}
  
***************
*** 533,539 ****
  		rprintf(FINFO, "generating and sending sums for %d\n", i);
  
  	write_int(f_out,i);
! 	generate_and_send_sums(mapbuf, st.st_size, f_out);
  
  	close(fd);
  	if (mapbuf) unmap_file(mapbuf);
--- 559,565 ----
  		rprintf(FINFO, "generating and sending sums for %d\n", i);
  
  	write_int(f_out,i);
! 	generate_and_send_sums(mapbuf, st.st_size, f_out, i);
  
  	close(fd);
  	if (mapbuf) unmap_file(mapbuf);
Index: main.c
===================================================================
RCS file: /cvsroot/rsync/main.c,v
retrieving revision 1.197
diff -c -b -d -r1.197 main.c
*** a/main.c	11 Jun 2004 07:40:54 -0000	1.197
--- b/main.c	17 Jun 2004 04:01:56 -0000
***************
*** 735,743 ****
  		return start_socket_client(host, path, argc-1, argv+1);
  	}
  
! 	if (!read_batch) {
  		p = find_colon(argv[0]);
! 		if (p) {
  			if (remote_filesfrom_file
  			 && remote_filesfrom_file != files_from + 1
  			 && strncmp(files_from, argv[0], p-argv[0]+1) != 0) {
--- 735,743 ----
  		return start_socket_client(host, path, argc-1, argv+1);
  	}
  
! 	if (!read_batch) { /* for read_batch, NO source is specified */
  		p = find_colon(argv[0]);
! 		if (p) { /* source is remote */
  			if (remote_filesfrom_file
  			 && remote_filesfrom_file != files_from + 1
  			 && strncmp(files_from, argv[0], p-argv[0]+1) != 0) {
***************
*** 755,761 ****
  				daemon_over_rsh = 1;
  			}
  
! 			if (argc < 1) {
  				usage(FERROR);
  				exit_cleanup(RERR_SYNTAX);
  			}
--- 755,761 ----
  				daemon_over_rsh = 1;
  			}
  			
! 			if (argc < 1) { /* destination required */
  				usage(FERROR);
  				exit_cleanup(RERR_SYNTAX);
  			}
***************
*** 764,772 ****
  			*p = 0;
  			shell_machine = argv[0];
  			shell_path = p+1;
- 			argc--;
  			argv++;
! 		} else {
  			am_sender = 1;
  
  			/* rsync:// destination uses rsync server over direct socket */
--- 764,771 ----
  			*p = 0;
  			shell_machine = argv[0];
  			shell_path = p+1;
  			argv++;
! 		} else { /* source is local */
  			am_sender = 1;
  
  			/* rsync:// destination uses rsync server over direct socket */
***************
*** 789,795 ****
  				return start_socket_client(host, path, argc-1, argv);
  			}
  
! 			p = find_colon(argv[argc-1]);
  			if (p && remote_filesfrom_file
  			 && remote_filesfrom_file != files_from + 1
  			 && strncmp(files_from, argv[argc-1], p-argv[argc-1]+1) != 0) {
--- 788,794 ----
  				return start_socket_client(host, path, argc-1, argv);
  			}
  
! 			p = find_colon(argv[argc-1]); /* look in dest arg */
  			if (p && remote_filesfrom_file
  			 && remote_filesfrom_file != files_from + 1
  			 && strncmp(files_from, argv[argc-1], p-argv[argc-1]+1) != 0) {
***************
*** 797,803 ****
  					"--files-from hostname is not transfer hostname\n");
  				exit_cleanup(RERR_SYNTAX);
  			}
! 			if (!p) {
  				local_server = 1;
  				if (remote_filesfrom_file) {
  					rprintf(FERROR,
--- 796,802 ----
  					"--files-from hostname is not transfer hostname\n");
  				exit_cleanup(RERR_SYNTAX);
  			}
! 			if (!p) { /* no colon found, so src and dest both local */
  				local_server = 1;
  				if (remote_filesfrom_file) {
  					rprintf(FERROR,
***************
*** 827,835 ****
  				shell_machine = argv[argc-1];
  				shell_path = p+1;
  			}
- 			argc--;
  		}
! 	} else {
  		am_sender = 1;
  		local_server = 1;
  		shell_path = argv[argc-1];
--- 826,834 ----
  				shell_machine = argv[argc-1];
  				shell_path = p+1;
  			}
  		}
! 		argc--;
! 	} else {  /* read_batch */
  		am_sender = 1;
  		local_server = 1;
  		shell_path = argv[argc-1];
***************
*** 852,863 ****
  			shell_path?shell_path:"");
  	}
  
  	if (!am_sender && argc > 1) {
  		usage(FERROR);
  		exit_cleanup(RERR_SYNTAX);
  	}
  
! 	if (argc == 0 && !am_sender) {
  		list_only = 1;
  	}
  
--- 851,864 ----
  			shell_path?shell_path:"");
  	}
  
+ 	/* for remote source, only single dest arg can remain ... */
  	if (!am_sender && argc > 1) { 
  		usage(FERROR);
  		exit_cleanup(RERR_SYNTAX);
  	}
  
! 	/* ... or no dest at all */
! 	if (!am_sender && argc == 0) {
  		list_only = 1;
  	}
  
***************
*** 1028,1033 ****
--- 1029,1038 ----
  
  	if (write_batch && !am_server) {
  		write_batch_argvs_file(orig_argc, orig_argv);
+ 		/* initialize static fds */
+ 		ret = -1;
+ 		write_batch_csum_info(&ret, NULL);
+ 		write_batch_delta_file(NULL, 0);
  	}
  
  	if (am_daemon && !am_server)
Index: options.c
===================================================================
RCS file: /cvsroot/rsync/options.c,v
retrieving revision 1.156
diff -c -b -d -r1.156 options.c
*** a/options.c	7 Jun 2004 22:05:22 -0000	1.156
--- b/options.c	17 Jun 2004 04:01:57 -0000
***************
*** 638,643 ****
--- 638,653 ----
  	}
  #endif
  
+ 	if ((write_batch || read_batch) && am_server) {
+ 		rprintf(FERROR,
+ 			"batch-mode is incompatible with server mode\n");
+ 		write_batch = 0;
+ 		read_batch = 0;
+ 		/* We don't actually exit_cleanup(), so that we can still service
+ 		 * older version clients that still send batch args to server */
+ 		
+ 	}
+ 
  	if (write_batch && read_batch) {
  		rprintf(FERROR,
  			"write-batch and read-batch can not be used together\n");
***************
*** 884,895 ****
  		args[ac++] = arg;
  	}
  
! 	if (batch_prefix) {
! 		char *r_or_w = write_batch ? "write" : "read";
! 		if (asprintf(&arg, "--%s-batch=%s", r_or_w, batch_prefix) < 0)
! 			goto oom;
! 		args[ac++] = arg;
! 	}
  
  	if (io_timeout) {
  		if (asprintf(&arg, "--timeout=%d", io_timeout) < 0)
--- 894,900 ----
  		args[ac++] = arg;
  	}
  
!         /* batch operations are local only, so don't pass batch args */
  
  	if (io_timeout) {
  		if (asprintf(&arg, "--timeout=%d", io_timeout) < 0)
Index: pipe.c
===================================================================
RCS file: /cvsroot/rsync/pipe.c,v
retrieving revision 1.7
diff -c -b -d -r1.7 pipe.c
*** a/pipe.c	15 May 2004 19:31:10 -0000	1.7
--- b/pipe.c	17 Jun 2004 04:01:57 -0000
***************
*** 26,31 ****
--- 26,32 ----
  extern int blocking_io;
  extern int orig_umask;
  extern int read_batch;
+ extern int write_batch;
  extern int filesfrom_fd;
  
  /**
***************
*** 119,124 ****
--- 120,132 ----
  		am_sender = read_batch ? 0 : !am_sender;
  		am_server = 1;
  
+ 		/* There is write_batch code on both the receiver and
+ 		 * sender sides.  In local_child, both are local processes,
+ 		 * so we must make sure that only one actually writes.  It 
+ 		 * shouldn't matter which one -- here we prevent sender from
+ 		 * writing. */
+ 		write_batch = 0; 
+ 		
  		if (!am_sender) 
  			filesfrom_fd = -1;
  
Index: receiver.c
===================================================================
RCS file: /cvsroot/rsync/receiver.c,v
retrieving revision 1.82
diff -c -b -d -r1.82 receiver.c
*** a/receiver.c	14 Jun 2004 15:09:36 -0000	1.82
--- b/receiver.c	17 Jun 2004 04:01:58 -0000
***************
*** 46,51 ****
--- 46,52 ----
  extern int module_id;
  extern int ignore_errors;
  extern int orig_umask;
+ extern int write_batch;
  extern int keep_partial;
  extern int checksum_seed;
  
***************
*** 279,284 ****
--- 280,287 ----
  	sum_end(file_sum1);
  
  	read_buf(f_in,file_sum2,MD4_SUM_LENGTH);
+ 	if (write_batch)
+ 		write_batch_delta_file(file_sum2, MD4_SUM_LENGTH);
  	if (verbose > 2) {
  		rprintf(FINFO,"got file_sum\n");
  	}
***************
*** 334,339 ****
--- 337,345 ----
  			continue;
  		}
  
+ 		if (write_batch)
+ 			write_batch_delta_file((char *) &i, sizeof i);
+ 
  		if (i < 0 || i >= flist->count) {
  			rprintf(FERROR,"Invalid file index %d in recv_files (count=%d)\n",
  				i, flist->count);
Index: sender.c
===================================================================
RCS file: /cvsroot/rsync/sender.c,v
retrieving revision 1.40
diff -c -b -d -r1.40 sender.c
*** a/sender.c	15 May 2004 19:31:10 -0000	1.40
--- b/sender.c	17 Jun 2004 04:01:58 -0000
***************
*** 278,284 ****
  				rprintf (FINFO, "filename=%s is being skipped\n", fname);
  				continue;
  			}
! 		} else  {
  			match_sums(f_out, s, mbuf, st.st_size);
  			log_send(file, &initial_stats);
  		}
--- 278,284 ----
  				rprintf (FINFO, "filename=%s is being skipped\n", fname);
  				continue;
  			}
! 		} else  { /* not read_batch */
  			match_sums(f_out, s, mbuf, st.st_size);
  			log_send(file, &initial_stats);
  		}
Index: token.c
===================================================================
RCS file: /cvsroot/rsync/token.c,v
retrieving revision 1.30
diff -c -b -d -r1.30 token.c
*** a/token.c	6 Jan 2004 05:33:02 -0000	1.30
--- b/token.c	17 Jun 2004 04:01:58 -0000
***************
*** 63,68 ****
--- 63,69 ----
  /* non-compressing recv token */
  static int simple_recv_token(int f,char **data)
  {
+ 	extern int write_batch;
  	static int residue;
  	static char *buf;
  	int n;
***************
*** 74,79 ****
--- 75,82 ----
  
  	if (residue == 0) {
  		int i = read_int(f);
+ 		if (write_batch)
+ 		        write_batch_delta_file((char *) &i, sizeof(int));
  		if (i <= 0) return i;
  		residue = i;
  	}
***************
*** 82,87 ****
--- 85,92 ----
  	n = MIN(CHUNK_SIZE,residue);
  	residue -= n;
  	read_buf(f,buf,n);
+ 	if (write_batch)
+ 	        write_batch_delta_file(buf, n);
  	return n;
  }
  


More information about the rsync mailing list