batch mode maintainability

Jos Backus josb at cncdsl.com
Wed Jan 16 06:11:55 EST 2002


    Hi Dave,

On Tue, Jan 15, 2002 at 12:29:05PM -0600, Dave Dykstra wrote:
> I'm sorry but I really haven't looked at it closely so I can't judge the
> impact on the rsync code, and I haven't tried it yet.  I think Tridge would
> be in a much better position to judge it's value vs. the code impact.
> It's documentation in the man page is quite pathetic, I can see that.

I guess I am partially to blame for the lack of documentation. Some work has
been done here at work on using the batch feature internally, which is why I
ported the patch in the first place. I'm also sorry for not doing a better job
testing the patch. I figured I'd get around to adding some documentation when
we would start using the system here, with some feedback from my coworkers.

> Am I correct in my understanding that it is supposed to allow you to
> generate a file list once on a machine and push it to multiple identical or
> similar target machines at lower overhead than regenerating the file list
> every time?  A lot of people have asked for that.

Correct. Instead of pushing the updates across the wire, rsync stores them in
files (think record/playback).

> What happens if one of the target machines isn't exactly identical, does it
> still work?

Supposedly the update will fail for that particular file. We are planning to
use a file verification tool such as integrit in combination with rsync to
detect this situation in advance, but the tool should deal with this situation
in an acceptable fashion.

> I decided to play with it a little.  I applied the patch at
>     http://lists.samba.org/pipermail/rsync/2001-December/005550.html
> to get it work.  I agree that it shouldn't use the the short names -F
> and -f, and those are removed with the patch.

Agreed, non-mainstream options should not use up short option names, which is
why I removed them.

> I do not at all like the
> user interface of generating the four files with a timestamp that a 
> program has to guess at; it would be much better for the user interface
> to be that the --write-batch passes the same value that is used for
> --read-batch, and a prefix would be better than a suffix so a user can
> choose a different directory to write the files into.

I agree, and I had planned to work on that problem as well because the current
mechanism is not very elegant, to put it mildly. The prefix suggestion is a
very useful one indeed, it would make the tool more flexible.

> Here's what I
> tried:
>     mkdir s d1
>     echo "rev 1" >f1
>     echo "rev 1" >f2
>     rm -f rsync_*.*
>     rsync -av --write-batch s/ d1/
>     rsync -av --read-batch `echo rsync_argvs.*|sed 's/.*\.//'` s/ d1/
> 
> Nothing got copied.  Isn't that supposed to work, Jos?

Yes, but the patch is broken (again, my apologies). Can you try this with the
patch below (which I sent to Martin only for fear of embarrassing myself even
more in public :)?

> I tracked down the
> rsync+paper (after some trouble guessing at values for broken hyperlinks)
> to
>     http://dsi.internet2.edu/files/pdf/TR-1999-01.pdf
> and an example in there said you're supposed to execute the rsync_argv.*
> file.  First of all I noticed that that file is using the "--write-batch"
> option when it should use "--read-batch".  Other than that it has the same
> thing as what I ran by hand, and it doesn't work.

See the patch below; I was sleeping at the wheel when I made that change.

> I see from the paper that the batching is supposed to (when it works) do
> even more than I thought because you can take all the rsync_* files and
> send them to the target machines and then everything runs on the target
> machine only.  The idea with that is that the files would be suitable for
> multicasting to all the machines at the same time.  That seems like a
> pretty powerful way to do it, and I would think it would be faster even
> for people who use unicast because the workload would get distributed across
> across all the targets.

That's exactly the reason why we (and presumably others) are interested,
because it scales so well.

> I'm concerned about what would happen if one of the targets was slightly
> different than the one used as a prototype, though.

That's a valid concern, yes.

> It looks like at a minimum a lot more testing needs to be done for these
> options, and a lot more documentation written for the man page.

Agreed. Can you try again with this patch and let me know how well it works?

Index: batch.c
===================================================================
RCS file: /cvsroot/rsync/batch.c,v
retrieving revision 1.7
diff -u -r1.7 batch.c
--- batch.c	11 Jan 2002 08:25:32 -0000	1.7
+++ batch.c	15 Jan 2002 19:07:53 -0000
@@ -35,6 +35,7 @@
 		timeptr->tm_year + 1900, timeptr->tm_mon + 1,
 		timeptr->tm_mday, timeptr->tm_hour, timeptr->tm_min,
 		timeptr->tm_sec);
+	rprintf(FINFO,"batch file extension: %s\n", batch_file_ext);
 }
 
 void set_batch_file_ext(char *ext)
@@ -128,7 +129,7 @@
 	}
 }
 
-void write_batch_argvs_file(int orig_argc, int argc, char **argv)
+void write_batch_argvs_file(int argc, char *argv[])
 {
 	int fdb;
 	int i;
@@ -149,21 +150,22 @@
 	buff[0] = '\0';
 	/* Write argvs info to batch file */
 
-	for (i = argc - orig_argc; i < argc; i++) {
-		/* FIXME: This apparently crashes if rsync is run with
-		 * just "rsync -F".  I think directly manipulating
-		 * argv[] is probably bogus -- what if -F is part of a
-		 * run of several short options? */
-		if (!strcmp(argv[i], "-F")) {	/* safer to change it here than script */
-			strncat(buff, "-f ", 3);	/* chg to -f + ext to get ready for remote */
-			strncat(buff, batch_file_ext,
-				strlen(batch_file_ext));
+	for (i = 0; i < argc; ++i) {
+		/*
+		 * FIXME:
+		 * I think directly manipulating argv[] is probably bogus
+		 */
+		if (!strcmp(argv[i], "--write-batch")) {
+			/* Safer to change it here than script */
+			/* Change to --read-batch + ext * to get ready for remote */
+			strlcat(buff, "--read-batch ", sizeof(buff));
+			strlcat(buff, batch_file_ext, sizeof(buff));
 		} else {
-			strncat(buff, argv[i], strlen(argv[i]));
+			strlcat(buff, argv[i], sizeof(buff));
 		}
 
 		if (i < (argc - 1)) {
-			strncat(buff, " ", 1);
+			strlcat(buff, " ", sizeof(buff));
 		}
 	}
 	if (!write(fdb, buff, strlen(buff))) {
Index: main.c
===================================================================
RCS file: /cvsroot/rsync/main.c,v
retrieving revision 1.136
diff -u -r1.136 main.c
--- main.c	11 Jan 2002 07:26:39 -0000	1.136
+++ main.c	15 Jan 2002 19:07:54 -0000
@@ -803,8 +803,10 @@
 	extern int write_batch;  /*  dw */
 	extern char *batch_ext;   /*  dw */
 	int orig_argc;  /* dw */
+	char **orig_argv;
 
 	orig_argc = argc;   /* dw */
+	orig_argv = argv;
 
 	signal(SIGUSR1, sigusr1_handler);
 	signal(SIGUSR2, sigusr2_handler);
@@ -844,7 +846,7 @@
 
 	if (write_batch) { /* dw */
 	    create_batch_file_ext();
-	    write_batch_argvs_file(orig_argc, argc, argv);
+	    write_batch_argvs_file(orig_argc, orig_argv);
 	}
 
 	if (read_batch) { /* dw */
Index: options.c
===================================================================
RCS file: /cvsroot/rsync/options.c,v
retrieving revision 1.74
diff -u -r1.74 options.c
--- options.c	11 Jan 2002 08:25:33 -0000	1.74
+++ options.c	15 Jan 2002 19:07:54 -0000
@@ -240,8 +240,8 @@
   rprintf(F,"     --log-format=FORMAT     log file transfers using specified format\n");  
   rprintf(F,"     --password-file=FILE    get password from FILE\n");
   rprintf(F,"     --bwlimit=KBPS          limit I/O bandwidth, KBytes per second\n");
-  rprintf(F," -f  --read-batch=EXT        read batch file\n");
-  rprintf(F," -F  --write-batch           write batch file\n");
+  rprintf(F,"     --read-batch=EXT        read batch file\n");
+  rprintf(F,"     --write-batch           write batch file\n");
   rprintf(F," -h, --help                  show this help screen\n");
 #ifdef INET6
   rprintf(F," -4                          prefer IPv4\n");
@@ -262,7 +262,7 @@
       OPT_LOG_FORMAT, OPT_PASSWORD_FILE, OPT_SIZE_ONLY, OPT_ADDRESS,
       OPT_DELETE_AFTER, OPT_EXISTING, OPT_MAX_DELETE, OPT_BACKUP_DIR, 
       OPT_IGNORE_ERRORS, OPT_BWLIMIT, OPT_BLOCKING_IO,
-      OPT_MODIFY_WINDOW};
+      OPT_MODIFY_WINDOW, OPT_READ_BATCH, OPT_WRITE_BATCH};
 
 static struct poptOption long_options[] = {
   /* longName, shortName, argInfo, argPtr, value, descrip, argDesc */
@@ -331,8 +331,8 @@
   {"address",          0,  POPT_ARG_STRING, &bind_address, 0},
   {"backup-dir",       0,  POPT_ARG_STRING, &backup_dir},
   {"hard-links",      'H', POPT_ARG_NONE,   &preserve_hard_links},
-  {"read-batch",      'f', POPT_ARG_STRING, &batch_ext, 'f'},
-  {"write-batch",     'F', POPT_ARG_NONE,   &write_batch, 0},
+  {"read-batch",       0,  POPT_ARG_STRING, &batch_ext, OPT_READ_BATCH},
+  {"write-batch",      0,  POPT_ARG_NONE,   &write_batch},
 #ifdef INET6
   {0,		      '4', POPT_ARG_VAL,    &default_af_hint,   AF_INET },
   {0,		      '6', POPT_ARG_VAL,    &default_af_hint,   AF_INET6 },
@@ -511,9 +511,8 @@
 			keep_partial = 1;
 			break;
 
-
-		case 'f':
-			/* The filename is stored for us by popt */
+		case OPT_READ_BATCH:
+			/* The filename is stored in batch_ext for us by popt */
 			read_batch = 1;
 			break;
 
@@ -550,7 +549,8 @@
 	static char mdelete[30];
 	static char mwindow[30];
 	static char bw[50];
-	static char fext[20]; /* dw */
+	static char fext[20];
+	static char wbatch[14];
 
 	int i, x;
 
@@ -605,8 +605,6 @@
 		argstr[x++] = 'S';
 	if (do_compression)
 		argstr[x++] = 'z';
-	if (write_batch)
-	    argstr[x++] = 'F'; /* dw */
 
 	/* this is a complete hack - blame Rusty 
 
@@ -629,8 +627,13 @@
 		args[ac++] = mdelete;
 	}    
 	
+	if (write_batch) {
+		snprintf(wbatch,sizeof(wbatch),"--write-batch");
+		args[ac++] = wbatch;
+	}
+
 	if (batch_ext != NULL) {
-		sprintf(fext,"-f%s",batch_ext);
+		snprintf(fext,sizeof(fext),"--read-batch=%s",batch_ext);
 		args[ac++] = fext;
 	}
 
Index: proto.h
===================================================================
RCS file: /cvsroot/rsync/proto.h,v
retrieving revision 1.133
diff -u -r1.133 proto.h
--- proto.h	26 Nov 2001 07:18:09 -0000	1.133
+++ proto.h	15 Jan 2002 19:07:55 -0000
@@ -9,7 +9,7 @@
 void write_batch_flist_file(char *buff, int bytes_to_write);
 void write_batch_flist_info(int flist_count, struct file_struct **fptr);
 void write_char_bufs(char *buf);
-void write_batch_argvs_file(int orig_argc, int argc, char **argv);
+void write_batch_argvs_file(int argc, char *argv[]);
 struct file_list *create_flist_from_batch();
 int read_batch_flist_file(char *buff, int len);
 unsigned char read_batch_flags();
Index: receiver.c
===================================================================
RCS file: /cvsroot/rsync/receiver.c,v
retrieving revision 1.34
diff -u -r1.34 receiver.c
--- receiver.c	11 Jan 2002 08:25:33 -0000	1.34
+++ receiver.c	15 Jan 2002 19:07:55 -0000
@@ -36,6 +36,7 @@
 extern char *compare_dest;
 extern int make_backups;
 extern char *backup_suffix;
+extern int write_batch;
 
 static struct delete_list {
 	DEV64_T dev;
@@ -357,6 +358,12 @@
 			if (!am_server) {
 				log_transfer(file, fname);
 			}
+			continue;
+		}
+
+		if (0 && write_batch) {
+			/* drain */
+			receive_data(f_in,NULL,-1,NULL,file->length);
 			continue;
 		}
 
Index: rsync.1
===================================================================
RCS file: /cvsroot/rsync/rsync.1,v
retrieving revision 1.101
diff -u -r1.101 rsync.1
--- rsync.1	18 Dec 2001 06:45:28 -0000	1.101
+++ rsync.1	15 Jan 2002 19:07:56 -0000
@@ -306,8 +306,8 @@
      --log-format=FORMAT     log file transfers using specified format
      --password-file=FILE    get password from FILE
      --bwlimit=KBPS          limit I/O bandwidth, KBytes per second
- -f, --read-batch=FILE       read batch file
- -F, --write-batch           write batch file
+     --read-batch=FILE       read batch file
+     --write-batch           write batch file
  -h, --help                  show this help screen
 
 
@@ -918,7 +918,7 @@
 \fBsrc_dir\fP
 .PP 
 .RS 
-$ rsync -F [other rsync options here] \e
+$ rsync --write-batch [other rsync options here] \e
 .br 
 /somewhere/src_dir /somewhere/target_dir
 .RE 
Index: rsync.yo
===================================================================
RCS file: /cvsroot/rsync/rsync.yo,v
retrieving revision 1.86
diff -u -r1.86 rsync.yo
--- rsync.yo	18 Dec 2001 06:45:28 -0000	1.86
+++ rsync.yo	15 Jan 2002 19:07:57 -0000
@@ -277,8 +277,8 @@
      --log-format=FORMAT     log file transfers using specified format
      --password-file=FILE    get password from FILE
      --bwlimit=KBPS          limit I/O bandwidth, KBytes per second
- -f, --read-batch=FILE       read batch file
- -F, --write-batch           write batch file
+     --read-batch=FILE       read batch file
+     --write-batch           write batch file
  -h, --help                  show this help screen
 
 
@@ -798,7 +798,7 @@
 bf(src_dir)
 
 quote(
-$ rsync -F [other rsync options here] \nl()
+$ rsync --write-batch [other rsync options here] \nl()
            /somewhere/src_dir /somewhere/target_dir
 )
 
Index: sender.c
===================================================================
RCS file: /cvsroot/rsync/sender.c,v
retrieving revision 1.13
diff -u -r1.13 sender.c
--- sender.c	31 Aug 2001 09:27:35 -0000	1.13
+++ sender.c	15 Jan 2002 19:07:57 -0000
@@ -159,13 +159,14 @@
 		initial_stats = stats;
 
 		s = receive_sums(f_in);
-		if (write_batch) /* dw */
-		    write_batch_csum_info(&i,flist->count,s);
 		if (!s) {
 			io_error = 1;
 			rprintf(FERROR,"receive_sums failed\n");
 			return;
 		}
+
+		if (write_batch)
+		    write_batch_csum_info(&i,flist->count,s);
 	  
 		if (!read_batch) {
 			fd = do_open(fname, O_RDONLY, 0);
Index: util.c
===================================================================
RCS file: /cvsroot/rsync/util.c,v
retrieving revision 1.98
diff -u -r1.98 util.c
--- util.c	15 Jan 2002 10:05:28 -0000	1.98
+++ util.c	15 Jan 2002 19:07:58 -0000
@@ -173,10 +173,7 @@
 		extern int am_sender;
 		extern int am_server;
 
-		if (read_batch)
-		    am_sender = 0;
-		else
-		    am_sender = !am_sender;
+		am_sender = read_batch ? 0 : !am_sender;
 		am_server = 1;		
 
 		if (dup2(to_child_pipe[0], STDIN_FILENO) < 0 ||

Thanks,
-- 
Jos Backus                 _/  _/_/_/        Santa Clara, CA
                          _/  _/   _/
                         _/  _/_/_/             
                    _/  _/  _/    _/
josb at cncdsl.com     _/_/   _/_/_/            use Std::Disclaimer;




More information about the rsync mailing list