batch-mode fixes [was: [PATCH] fix read-batch SEGFAULT]

Sat May 15 15:53:05 GMT 2004

On Sun, May 09, 2004 at 08:50:07AM -0700, Wayne Davison wrote:
> On Fri, May 07, 2004 at 06:54:32PM -0400, Chris Shoemaker wrote:
> > /* if (!write_batch) */
> >    send_exclude_list(f_out);
> > 
> > at main.c:641.
> > 
> > This seems to work better, because things get further.
> 
> I looked at the receiving side, and I think that code should probably
> be "if (!read_batch)" instead of "if (!write_batch)".

  That makes some sense.  I'll go with that.

> 
> > Things still don't complete, though.  Now, I'm getting permission
> > denied errors when opening the file pfx.rsync_flist.  I'm not sure why. 
> 
> The pfx* files are created with mode 600, so try "chmod 644 pfx*".
> 
> ..wayne..

  I wasn't very clear about that.  What I meant was that rsync was failing 
during the --write-batch run because of a permission denied error when opening 
pfx.rsync_flist.  I've looked into this a bit more now, but I'm still confused 
about one major thing in particular -- Who should be writing out the batch 
files?
  My intuition is that if I run an rsync client with --write-batch and a remote 
rsyncd source and a local destination, then my client should handle all the 
writing of batch files.  Indeed, I'd be a little surprised to learn that the 
server even cares whether I use --write-batch or not, since its behavior should 
be the same either way.
  However, this is not the case at all.  Not only does the client pass 
the "--write-batch=pfx" argument to the server, but it's actually the server 
calling all the batch write routines.  See for example in send_file_list():

		/* Now send the uid/gid list. This was introduced in
		 * protocol version 15 */
		send_uid_list(f);

		/* send the io_error flag */
		write_int(f, lp_ignore_errors(module_id) ? 0 : io_error);

		io_end_buffering();
		stats.flist_size = stats.total_written - start_write;
		stats.num_files = flist->count;
		if (write_batch) /* CAS: not sure 'bout this. */
			write_batch_flist_info(flist->count, flist->files);

and similar calls in match_sums(), simple_send_token(), and send_files().  All 
of these calls are made from the _server_ (am_server==1).  Since the server 
is setuid(nobody), of course each of these calls fails with a permission denied 
error.  Incidentally, the pfx.rsync_argvs file is created by the client.  See in 
main():

if (write_batch && !am_server) {
	write_batch_argvs_file(orig_argc, orig_argv);
}

  Well, I don't want the remote server to be writing out my batch files, when
I want them locally.  I replaced "if (write_batch)" with "if (write_batch && 
!am_server)" in six places.  (Maybe something like a one-time "if (write_batch 
&& am_server) write_batch=0;" is better.)
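
  To make the one-time variant concrete, here is a rough, untested-against-rsync 
sketch.  The two globals are stand-ins for rsync's variables of the same 
names, and disable_server_batch_write() is a made-up helper name, not 
something in the tree.  The idea is to call it once after option parsing so 
none of the six call sites needs its own am_server test:

```c
/* Stand-ins for rsync's globals of the same names. */
static int write_batch = 0;
static int am_server = 0;

/* Hypothetical helper: call once, right after option parsing.
 * On the server side it clears write_batch, so every later
 * "if (write_batch)" test is false there.  On the client side
 * it leaves the flag untouched. */
static void disable_server_batch_write(void)
{
	if (write_batch && am_server)
		write_batch = 0;
}
```

The trade-off is that this hides the client/server distinction in one place 
instead of spelling it out at each call site.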

  With that change, for the first time, I could run:
    rsync --write-batch=pfx rsync://localhost/test ./A
without hanging!

  BUT, it doesn't write out 3 of the 4 batch files either. :(  That's how I 
learned that the client doesn't handle the batch-file writing.

  From what I understand, the path ahead is to move all the write_batch tests 
and calls to the client side, and off the server-side.  So, does anyone out 
there think I'm on the right (or wrong) track?

  Oh, and AFAICS, the unfortunate way to handle legacy broken servers is for 
the client to suppress passing the --write-batch arg, otherwise they'll still 
hang, even with new clients.
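
  Purely as an illustration of the suppression idea (rsync builds its server 
argument list differently, so this is not a patch), something like the 
following would drop the option from a list of arguments before it goes to 
the server.  strip_write_batch_arg() is a hypothetical name, and the sketch 
only handles the "--write-batch=pfx" form used above:

```c
#include <string.h>

/* Hypothetical helper: copy argv[] into out[], skipping any argument
 * that starts with "--write-batch=".  out[] must have room for argc
 * entries.  Returns the number of arguments kept. */
static int strip_write_batch_arg(int argc, char *const argv[],
				 const char *out[])
{
	static const char opt[] = "--write-batch=";
	const size_t len = sizeof opt - 1;
	int i, n = 0;

	for (i = 0; i < argc; i++) {
		if (strncmp(argv[i], opt, len) == 0)
			continue;	/* keep this one local to the client */
		out[n++] = argv[i];
	}
	return n;
}
```

With that in place the server never sees the option, so an old server has no 
reason to try opening the batch files at all.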

  -Chris