[PATCH] make write_batch local

Chris Shoemaker c.shoemaker at cox.net
Sat Jun 19 22:36:50 GMT 2004

On Fri, Jun 18, 2004 at 10:20:57AM -0700, Wayne Davison wrote:
> On Wed, Jun 16, 2004 at 07:09:46PM -0400, Chris Shoemaker wrote:
> > I hope you have the time to review this patch and comment.
> The patch looks good on first inspection.  I don't like the change to
> the whole-file default, though -- I'd prefer rsync to not force the
> --whole-file option in batch mode (since we want batch files to be
> efficient by default).

Ah, I see your point.  I think I looked at that disable_delta_p() function
50 times, and had to re-parse it every single time.  And even though the
comments are correct, they didn't help.  I think my brain can only keep
track of like 3 branches or negations at a time, after that, they fall
off the stack.  

Off the top of my head, I can't see a clear way to simplify this code.
The complexity comes from wanting to change the default behavior based
on batch-mode (which is evident from the option args) and pure-locality
(which is only evident after src/dest have been parsed out), while still
respecting overriding --{no-}whole-file options.  Any ideas?

> Also, this comment in pipe.c looks like you meant to say "server" rather
> than "sender" in the second instance, correct?

Hmm, I thought I meant sender, but maybe you can explain why you think
server makes more sense.  My thinking was that the child forked in this
function becomes the "sender".  I know that this child will call
start_server(), but I think it will then always call do_server_sender().
In my mind, "sender" is more specific than "server", since (in general,
even if not in this case) servers can be receivers.  Perhaps
"sender-server" is clearest?

The over-arching concern should probably be consistency in the use of
terms in the comments in the entire source tree, and I'm not sure my
usage of these terms agrees with that, as I've only skimmed most
functions.  What do you think?

> >   		am_server = 1;
> >   
> > + 		/* There is write_batch code on both the receiver and
> > + 		 * sender sides.  In local_child, both are local processes,
> > + 		 * so we must make sure that only one actually writes.  It 
> > + 		 * shouldn't matter which one -- here we prevent sender from
> > + 		 * writing. */
> > + 		write_batch = 0; 
> > + 		
> >   		if (!am_sender) 
> >   			filesfrom_fd = -1;
> In your comment on your testing between 2.5.7 and your patched version, you
> said:
> > All four methods produce identical batch files.
> I assume you mean excepting the file-list file, which changed radically
> in 2.6.1 (the old code used to write out its own file-list data, but the
> new code uses the actual file-list- sending code to output this data,
> making it much easier to maintain).

Yes, I finished the flist stuff first, because it was the easiest.  I
wasn't testing as rigorously at that time.  It was the difficulties of
the deltas and csums that actually required me to look at the 2.5.7 case
for comparison and referrance.  BTW, 2.6.1 produces flist files that are
noticably smaller than 2.5.7.

> > Regarding next steps, I've come to believe that the entire batch mode
> > code is too invasive into too many parts of the rest of the code.
> Yes, I think I'm inclined to agree with you.
> > If you're willing to break backward compatibility of existing batch
> > files,
> I am.  The code is marked as experimental, and I don't see a great need
> for people to keep batch files for long periods of time, so this
> shouldn't cause people a problem.
> > I think the batch mode code can be significantly simplified and made
> > more maintainable and flexible.  Basically, I think batch mode should
> > just record whatever hits the socket.  I don't see much benefit to
> > splitting up the data into 3 files, essentially creating a batch-mode
> > specific protocol on the disk, with special reading and writing
> > functions.  Simplify, simplify.
> That would be much better.  In fact, that could even be made a separate
> helper app since it would be possible to have an app record the data and
> forward it.  Then, the --rsh and/or --rsync-path option could be used to
> have rsync talk to a batch-replay program instead of a real rsync.  The
> one complicating factor I can see is that there are some differences
> between a daemon stream and a normal remote-shell stream, but it should
> be possible do a different batch-replay command depending on what stream
> was recorded (e.g. daemon syntax using a custom program via the --rsh
> option).  I haven't fully thought through if this external program method
> would be that much better than a simplified internal batch mode, though,
> so feel free to comment on any ramifications I haven't thought of.

Let me see what I come up with.

> I've checked-in some of your comment tweaks that were not specific to
> batch changes.  I then checked-in my version of the remainder of your
> patch into the patches dir (named local-batch.diff).  Just do a cvs
> update to check it out.

Thanks.  BTW, what's an easy way for me to separate out unrelated bits
from the main patch.  I knew they were there and would have done it for
you if I knew a better way.  One obvious way is manually editting the
diff file, but I was a little wary of making a mistake.  Is that how
you'd do it?

> ..wayne..

More information about the rsync mailing list