filelist calculation algorithm

jw schultz jw at pegasys.ws
Sun Jan 5 19:56:00 EST 2003


On Sun, Jan 05, 2003 at 09:55:50AM -0800, Wayne Davison wrote:
> On Sat, Jan 04, 2003 at 05:03:02PM -0800, jw schultz wrote:
> > that would produce destloc/srcdir/....
> > when you might want a copy of srcdir at destloc instead of
> > in destloc.
> 
> Ah yes, I _was_ missing something.  However, I still don't think we need
> to clutter rsync with two types of --file-list options.  This is already
> something that people have to deal with when using the --relative option:
> how to generate a file list that contains just the path information that
> we need to be significant.  I think that the removal of the undesired
> prefixes should happen before the list gets to rsync rather than having
> rsync do it (in your example the user would just chdir into "srcdir" and
> do the "find" relative to '.').

I do agree that we could reasonably require that the file
list be relative to the specified source location.  Even in
a pipeline the user could pass it through sed.  It just
seemed to me that an automatic s/^prefix// option would cost
very little and save on complaints.  It could be added later
if needed, perhaps as --file-list-prefixed, with the
initial/default being --file-list.
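
As a rough sketch of what such automatic prefix stripping
might do (this is not rsync code; strip_prefix and the sample
names are made up for illustration, and unlike a bare sed
s/^prefix// it only strips the prefix at a directory
boundary):

```python
def strip_prefix(paths, prefix):
    """Remove a leading directory prefix from each file-list entry.

    Hypothetical helper: entries that do not start with the prefix
    are passed through unchanged, as sed 's/^prefix//' would do.
    """
    prefix = prefix.rstrip("/") + "/"  # match whole directory names only
    stripped = []
    for path in paths:
        if path.startswith(prefix):
            stripped.append(path[len(prefix):])
        else:
            stripped.append(path)
    return stripped


# Sample list as "find srcdir -type f" might produce it from
# the parent directory of srcdir.
file_list = ["srcdir/a.txt", "srcdir/sub/b.txt", "other/c.txt"]
stripped = strip_prefix(file_list, "srcdir")
```

The result is the list the user would otherwise have to
produce by doing a chdir into srcdir before running find.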

> Here's an alternative to the syntax you suggested.  I was thinking that
> it would be nice to just read filenames from stdin and have them be
> treated the same way as command-line args.  One way to indicate this
> would be to specify '-' as a name to transfer, which would tell rsync to
> read filenames from stdin.  Like this:
> 
>     rsync -av --relative - destloc <input-file
> 
> What would need to change in the protocol is that the list of filename
> args would need to get sent over the rsync connection rather than being
> sent as part of the remote-shell+rsync command-line.  Other than that
> (and the code to actually slurp the filenames from stdin) no other
> changes to rsync would be needed, so it should be pretty easy to
> implement.

That was my initial idea for doing it. 

The first problem is that this would flatten things: without
--relative every listed file lands directly in destloc, and
with --relative the result depends on the user's CWD, which
we would have to force.  That would cause considerable
confusion.  Secondly, how would you do it when the source
location is remote?  Many of the users asking for this are
doing pulls.
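
To make the flattening concrete, here is a simplified model
of where a plain file named as a source argument ends up.
It is an illustration only, not rsync's code; dest_path and
the sample names are assumptions, and directory arguments
are not modelled:

```python
import posixpath


def dest_path(name, destloc, relative):
    """Model the destination of one plain-file source argument.

    Without --relative only the last path component is kept;
    with --relative the whole path as given is recreated.
    """
    if relative:
        return posixpath.join(destloc, name)
    return posixpath.join(destloc, posixpath.basename(name))


names = ["a/b/file1", "c/d/file1"]
flat = [dest_path(n, "destloc", relative=False) for n in names]
rel = [dest_path(n, "destloc", relative=True) for n in names]
```

In this model both names collide on destloc/file1 when
--relative is not used, and with --relative the paths only
mean something relative to wherever the list was generated.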

> FYI, the old rsync release that had a type of file-list functionality
> was using a specialized include/exclude list.  I believe that rsync
> still walked the entire directory tree on both sides, and applied the
> includes using a slightly different algorithm than the default (one that
> did not require parent directories to be mentioned to get down to all
> the specified files).  I think that it would be nice to avoid the
> directory-tree traversal, so I don't think we want to go this route.
> However, this is another potential implementation method (and one that
> would result in a syntax that is like what you suggested:  one that uses
> a single source dir on the command-line and doesn't require the use of
> the --relative option).

One of the big advantages of the current behavior of the
pattern matching is that if a directory is excluded and not
included we don't traverse it in the tree walk.  Allowing
implicit directory inclusion nullifies this as soon as you
hit "/**/" in a pattern.  Even a "/*/" is a big monkey
wrench.  If I see the issue come up again I might toss out a
filter that takes care of the simple case.  That is also
a completely separate issue from this thread.
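
For anyone following along, the pruning advantage can be
sketched like this.  It is a toy model over an in-memory
tree with exact-name excludes, not rsync's pattern matcher;
walk and the sample tree are hypothetical:

```python
def walk(tree, excluded_dirs, path="", visited=None):
    """Collect files from a nested dict, skipping excluded directories.

    Directories are dicts and files are None.  An excluded
    directory is never entered, so nothing beneath it is examined.
    """
    visited = [] if visited is None else visited
    files = []
    for name in sorted(tree):
        child = tree[name]
        full = f"{path}/{name}" if path else name
        if isinstance(child, dict):
            if full in excluded_dirs:
                continue  # pruned: the subtree is never traversed
            visited.append(full)
            sub_files, _ = walk(child, excluded_dirs, full, visited)
            files.extend(sub_files)
        else:
            files.append(full)
    return files, visited


tree = {
    "keep": {"a.txt": None, "deep": {"b.txt": None}},
    "cache": {"big": {"c.txt": None}},
    "top.txt": None,
}
files, visited = walk(tree, {"cache"})
```

Here "cache" and everything under it are skipped without
being read.  If a pattern could implicitly include a file
somewhere below an excluded directory, the walk would have
to descend into "cache" after all, and the saving is lost.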



-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt


