[Bug 9744] Support Git, Mercurial, Subversion ignore lists

samba-bugs at samba.org samba-bugs at samba.org
Wed Mar 27 16:23:43 MDT 2013


https://bugzilla.samba.org/show_bug.cgi?id=9744

--- Comment #3 from Brian K. White <brian at aljex.com> 2013-03-27 22:23:42 UTC ---
(In reply to comment #2)
> A filter coprocess would suffice, I think; either accepting or rejecting
> individual files, or emitting rsync-format patterns. For performance reasons it
> would probably not work well to fork a new process for each check.

No, of course not. It would indeed be a single continuous co-process.
If the source ignore info is simple to parse (no regex), then use any language
you like to read the ignore list into an array and check each input
filename against it. If the source ignore info includes regex, then just use a
language that has regex built in (awk, perl, whatever), anything
so you are not calling sed a zillion times.
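
As a rough illustration of the co-process idea, here is a minimal sketch in
Python. It assumes the ignore list holds plain shell-style globs, one per
line, matched against each path component; that is a simplification and not
the full Git, Mercurial or Subversion semantics. The function names and the
one-filename-per-line protocol on stdin/stdout are made up for the example.

```python
import fnmatch
import sys


def load_patterns(lines):
    """Keep the non-empty, non-comment lines of an ignore file."""
    patterns = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(path, patterns):
    """True if any component of path matches any glob pattern."""
    parts = path.strip("/").split("/")
    return any(
        fnmatch.fnmatchcase(part, pat) for part in parts for pat in patterns
    )


def filter_paths(paths, patterns):
    """Yield only the paths that no pattern ignores."""
    for path in paths:
        if not is_ignored(path, patterns):
            yield path


def main(ignore_file):
    # Load the list once, then answer for every name arriving on stdin:
    # one long-running process, no fork per file.
    with open(ignore_file) as fh:
        patterns = load_patterns(fh)
    for line in sys.stdin:
        name = line.rstrip("\n")
        if not is_ignored(name, patterns):
            sys.stdout.write(name + "\n")
            sys.stdout.flush()


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Newline-separated names are used only to keep the sketch short; filenames
containing newlines would need a NUL-separated protocol instead.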

> Building the file list up front has the disadvantage that you would have to
> make a complete pass over the (source) directory tree and result could be
> enormous (I hope rsync is quick at skipping over non-matching absolute
> pathnames). On the other hand if you _did_ need to fork some subcommands to get
> information, it would be more straightforward to minimize the number of such
> forks needed, e.g. by running `svn pg -R --xml svn:ignore` once in the topmost
> dir to contain a .svn subdir.

Maybe the way to go is to read the ignore info and just translate it into find
syntax, regex and all, then use find to do the work of actually walking the
tree and generating the file list efficiently. Then pipe that right into
"|rsync --files-from=-"

Remember to use find -print0 and rsync -0.

Done. No huge temp file and no wasted double scan through something possibly
huge, and the tree walk, including skipping the stuff you already know you can
skip, is as efficient as "find" is.
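
A minimal sketch of that pipeline, driven from Python so the translation step
is visible. It assumes the ignore entries are simple name globs that map onto
find's -name test; regex entries would need something like find's -regex
instead, which is not shown. The helper names build_find_args and sync are
hypothetical, and dest is assumed to be an absolute path or a remote target.

```python
import subprocess


def build_find_args(root, patterns):
    """Build a find argv that prunes names matching the globs and
    prints every remaining regular file, NUL-terminated."""
    args = ["find", root]
    if patterns:
        args.append("(")
        for i, pat in enumerate(patterns):
            if i:
                args.append("-o")
            args += ["-name", pat]
        # Matching entries are pruned, so find never descends into them.
        args += [")", "-prune", "-o"]
    args += ["-type", "f", "-print0"]
    return args


def sync(src, dest, patterns):
    """Pipe find's NUL-separated list straight into rsync: no temp file."""
    find = subprocess.Popen(
        build_find_args(".", patterns), cwd=src, stdout=subprocess.PIPE
    )
    rsync = subprocess.Popen(
        ["rsync", "-a", "-0", "--files-from=-", ".", dest],
        cwd=src,
        stdin=find.stdout,
    )
    find.stdout.close()  # let find see a closed pipe if rsync exits early
    rsync_rc = rsync.wait()
    find_rc = find.wait()
    return find_rc, rsync_rc
```

Running find from inside the source directory keeps the emitted paths relative
to it, which is how rsync interprets the entries it reads via --files-from.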

-- 
bkw
