Incremental file-list recursion has landed in CVS;
Re: RSYNC + iNotify
buckh at pobox.com
Sun Jul 29 17:47:54 GMT 2007
On Fri, Jan 12, 2007 at 08:12:30AM +0000, Wayne Davison wrote:
> On Thu, Jan 11, 2007 at 05:22:55PM -0500, Matt McCutchen wrote:
> > Specifically, I'm curious about what areas under the source
> > argument(s) are scanned at what time.
> All the args that the user supplies are scanned at once, allowing them
> to be unduplicated as they would be in a normal transfer. The only
> difference is that no recursing happens during the initial sending of
> the first file-list. Then, one directory at a time is scanned and sent
> over until we have a decent number of upcoming files in the pipeline for
> the generator. The result is that typical transfers with a small number
> of source args can start transferring files almost immediately, and the
> depth-first scan of directories continues intermixed in with the file
> transfers (or at least intermixed with the generator's scanning for
> changed files when no transfers are needed).
> If the --relative option is used, implied directories are treated as
> "args" and sent in the first wave. The --files-from option has always
> treated the items read from the file as args, so a transfer with a huge
> number of files-from items and no real recursion doesn't get any benefit
> from the incremental recursion.
> > Also, does the incremental scan rule out "file has vanished" warnings?
> It lessens their chance of occurring because the time that elapses
> between a directory scan and the time the generator starts to work on
> those files is much shorter than waiting for the full scan to complete.
> However, there can still be vanished files as there is still some
> reading ahead of directories (rsync tries to keep a good amount of work
> in the pipeline for the generator to blaze through). I haven't decided
> exactly what I want the read-ahead limit to be, but the current code
> wants 1000 files to be available beyond the currently-active directory.
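To make the scanning order described above concrete, here is a rough Python sketch of it. This is not rsync's code. The function name, the queue layout and the way the 1000-file figure is applied are all assumptions for illustration: the args go out first, de-duplicated and unrecursed, then directories are scanned depth-first, one at a time, only while fewer than read_ahead entries are waiting.

```python
import os
from collections import deque

READ_AHEAD = 1000  # figure quoted above; the final limit was still undecided


def incremental_file_list(args, read_ahead=READ_AHEAD):
    """Yield paths in an incremental-recursion-like order (illustrative only)."""
    seen = set()
    pending_files = deque()  # entries scanned but not yet handed out
    pending_dirs = []        # stack of unscanned directories (depth-first)

    # First wave: only the user's args, de-duplicated, with no recursion.
    for arg in args:
        arg = os.path.normpath(arg)
        if arg in seen:
            continue
        seen.add(arg)
        pending_files.append(arg)
        if os.path.isdir(arg) and not os.path.islink(arg):
            pending_dirs.append(arg)

    while pending_files or pending_dirs:
        # Scan one directory at a time until enough work is queued.
        while pending_dirs and len(pending_files) < read_ahead:
            directory = pending_dirs.pop()
            try:
                with os.scandir(directory) as it:
                    entries = sorted(it, key=lambda e: e.name)
            except OSError:
                continue  # directory vanished or is unreadable
            subdirs = []
            for entry in entries:
                pending_files.append(entry.path)
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
            # Reverse so the first subdirectory is the next one popped.
            pending_dirs.extend(reversed(subdirs))
        if pending_files:
            yield pending_files.popleft()
```

A consumer standing in for the generator would simply iterate over incremental_file_list(["src"]). A smaller read_ahead shortens the gap between scanning an entry and acting on it, which is the window in which a file can vanish.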
i may be reading the code incorrectly, but it seems that, if the
--files-from option processing can be altered (or perhaps yet another
option could be created [shudder]) to opt out of the de-duplicate pass
and somebody hooked inotifywait to the standard input of
    rsync -r --incremental-dir --files-from=- ...
(and inotifywait can be convinced to fflush() after printing each event
and somebody also took appropriate precautions to de-duplicate entries
within a reasonable time frame etc. etc.) then you'd have sorta like
what, even further back,
On Wed, Feb 8, 2006 at 06:12:57PM +0000, Dag Wieers wrote:
> On Tue, 31 Jan 2006, Ryan Kather wrote:
> > I'm looking for a way to continually monitor at least one but possibly
> > multiple directories (and/or individual files). I would like RSYNC to
> > immediately synchronize the changes to said directory(ies) after they
> > occur. I believe the best approach for this would be to utilize iNotify
> > enabled kernels and create a plugin for the RSYNC daemon.
> > However, before I begin the task of actually writing some code (with my
> > poor abilities), I thought I would inquire if anyone else has already
> > created this or something similar? Am I over thinking this, or is there
> > a better approach? Is there a reason not to do this?
> I'm very interested in functionality like this. I remember it being
> brought up on this list before so I would look for similar mails in the
> archive for clues.
> How to do it efficiently (eg. for files in transit/still open), I don't
> know. Also it seems to me that you may want a separate daemon that
> implements the rsync protocol itself (instead of relying on an external
> tool) as that allows you to optimize certain things and have less
> I'm most interested in writing this in python, using a python-rsync
> implementation and python-inotify.
> Kind regards,
> -- dag wieers,
sorry if this is wacko or retraces ground already covered on the list
(haven't been paying attention for a while, since rsync does everything
i could possibly need--except, of course, just this 1 more thing, which
i disclaim any liability for proposing, if the camel's back should break,
security- or otherwise) or otherwise amounts to a waste of time, but the
possibility of continuous rsync-ing, without you having to make but a few
concessions in your code, seems like it might be worth making an idiot out
of myself (a NOP) to suggest. (on Linux, i mean, though BSD must have
similar kqueue/kevent tools available, i'd suppose)
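for what it's worth, the "de-duplicate entries within a reasonable time frame" precaution could be a small filter sitting between the event source and rsync. a minimal sketch in Python follows; the name dedupe_recent, the 2-second window and the one-path-per-line input format are assumptions of mine, not anything inotifywait or rsync defines.

```python
import time


def dedupe_recent(lines, window=2.0, clock=time.monotonic):
    """Yield each path, skipping repeats emitted less than `window` seconds ago.

    `lines` is any iterable of newline-terminated paths (hypothetical input
    format: one changed path per line).  `clock` is injectable for testing.
    """
    last_emitted = {}  # path -> time it was last passed through
    for line in lines:
        path = line.rstrip("\n")
        if not path:
            continue  # ignore blank lines
        now = clock()
        previous = last_emitted.get(path)
        if previous is not None and now - previous < window:
            continue  # same path reported again too soon; drop it
        # Note: this table grows without bound; a real tool would prune it.
        last_emitted[path] = now
        yield path
```

run over sys.stdin, with each yielded path written to stdout and flushed right away, it would forward events as they arrive. dropping a repeat can also drop a change made after the first event was already acted on, so a real tool would more likely delay and coalesce events than discard them.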
but if i just missed the news that somebody's found a good way to do it
in the mean time, i'd be happy to hear about it. (google is keeping mum,
thanks again for all your work on this, and a last question: does any
command have as many command-line options as rsync? are there as many
atoms in the universe as combinations of rsync options? would adding
a --sequential-files-from get you there?