Announce: Tool for replicating filesystem changes

Mon Jun 3 11:12:41 MDT 2013

Hello rsync-ers,

Announcing 'should', a GPLv3 tool for recording and reacting to
filesystem events that has synergies with and differences from common
rsync use cases:

   https://github.com/gladserv/should

Written by Claudio, it is in use at Gladserv (a hosting company,
http://gladserv.com/ ) and The PODFather (a SaaS provider,
http://thepodfather.com ) and a few other places. This is a call for some
more general testing.

The current 'should' design uses an efficient and careful algorithm
around inotify on the collection side and a batching/replay system to
implement other functionality, including optionally linking to librsync.
Additional technologies may be implemented in the future, perhaps as
discussed below.

Comparison
----------

rsync is known to be highly reliable after years of testing. 'should' is not.

rsync saves network bandwidth. 'should' saves disk bandwidth as
well as network bandwidth.

rsync cannot reliably and quickly detect subtree renames. 'should' does both.
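
To illustrate why an event-based tool can see a rename as a rename:
inotify reports the two halves of a move as IN_MOVED_FROM and
IN_MOVED_TO events that share a cookie value. The sketch below pairs
them up. It is only an illustration under that assumption; the function
name and the tuple layout are made up here and are not taken from the
'should' source.

```python
# Sketch only: pairs inotify rename halves by cookie. Not taken from 'should'.
IN_MOVED_FROM = 0x00000040  # values from <sys/inotify.h>
IN_MOVED_TO = 0x00000080
IN_ISDIR = 0x40000000


def pair_renames(events):
    """Turn (mask, cookie, path) tuples into rename/create/delete actions.

    A MOVED_FROM whose cookie never gets a matching MOVED_TO means the
    object left the watched tree, so it is reported as a delete.
    """
    pending = {}  # cookie -> source path still waiting for its MOVED_TO
    actions = []
    for mask, cookie, path in events:
        if mask & IN_MOVED_FROM:
            pending[cookie] = path
        elif mask & IN_MOVED_TO:
            src = pending.pop(cookie, None)
            if src is None:
                actions.append(("create", path))  # moved in from outside
            else:
                actions.append(("rename", src, path))
    for src in pending.values():
        actions.append(("delete", src))
    return actions


events = [
    (IN_MOVED_FROM | IN_ISDIR, 7, "mail/old"),
    (IN_MOVED_TO | IN_ISDIR, 7, "mail/new"),
    (IN_MOVED_FROM, 8, "mail/tmp/x"),
]
print(pair_renames(events))
```

A whole-directory rename arrives as a single pair, so the receiving side
can apply one rename instead of transferring the subtree again.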

rsync doesn't care when a file is modified, only that there is a
difference between two files. 'should' detects filesystem events in
real-time, which may (or may not) be applied to a similar filesystem
elsewhere.

rsync on a live filesystem is lossy due to the time it takes to traverse
a tree. 'should' watches the whole tree at once and records all changes.

rsync scales to an arbitrary number of changes between filesystems.
'should' can potentially be overwhelmed by too many changes happening
too quickly in real-time, although the total size of the filesystem or
total number of changes is probably not a concern.

rsync has one main use: file copying. 'should' can be used for many
things, including monitoring, notifying, mirroring, snapshotting and more.

rsync can optionally batch changes for later replay (--write-batch),
although this is rarely used. 'should' always batches changes, although
the replay can take place immediately.
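
As a rough picture of what "batch, then replay" means, here is a small
sketch that records events as JSON lines and later applies them, in
order, to a mirror directory. The batch format, the operation names and
both functions are invented for this example; the real 'should' format
is described in its own documentation and is not reproduced here.

```python
# Sketch only: a made-up batch format, not the one 'should' uses.
import json
import os
import tempfile


def record(batch, op, *args):
    """Append one event to the in-memory batch as a JSON line."""
    batch.append(json.dumps([op, *args]))


def replay(batch, root):
    """Apply each recorded event, in order, under the directory `root`."""
    for line in batch:
        op, *args = json.loads(line)
        if op == "mkdir":
            os.mkdir(os.path.join(root, args[0]))
        elif op == "write":
            with open(os.path.join(root, args[0]), "w") as f:
                f.write(args[1])
        elif op == "rename":
            os.rename(os.path.join(root, args[0]), os.path.join(root, args[1]))
        elif op == "unlink":
            os.unlink(os.path.join(root, args[0]))
        else:
            raise ValueError("unknown op: %r" % op)


batch = []
record(batch, "mkdir", "cur")
record(batch, "write", "cur/msg1", "hello\n")
record(batch, "rename", "cur", "new")  # a subtree rename is one event
record(batch, "write", "new/msg2", "bye\n")
record(batch, "unlink", "new/msg1")

mirror = tempfile.mkdtemp()
replay(batch, mirror)
print(sorted(os.listdir(os.path.join(mirror, "new"))))
```

Because the batch is just an ordered list, replay can happen straight
away or much later, on one mirror or on several.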

rsync can deal with all Unix file-like objects such as fifos, device
files etc. 'should' can too.

rsync knows about filesystem boundaries. So does 'should'.

rsync cannot mirror live filesystems: there will always be a lag and
potentially unresolvable files due to them constantly changing. 'should'
is designed to mirror live filesystems, and may handle the
constantly-changing case fairly well (do let us know :)

rsync knows about ACLs. 'should' does not.

rsync has its own protocol, a simple security system and can run over
any transparent shell such as ssh. So does 'should'.

rsync cannot connect over SSL (although it can be tunneled [1]).
'should' uses SSL by default.

rsync is ~88k LOC. 'should' is ~26k LOC.

What 'should' Consists of
-------------------------

'should' consists of a C utility (various modes including client and
server), a library, and a feature-complete Perl module as an example of
language bindings. The program tries very hard to avoid the traditional
issues of a kernel API like inotify, including: not handling
subdirectories recursively; race conditions (what happens if you mkdir
or rmdir a watched directory, many times a second?); how do you store
and catch up events if the kernel has been too busy; how do you prevent
event overruns etc.  Optionally, differential rsync file transfer can be
used to replicate events where they affect large files. 
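
Since an inotify watch covers a single directory and not its children,
a recursive watcher needs one watch per directory in the tree. The
sketch below estimates that number for a tree and reads the per-user
limit from /proc. It does not call inotify itself, and the helper names
are made up for this example.

```python
# Sketch only: estimates watch usage; it does not call inotify itself.
import os
import tempfile

LIMIT_FILE = "/proc/sys/fs/inotify/max_user_watches"


def watches_needed(root):
    """One watch per directory: the root plus every subdirectory below it."""
    return sum(1 for _dirpath, _dirs, _files in os.walk(root))


def watch_limit(path=LIMIT_FILE):
    """Return the per-user watch limit, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


# Build a toy Maildir: each mailbox has cur/, new/ and tmp/.
root = tempfile.mkdtemp()
for sub in ("cur", "new", "tmp"):
    os.makedirs(os.path.join(root, "user1", "Maildir", sub))

needed = watches_needed(root)
limit = watch_limit()
print("watches needed:", needed, "limit:", limit)
```

Running something like this over a real mail spool gives a quick idea
of how close a deployment is to the limit before any watcher is started.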

Documentation is complete and Unix-like, although currently more suited
to readers of this list than first-time users. In some cases (eg "what
exactly is the dirsync option?") you also need to read the protocol
summary.

There is one corner case where 'should' is expected to work really well,
which is replication of email, especially Maildirs.  'should' also works
with multiple replicants and cascading replication architectures.  Some
experiments have been done to implement multimaster in the general case
- to the extent that multimaster is useful, and there are theoretical
questions that this raises - but none of these are the git tree. In the
Maildir-and-similar case, multimaster probably works reliably (ie
multiple inbound MTA and multiple imapd servers) due to the way Maildir
works in theory, but nobody is promising 'should' will definitely work
in this configuration. 'should' has had basic testing in a ring
architecture; it might work, and at least it is not expected to
deadlock, resonate or explode.

Problems
--------

* inotify is Linux-specific, although porting 'should' to the existing
  inotify-like APIs on most mainstream operating systems is very
  possible and expected.

* The maximum useful value for /proc/sys/fs/inotify/max_user_watches is
  32768, which is quickly exceeded on any moderate Maildir deployment.

* inotify has no concept of byteranges to indicate where a modification
  happened.
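
The byterange point is visible in the event record itself. As documented
in inotify(7), each record read from an inotify file descriptor carries
a watch descriptor, a mask, a cookie and an optional name, and nothing
about where in the file a write landed. The sketch below decodes a
hand-built record with that layout; the function name and the returned
dict shape are made up for this example.

```python
# Sketch only: decodes the record layout documented in inotify(7).
import struct

HEADER = struct.Struct("iIII")  # wd, mask, cookie, len (native byte order)
IN_MODIFY = 0x00000002


def parse_events(buf):
    """Split a buffer, as read from an inotify fd, into event dicts."""
    events = []
    offset = 0
    while offset + HEADER.size <= len(buf):
        wd, mask, cookie, name_len = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        raw_name = buf[offset:offset + name_len].rstrip(b"\0")
        offset += name_len
        events.append({
            "wd": wd,
            "mask": mask,
            "cookie": cookie,
            "name": raw_name.decode(errors="surrogateescape"),
        })
    return events


# A hand-built record: watch 1 saw IN_MODIFY on "msg1" (name padded with NULs).
raw = HEADER.pack(1, IN_MODIFY, 0, 16) + b"msg1".ljust(16, b"\0")
print(parse_events(raw))
```

An IN_MODIFY event therefore only says that the file changed, which is
why a differential transfer still has to work out what changed.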

Architecture Discussion
-----------------------

* inotify is about extending specific Linux filesystems. But replication
  ought to be as filesystem-independent as possible as well as
  independent of operating systems and versions, so inotify is not ideal.
  However, inotify is a wonderful out-of-the-box simple solution to a lot
  of common problems. If you are operating at scale you need to think
  carefully about the architecture no matter what you use.

* dtrace, by contrast, knows both about changes and byteranges, is
  independent of filesystems and is already supported on many operating
  systems (FreeBSD, NetBSD, Solaris-and-derivatives, Linux, more?).
  There are other important differences, for example dtrace doesn't lock a
  filesystem but inotify does.

* therefore, one design improvement would be to use dtrace instead of
  inotify and dtrace information instead of the optional rsync
  algorithm. Patches welcome.

* another design improvement would be to extract the data to be shipped 
  by using 'zfs snapshot ...' commands, or 'btrfs send ...' to the
  extent that btrfs works today. But that brings us back to filesystem
  and operating system dependencies again. This could be an interesting
  extension, and could work with either inotify or dtrace triggers.

So do please try out 'should', see if it works for you, or if not then
consider submitting a bug report and/or patches as per the procedure in
the man page.

This should work.

Regards,

--
Dan Shearer
dan at shearer.org

[1] http://www.bytemark.co.uk/support/technical_documents/backuprsyncssl