directory replication between two servers
Olivier Tarnus
olivier.tarnus at skynet.be
Mon Jul 8 14:49:10 EST 2002
On Wednesday 03 July 2002 20:10, Eric Ziegast wrote:
> > I have two Linux servers with an rsync server running on both. Now I am
> > replicating directories on both servers with the command rsync -avz ....
> > My requirement is, if I make any changes on the first server, say server
> > A, I want to see the changes on the second server immediately....something
> > similar to mysql database replication....how can I do that..??
>
> ... a vague question. It depends on the application.
>
> In high-availability environments it's best to do the replication in the
> application so that the application can deal with or work around any
> failure conditions. In the case of a database, database replication
> methods work better than depending on the filesystem. The filesystem does
> not know the state of transactions within the database.
>
> Imagine this: Instead of having your client application write to one
> filesystem, have it write to two filesystems before saying the write
> was completed or committed. If one system fails, the other is updated
> just as well as the failed filesystem (caveat: I'm ignoring race
> conditions!).
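
The dual-write idea above can be sketched in shell. This is only an illustration under assumptions: the two directories stand in for the two filesystems, and dual_write is a hypothetical helper name, not something from rsync or from the original message. It reports success only once both copies exist, and like the description above it ignores race conditions and does not undo a copy that already landed in the first directory on an earlier call.

```shell
#!/bin/sh
# dual_write SRC DIR_A DIR_B
# Copy SRC into both directories and return success only if both
# copies succeed. Hypothetical helper, for illustration only.
dual_write() {
    src=$1 dir_a=$2 dir_b=$3
    name=$(basename "$src")
    # Write under a temporary name first, then rename, so a reader
    # never sees a half-written file in either directory.
    cp "$src" "$dir_a/.$name.tmp" || return 1
    cp "$src" "$dir_b/.$name.tmp" || { rm -f "$dir_a/.$name.tmp"; return 1; }
    mv "$dir_a/.$name.tmp" "$dir_a/$name" &&
    mv "$dir_b/.$name.tmp" "$dir_b/$name"
}
```

An application would call something like dual_write report.dat /mnt/a /mnt/b (assumed mount points) and treat a non-zero exit status as "not committed".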
>
>
> If you need read-write access on both local and remote servers and have
> partitioned data sets (i.e. don't need to depend on block-level locking),
> consider having both servers use a dedicated high-availability network
> attached storage server (HA solution). Both can access an NFS server,
> or the second server can mount the filesystem from the first server (not
> an HA solution).
>
>
> If you need read-write access on one server and need to replicate data
> to a read-only server _and_ if the replication process can be asynchronous,
> doing multiple rsyncs can work.
>
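> while true
> do
>     rsync -avz source destination
>     status=$?
>     if [ "$status" -ne 0 ]; then
>         # "Get help": report the failure so someone can look at it
>         echo "rsync failed with exit status $status" >&2
>     fi
> done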
>
> If you know where your applications are doing writes, you might limit
> your replication to the subdirectory or files on which writes are
> performed to help speed up the search process. Note, though, that
> rsync-based replication methods are not efficient on the disks or
> filesystems, just the network traffic. Imagine reading _all_ of your
> data over and over and over and over again when only a few blocks might
> change periodically.
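
Limiting replication to the directories that actually receive writes might look like the sketch below. The function name sync_subdirs, the RSYNC override variable, and the example paths are assumptions made for illustration. It relies on rsync's -R (--relative) option with a "/./" marker in the source path, which recent rsync versions use to decide how much of the path is recreated on the destination.

```shell
#!/bin/sh
# sync_subdirs SRC_ROOT DEST SUBDIR...
# Run one rsync per write-heavy subdirectory instead of scanning the
# whole tree. RSYNC may be overridden, e.g. RSYNC="rsync -n" for a dry run.
RSYNC=${RSYNC:-rsync}
sync_subdirs() {
    src_root=$1 dest=$2
    shift 2
    for sub in "$@"; do
        # With -R, the part of the path after "/./" is recreated under DEST.
        $RSYNC -avzR "$src_root/./$sub" "$dest" || return 1
    done
}
```

For example, sync_subdirs /srv/app serverB:/srv/app spool logs would copy only spool and logs. Each run still rereads everything under those subdirectories, so this narrows the disk cost described above without removing it.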
>
>
> If you need read-write access on one server and need to replicate data
> to a read-only server and need synchronous operation (i.e., the
> write must be completed on the remote server before returning to the
> local server), then you need operating-system-level or storage-level
> replication products.
>
> Veritas:
> It's not available on Linux yet, but Volume Replicator performs
> block-level incremental copies to keep two OS-level filesystems
> in sync. $$
>
> File Replicator is based (interestingly enough) on rsync, and
> runs under a virtual filesystem layer. It is only as reliable
> as a network-wide NFS mount, though. (I haven't seen it used
> much on a WAN.) $$
>
> Andrew File System (AFS)
> This advanced filesystem has methods for replication
> built in, but they have a steep learning curve for making them
> work well. I don't see support for Linux, though. $
>
> Distributed File System (DFS)
> Works a lot like AFS, built for DCE clusters, commercially
> supported (for Linux too) $$$
>
> NetApp, Procom (et al.):
> Several network-attached-storage providers have replication
> methods built into their products. The remote side is kept
> up to date, but integrity of the remote data depends on the
> application's use of snapshots. $$$
>
> EMC, Compaq, Hitachi (et al.):
> Storage companies have replication methods and best practices
> built into their block-level storage products. $$$$
>
If your problem is just launching the synchronization automatically, you
should take a look at fam and imon (http://oss.sgi.com/projects/).
These two extensions for the Linux kernel provide a way of triggering
actions on file and inode alterations.
This is a good way of starting a replication/copy if you don't have too
many concurrent writes. But I know that these extensions are not the
best security practice (you can do many things by triggering actions
on file alteration :-).
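
As a rough illustration of "start a copy when something changes", here is a polling stand-in in plain shell. It does not use fam or imon at all: it compares file modification times against a stamp file with find -newer, so it only approximates event-driven triggering. The function name run_if_changed and the paths in the usage note are hypothetical.

```shell
#!/bin/sh
# run_if_changed DIR STAMP COMMAND...
# Run COMMAND when anything under DIR is newer than the STAMP file
# (or when STAMP does not exist yet), then refresh STAMP.
# Polling stand-in for a notification from fam/imon.
run_if_changed() {
    dir=$1 stamp=$2
    shift 2
    if [ ! -e "$stamp" ] ||
       [ -n "$(find "$dir" -newer "$stamp" -print | head -n 1)" ]; then
        # Take the new stamp before running COMMAND, so writes that
        # happen during the copy are picked up on the next pass.
        touch "$stamp.new"
        "$@" && mv "$stamp.new" "$stamp"
    fi
}
```

It could be driven from a loop such as: while sleep 5; do run_if_changed /srv/data /var/tmp/sync.stamp rsync -avz /srv/data/ serverB:/srv/data/; done. Unlike a real notification mechanism, this still walks the directory tree on every pass.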
>
> Another alternative (cheaper, too) is to just use a database, period.
> People who worry about data storage, data integrity, failover, and
> replication have put a lot of thought into their database products.
> If you can modify your application to depend on a database and not
> a filesystem, you may be better off in the long run. Lazy people use
> filesystems as their database. It works just fine up to the point
> where you need to worry about real-time replication.
>
> Again, it really depends on the application.
>
> If others know of other replication methods or distributed filesystem
> work, feel free to chime in.