directory replication between two servers

jw schultz jw at pegasys.ws
Wed Jul 3 17:41:02 EST 2002


On Wed, Jul 03, 2002 at 11:10:13AM -0700, Eric Ziegast wrote:
> > I have two Linux servers with the rsync server running on both. Now I am
> > replicating directories on both servers with the command rsync -avz ....
> > My requirement is: if I make any changes on the first server, say server
> > A,   I want to see the changes on the second server immediately....some
> > thing similar to MySQL database replication....how can I do that..??

You have said what you want to see.  What is missing is why.
Why do you want this on two servers?  Is it for redundancy,
network partitioning, or scaling?  Is this on a LAN, a WAN or
the Internet?  The best approach depends on the reason and
context.
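
For what it's worth: if a delay of a minute or so is
acceptable, the usual low-tech answer is simply to re-run
rsync on a schedule, since rsync itself has no
change-notification mechanism.  A minimal sketch, assuming
the tree lives under /data and serverB is reachable over
ssh (both names made up):

	#!/bin/sh
	# Push /data to serverB once a minute.  Worst-case
	# replication lag is the sleep interval plus the
	# time of one transfer.
	SRC=/data/
	DEST=serverB:/data/
	while true; do
		rsync -az --delete "$SRC" "$DEST"
		sleep 60
	done

A cron entry running the same rsync command once a minute
amounts to the same thing.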


> 
> ... a vague question.  It depends on the application.
> 
> In high-availability environments it's best to do the replication in the
> application so that the application can deal with or work around any
> failure conditions.  In the case of a database, database replication
> methods work better than depending on the filesystem.  The filesystem does
> not know the state of transactions within the database.

Some (Veritas) have database hooks, but even then it is just a
failover facility that requires log replay.  Also, doing
replication at the application level often allows the slave
node to be used for some real work (report generation,
data mining, etc.).
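
Since the original question mentioned MySQL: its built-in
replication is exactly this kind of application-level
facility, and the slave stays usable for read-only queries.
A rough sketch (the server IDs, repl account and log
coordinates below are placeholders, and the exact statements
vary by MySQL version):

	# On the master, enable the binary log -- my.cnf:
	#	[mysqld]
	#	log-bin
	#	server-id = 1
	# On the slave -- my.cnf:
	#	[mysqld]
	#	server-id = 2
	# Then point the slave at the master and start it:
	mysql -e "CHANGE MASTER TO
	    MASTER_HOST='serverA',
	    MASTER_USER='repl',
	    MASTER_PASSWORD='secret',
	    MASTER_LOG_FILE='serverA-bin.001',
	    MASTER_LOG_POS=4"
	mysql -e "START SLAVE"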

> If you need read-write access on one server and need to replicate data
> to a read-only server and need synchronous operation (i.e.: the
> write must be completed on the remote server before returning to the
> local server), then you need operating-system-level or storage-level
> replication products.
> 
>     Veritas:
> 	It's not available on Linux yet, but Volume Replicator performs
> 	block-level incremental copies to keep two OS-level filesystems
> 	in sync.  $$
> 
> 	File Replicator is based (interestingly enough) on rsync, and
> 	runs under a virtual filesystem layer.  It is only as reliable
> 	as a network-wide NFS mount, though.  (I haven't seen it used
> 	much on a WAN.)  $$
> 
>     Andrew File System (AFS)
> 	This advanced filesystem has methods for replication
> 	built in, but they have a high learning curve for making
> 	them work well.  I don't see support for Linux, though. $
> 
>     Distributed File System (DFS)
> 	Works a lot like AFS, built for DCE clusters, commercially
> 	supported (for Linux too).  $$$
> 
>     NetApp, Procom (et al.):
> 	Several network-attached-storage providers have replication
> 	methods built into their products.  The remote side is kept
> 	up to date, but integrity of the remote data depends on the
> 	application's use of snapshots.  $$$
> 
>     EMC, Compaq, Hitachi (et al.):
> 	Storage companies have replication methods and best practices
> 	built into their block-level storage products.   $$$$
> 
> 
> If others know of other replication methods or distributed filesystem
> work, feel free to chime in.

	NFS
		Filesystem-level sharing over the network.
		Don't pooh-pooh NFS because it is old.  I
		don't recommend it on an unsecured network,
		but it is surprisingly fast.  Given a fast
		network, NetLedger found Oracle ran faster on
		NFS-mounted volumes than on small local disks.
		The Linux NFS server does need some
		performance improvement.  Not suitable for a
		WAN.  (A minimal export/mount sketch follows
		the list below.)

	Coda
		A distributed filesystem based on research from AFS.
		Single tree structure that lives as an alien
		in the Unix tree.  Primary focus is
		disconnected operation.  Lacks locking, so
		update conflicts can occur even when all
		nodes are online.  Available on Linux, is FREE.

	Intermezzo
		A distributed filesystem based on research
		from Coda.  Seems less alien than Coda,
		with better support for multiple
		mountpoints.  Provides locking mechanisms for
		connected operation but still allows
		resynchronization on reconnect.  Developed on
		Linux, is FREE.

	Lustre
		A cluster filesystem that can be used with
		multiport disks, SAN devices and xNDB.
		The filesystem is online and writable for all
		nodes.  The storage device is responsible for
		HA.  Still in development.
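
As promised under the NFS entry above, here is a minimal
export/mount sketch.  The hostnames and paths are made up,
and remember that NFS shares the one copy rather than
replicating it:

	# On the server (serverA) -- /etc/exports:
	#	/data	serverB(rw,sync)
	# Activate the new export table:
	exportfs -ra

	# On the client (serverB):
	mount -t nfs serverA:/data /data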

If you look at cluster websites you will probably find a few
more solutions.

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt



