[Q] multicasting product ?

HP Wei hp at rentec.com
Thu Dec 20 00:51:46 EST 2001


>So my question is: does anyone know of a product which does reliable
>multicasting? (source available would be preferred)

  At our company, we have had mrsync running for a couple of months
  now.  mrsync transfers files to many machines at the same
  time using UDP and multicast.  I have attached an excerpt from the
  mrsync docs at the end of this message.

    If this is what you need, we can contribute this program to the
  open source community.
  
HP



+-----------------+
| MRSYNC vs RSYNC |
+-----------------+
mrsync is a utility that transfers a bunch of files 
from a master machine to multiple target machines simultaneously 
by using the multicasting capability in the UNIX system. 
The name 'mrsync' is inspired by the 
popular utility 'rsync' for synchronizing files between
two machines. However, beyond this similarity in the 
functionality, mrsync is fundamentally different from rsync
in two areas.
(1) rsync uses TCP while mrsync needs UDP in order
    to use the multicasting part of UNIX's socket communication.
    The former limits the data communication to one-to-one
    whereas the latter allows one-to-many.
    UDP has no built-in flow control. As a result,
    the major part of mrsync 
    (more precisely, the multicaster and multicatcher) 
    is devoted to synchronizing the data flow.
(2) For a given file,
    rsync (optionally) transfers only those parts of the file 
    that differ between
    the two versions on the master and the target machine.
    This saves time and is accomplished 
    by using a rolling checksum algorithm by Andrew Tridgell. 
    mrsync, in contrast, transfers the whole content of a file
    to all targets in one pass.
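
To make point (1) concrete, here is a minimal Python sketch of the UDP
multicast socket setup that one-to-many transfer relies on. It is not
taken from mrsync; the group address, port, and function names are
assumptions chosen for illustration only.

```python
import socket
import struct

GROUP = "239.1.2.3"  # hypothetical multicast group (administratively scoped range)
PORT = 5007          # hypothetical port


def make_membership_request(group: str) -> bytes:
    """Pack an ip_mreq: the group address followed by INADDR_ANY."""
    return socket.inet_aton(group) + socket.inet_aton("0.0.0.0")


def open_sender(ttl: int = 1) -> socket.socket:
    """UDP socket whose multicast datagrams travel at most `ttl` hops."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                 struct.pack("b", ttl))
    return s


def open_receiver(group: str, port: int) -> socket.socket:
    """UDP socket bound to `port` and joined to the multicast `group`."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", port))
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                 make_membership_request(group))
    return s
```

With this setup, a single sender.sendto(data, (GROUP, PORT)) is delivered
to every receiver that has joined the group. Nothing in it retransmits
lost datagrams or paces the sender, which is why flow control has to be
built on top.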

+-------------------+
| HISTORY OF MRSYNC |
+-------------------+
The mrsync project stemmed from the anticipated need to transfer
many files to hundreds of machines running Linux at Renaissance 
Technologies Corp. Looking into the open source community, we found
preliminary multicasting utility code written by Aaron Hillegass.  
Many unsuccessful test runs on a large volume of data files, however,
led us to embark on an overhaul of the code.  
Most of the following items were inherited from the original code,
with bug fixes.
* The low level functions that 
  interact with UNIX's multicasting sockets.
* Meta_data -- the essential info about a file which the master
  machine will first transmit to the target machines.
* Division of a file into many 'pages'.
* The idea of maintaining a missing page flag.
* The idea of a multicaster and multicatcher loop -- 
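
As an illustration of the 'pages' and missing page flag items above,
here is a minimal Python sketch. The page size, class name, and method
names are assumptions, not mrsync's actual code.

```python
PAGE_SIZE = 1024  # hypothetical page size in bytes


def page_count(file_size: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages needed to cover file_size bytes (ceiling division)."""
    return (file_size + page_size - 1) // page_size


class MissingPages:
    """One flag per page; a flag stays True until that page has arrived."""

    def __init__(self, total_pages: int) -> None:
        self.missing = [True] * total_pages

    def mark_received(self, page_no: int) -> None:
        self.missing[page_no] = False

    def still_missing(self) -> list:
        """Page numbers that a target would still need to have resent."""
        return [i for i, flag in enumerate(self.missing) if flag]

    def complete(self) -> bool:
        return not any(self.missing)
```

A target would create the flags from the file size carried in the
meta_data, clear one flag per page received, and report whatever
still_missing() returns.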
 
In this mrsync, we developed two new critical elements:
flow-control message communication conducted by the multicaster,
and a four-state page reader (processor) in the multicatcher.
The former synchronizes the task each machine is performing.
For example, the master will not start sending 
the pages of a file until all machines have acknowledged
that they have finished opening the file for disk I/O.
To accommodate these elements, the code has been 
changed significantly from the original version.
For example, the multicatcher now never asks for a slowdown,
and the multicaster sends data on a file-by-file basis.
File integrity is achieved by orchestrating the 
data flow, which is closely monitored and conducted 
by the master machine.
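
The acknowledgement rule described above (no pages are sent until every
target has reported the file open) can be sketched in Python as follows.
This is only an illustration under assumed names; the class, its methods,
and the host names are hypothetical and do not come from mrsync.

```python
class AckBarrier:
    """Tracks which targets have not yet acknowledged the current step."""

    def __init__(self, targets):
        self.pending = set(targets)

    def record_ack(self, host: str) -> None:
        # Duplicate or unexpected acks are ignored.
        self.pending.discard(host)

    def all_acked(self) -> bool:
        return not self.pending


# Hypothetical targets; the master waits on the barrier before sending pages.
barrier = AckBarrier(["node1", "node2", "node3"])
barrier.record_ack("node1")
barrier.record_ack("node3")
ready_before = barrier.all_acked()   # node2 has not answered yet
barrier.record_ack("node2")
ready_after = barrier.all_acked()    # every target has the file open
```

Keeping the set of pending hosts on the master matches the idea that the
master alone conducts the data flow; the targets only answer.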

As of today, mrsync is in full use at Renaissance 
on a daily basis.

+----------------------+
| TYPICAL RUNNING TIME |
+----------------------+
25 minutes for a group of files whose total size amounts to 4.6Gb.
(This figure was obtained from a run on 5 SUN machines 
 with Solaris 8 on an Ethernet LAN whose bandwidth is 1 Gbit/s.)  
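
For a rough sense of scale, the figure above can be turned into an
effective transfer rate. The Python sketch below assumes that "4.6Gb"
means 4.6 * 10^9 bytes and that the link is 10^9 bits/sec; both are
assumptions, and the function name is made up for illustration.

```python
def effective_throughput(total_bytes, seconds, link_bits_per_sec):
    """Return (bytes per second, fraction of the link bandwidth used)."""
    bytes_per_sec = total_bytes / seconds
    utilization = bytes_per_sec * 8 / link_bits_per_sec
    return bytes_per_sec, utilization


# Assumed inputs: 4.6e9 bytes in 25 minutes over a 1e9 bits/sec link.
rate, utilization = effective_throughput(4.6e9, 25 * 60, 1e9)
```

Under those assumptions the rate is roughly 3 MB/sec, or about 2.5% of
the nominal link bandwidth, delivered to all targets at once.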
   

  




