Non-determinism

Martin Pool mbp at samba.org
Wed Apr 17 18:24:01 EST 2002


On 17 Apr 2002, Berend Tober <btober at computer.org> wrote:


> So while the software algorithms of ftp and cp are deterministic,
> there must be some quantifiable probability of failure
> nonetheless. The difference with rsync is that not only are the
> same effects of data corruption at work as with ftp and cp, but the
> algorithm itself introduces non-determinism.

You are using the word "non-deterministic" in a way at odds with its
usual meaning in computer science.

> 1 definition found

> From The Free On-line Dictionary of Computing (13 Mar 01) [foldoc]:

>   deterministic
  
>      1. <probability> Describes a system whose time evolution can
>      be predicted exactly.
  
>      Contrast {probabilistic}.
  
The execution and output of the rsync algorithm can be exactly
predicted from its input.  It is a deterministic algorithm.
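
As a rough illustration of what "deterministic" means here, the sketch
below implements a weak rolling checksum in the style described in the
rsync technical report. It is a simplified sketch, not rsync's source:
the function names, the 16-bit modulus, and the packing of the two sums
are assumptions made for the example. The point is only that the result
depends on nothing but the input bytes.

```python
# Simplified rsync-style weak rolling checksum (illustrative only).
# weak_checksum, roll and the modulus M are names chosen for this sketch.
M = 1 << 16


def weak_checksum(block: bytes) -> int:
    """Compute the checksum of a block from scratch."""
    n = len(block)
    a = sum(block) % M
    # The first byte gets weight n, the last byte gets weight 1.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a | (b << 16)


def roll(checksum: int, old: int, new: int, n: int) -> int:
    """Slide an n-byte window one byte: drop `old`, append `new`."""
    a = checksum & 0xFFFF
    b = checksum >> 16
    a = (a - old + new) % M
    b = (b - n * old + a) % M
    return a | (b << 16)


data = b"the quick brown fox jumps over the lazy dog"
window = 8

# Same input, same output, every time.
same = weak_checksum(data[:window]) == weak_checksum(data[:window])

# Rolling the window forward gives exactly what a fresh computation gives.
rolled = roll(weak_checksum(data[0:window]), data[0], data[window], window)
fresh = weak_checksum(data[1:window + 1])
print(same, rolled == fresh)
```

Nothing in the computation consults a clock, a random source, or any
other hidden state, so two runs over the same bytes cannot disagree.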

As the documentation points out, and somebody on this list mentioned,
rsync follows the transmission of each file with an MD4 checksum.
This protects against errors in the rsync program itself, and also
gives some protection against network or hardware errors, though
neither of these can be absolute in any program.

Checking the whole file at the end of transmission gives much better
protection than rcp, ftp, or http, which assume the transport is
error-free.
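
The idea of a whole-file check can be sketched as follows. This is a
minimal sketch under stated assumptions, not rsync's implementation:
file_digest and verify_transfer are hypothetical names, and a 128-bit
BLAKE2b digest stands in for MD4, because many current Python builds no
longer provide MD4 through hashlib.

```python
import hashlib


def file_digest(data: bytes) -> bytes:
    """128-bit digest of the whole file (BLAKE2b as a stand-in for MD4)."""
    return hashlib.blake2b(data, digest_size=16).digest()


def verify_transfer(received: bytes, sender_digest: bytes) -> bool:
    """Receiver side: recompute the digest and compare with the sender's."""
    return file_digest(received) == sender_digest


original = b"example file contents\n" * 1000
sender_digest = file_digest(original)

# Simulate a single flipped bit somewhere in transit.
corrupted = bytearray(original)
corrupted[1234] ^= 0x01

intact_ok = verify_transfer(original, sender_digest)
corrupt_ok = verify_transfer(bytes(corrupted), sender_digest)
print(intact_ok, corrupt_ok)
```

A receiver that finds a mismatch knows the file is bad and can ask for
it again; a tool that does no end-of-file check has no way to notice.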

> I still think rsync is an incredible tool, however, despite my
> expression of this reservation.

What part of "astronomical" don't you understand?  

I don't normally like people quoting large numbers, but in the
particular case of MD4 I think it is justified.  

To put it in simple language, the probability of a file transmission
error going undetected by the MD4 message digest is believed to be
approximately one in one thousand million million million million
million million.
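
The arithmetic behind a figure of this size can be checked directly.
The sketch assumes that a corrupted file produces an effectively random
128-bit digest, so the chance of it matching the original digest is one
in 2^128; that uniformity is an assumption of the example, not a proven
property of MD4.

```python
DIGEST_BITS = 128  # size of an MD4 digest in bits

# Number of distinct digest values, and the chance that a random one
# happens to equal the digest of the original file.
outcomes = 2 ** DIGEST_BITS
p_undetected = 1 / outcomes

print(outcomes)
print(f"{p_undetected:.2e}")
```

Under that assumption the odds come out at about one in 3.4 x 10^38,
which is within a small factor of the figure quoted above.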

-- 
Martin 



