Any chance of rsync using threads instead of fork?

Nelson H. F. Beebe beebe at math.utah.edu
Fri Dec 9 00:43:04 GMT 2005


List traffic today asks about changing rsync to use lightweight
threads instead of heavyweight fork.

Before rushing into building a threaded version of rsync, please READ
this recent article.  Its author is also the co-author of the
well-known and widely used gc library (a garbage-collecting
replacement for C malloc and C++ new); see

	http://www.hpl.hp.com/personal/Hans_Boehm/gc/index.html

The article is:

@String{j-SIGPLAN               = "ACM SIG{\-}PLAN Notices"}

@Article{Boehm:2005:TCI,
  author =       "Hans-J. Boehm",
  title =        "Threads cannot be implemented as a library",
  journal =      j-SIGPLAN,
  volume =       "40",
  number =       "6",
  pages =        "261--268",
  month =        jun,
  year =         "2005",
  CODEN =        "SINODQ",
  DOI =          "http://doi.acm.org/10.1145/1065010.1065042",
  ISSN =         "0362-1340",
  bibdate =      "Tue Jun 21 17:04:05 MDT 2005",
  bibsource =    "http://portal.acm.org/",
  acknowledgement = ack-nhfb,
  abstract =     "In many environments, multi-threaded code is written
                 in a language that was originally designed without
                 thread support (e.g. C), to which a library of
                 threading primitives was subsequently added. There
                 appears to be a general understanding that this is not
                 the right approach. We provide specific arguments that
                 a pure library approach, in which the compiler is
                 designed independently of threading issues, cannot
                 guarantee correctness of the resulting code. We first
                 review why the approach almost works, and then examine
                 some of the surprising behavior it may entail. We
                 further illustrate that there are very simple cases in
                 which a pure library-based approach seems incapable of
                 expressing an efficient parallel algorithm. Our
                 discussion takes place in the context of C with
                 Pthreads, since it is commonly used, reasonably well
                 specified, and does not attempt to ensure type-safety,
                 which would entail even stronger constraints. The
                 issues we raise are not specific to that context.",
  remark =       "This is an important paper: it shows that current
                 languages cannot be reliable for threaded programming
                 without language changes that prevent compiler
                 optimizations from foiling synchronization methods and
                 memory barriers. The article's author and others are
                 collaborating on a proposal for changes to the C++
                 language to remedy this, but that still leaves threads
                 unreliable in C code, even with POSIX threads.",
}

The gist of the paper is that your threaded program may appear to
work, and may even work correctly most of the time.  However, because
the compiler is free to apply optimizations that know nothing about
threads, there will be nasty race conditions that from time to time
cause unpredictable, and unreproducible, behavior.  Most of us would
rather not have to deal with that mess in production software (e.g.,
at my site, rsync is a critical component in nightly updates of about
100 systems running about 20 different flavors of Unix serving
thousands of users).
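
To make the hazard concrete, here is a minimal sketch of one problem
the paper discusses: two threads updating adjacent bit-fields.  A
compiler typically implements a bit-field store as load-word, merge,
store-word, so two such stores can interfere even when each field has
its own lock.  The code below does NOT use threads; it replays one
unlucky interleaving by hand in a single thread so the outcome is
reproducible.  The bit layout (a in the low 17 bits, b in the high 15)
and all function names are assumptions made for illustration, not
taken from rsync or from any particular compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed packing of struct { int a:17; int b:15; } into one 32-bit word:
   field a occupies the low 17 bits, field b the high 15 bits. */
#define A_BITS 17u
#define A_MASK ((UINT32_C(1) << A_BITS) - 1u)

uint32_t get_a(uint32_t w) { return w & A_MASK; }
uint32_t get_b(uint32_t w) { return w >> A_BITS; }

/* What "x.a = v" or "x.b = v" typically becomes: merge the new field
   value into a previously loaded copy of the whole word. */
uint32_t merge_a(uint32_t w, uint32_t v) { return (w & ~A_MASK) | (v & A_MASK); }
uint32_t merge_b(uint32_t w, uint32_t v) { return (w & A_MASK) | (v << A_BITS); }

/* One bad interleaving, replayed sequentially:
   "thread 1" performs x.a = 42, "thread 2" performs x.b = 37. */
uint32_t replay_bad_interleaving(void)
{
    uint32_t word = 0;                   /* shared storage: a = 0, b = 0  */
    uint32_t t1_copy = word;             /* thread 1 loads the word       */
    word = merge_b(word, 37);            /* thread 2 completes x.b = 37   */
    word = merge_a(t1_copy, 42);         /* thread 1 stores its stale copy */
    return word;                         /* thread 2's update is now gone */
}

/* The same two stores with no overlap between load and store. */
uint32_t replay_serialized(void)
{
    uint32_t word = 0;
    word = merge_a(word, 42);            /* x.a = 42 runs to completion   */
    word = merge_b(word, 37);            /* then x.b = 37                 */
    return word;
}

/* Self-check of the two replays. */
void check_replays(void)
{
    assert(get_a(replay_bad_interleaving()) == 42);
    assert(get_b(replay_bad_interleaving()) == 0);   /* lost update */
    assert(get_a(replay_serialized()) == 42);
    assert(get_b(replay_serialized()) == 37);
}
```

In a real threaded program the bad ordering would occur only
occasionally, depending on scheduling, which is why such bugs tend to
survive testing and then surface in production.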

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: beebe at math.utah.edu  -
- 155 S 1400 E RM 233                       beebe at acm.org  beebe at computer.org -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------

