Thread performance (was Re: dynamic context transitions)

Mon Dec 6 00:39:58 GMT 2004

People are putting a lot of effort into this and it really doesn't make 
any sense to me since I'm not arguing the points that keep being brought 
up.  I'm just enjoying the learning experience.

On Sun, Dec 05, 2004 at 06:40:10PM -0500, John E. Malmberg wrote:
> Christopher R. Hertel wrote:
> >
> >So, in the context in which we are generally working, I see that it is 
> >provable that threads are slower than processes.  No problem.  I see your 
> >logic and it makes perfect sense to me.
> 
> It is provable in some cases either way.

Read Gödel, Escher, Bach: An Eternal Golden Braid by Douglas R.  
Hofstadter and then get back to me.  :)

> For memory and resource protection, and protection from software bugs, 
> separate processes are going to be better.

Yes and no.  Memory and resource protection doesn't protect you from
writing software bugs, it just protects you from some of the ill effects
of such bugs.  You can actually hide bugs that way.  A buffer overrun on a
system with no memory protection has quicker and more disastrous effects,
which are clear and painful.  It's a lot of fun watching your code write 
over some other process's stack, let me tell you.  :)

It tends to cause the coders to fix the problem quickly, often before the
errant code goes out the proverbial door.

Not that I'm recommending that sort of system for general use, mind you.  
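
To make the "protects you from some of the ill effects" point concrete,
here is a minimal sketch, assuming Python 3 and an OS with process-level
memory protection.  The helper name run_buggy_child is made up for
illustration, and os.abort() merely stands in for a fatal memory bug such
as a buffer overrun.  The child dies; the parent carries on and can only
see that something went wrong from the exit status.

```python
import subprocess
import sys


def run_buggy_child():
    """Run a deliberately crashing child process and return its exit status."""
    # Assumption: os.abort() is a stand-in for a fatal bug (e.g. an overrun).
    child = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        stderr=subprocess.DEVNULL,  # hide the child's crash noise
    )
    return child.returncode


status = run_buggy_child()
# The parent is still running here; the damage stayed inside the child.
parent_survived = True
print("child exit status:", status)
```

The same bug inside a thread would take the whole process down with it,
which is the protection argument for the process model.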

> Posix Threads have another disadvantage that has not yet been mentioned. 
> Limited stack space as compared to the process model.

Yep.  I'm aware of that one too.  

> This limited stack space puts a limit on the scaling of the Posix Thread 
> based solution, unless other tricks are used.

You're assuming that I would write the same code to run within a thread
that I would write to run within a process.  That's true for the scenario
Tridge is discussing (I believe that everything he's said is true within
that context...I keep repeating that I'm not arguing but folks keep
wanting to argue anyway).  :)
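
A rough sketch of the stack-space trade-off, assuming Python 3.  The
numbers are illustrative assumptions only (a 2 GiB usable address space,
an 8 MiB stack versus a 64 KiB stack), and max_threads and
run_with_small_stacks are hypothetical helper names.  It shows the
arithmetic behind the scaling limit, and that code written for a thread
(shallow, iterative) can live with a much smaller stack.

```python
import threading

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def max_threads(address_space_bytes, stack_bytes):
    """Upper bound on thread count if stacks were the only users of address space."""
    return address_space_bytes // stack_bytes


def run_with_small_stacks(worker_count, stack_bytes):
    """Start worker_count threads that each get a stack of stack_bytes."""
    # Applies to threads started after this call; returns the old setting.
    previous = threading.stack_size(stack_bytes)
    results = []
    lock = threading.Lock()

    def worker(n):
        # Shallow, iterative work: the kind of code that fits a small stack.
        total = sum(range(n + 1))
        with lock:
            results.append(total)

    try:
        threads = [threading.Thread(target=worker, args=(i,))
                   for i in range(worker_count)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return sorted(results)


print(max_threads(2 * GIB, 8 * MIB))    # assumed large per-thread stack
print(max_threads(2 * GIB, 64 * KIB))   # assumed small per-thread stack
print(run_with_small_stacks(4, 256 * KIB))
```

Shrinking the stack is one of the "other tricks", and it only works if
the code running in the thread is written with that limit in mind.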

> The process based model also will put some limits on the scaling of the 
> software for a specific machine.

There are a lot of things which clearly fit better into the process model.

So?

I'm curious, from a purely academic perspective, about the kinds of thread
models that may be employed and what kinds of code might work well within
those models.  I keep reminding myself that a process is a kind of thread.

I'm not tied to the Unix models here.  Those have changed over the years 
for good reason, as people worked out what made sense and what didn't--for 
Unix.

> Note that Microsoft does not appear to use either a Posix Thread 
> environment or a per-process environment as near as can be determined by 
> looking at their systems with the monitoring tools provided by them.

They're using different models.  That's okay (though I probably wouldn't
be interested in using those models myself).  ;)  ;)

> >The game I am playing in my mind is one of turning the constants into
> >variables to see what happens.  The definition of a "process" and a
> >"thread" has changed over time.  So have CPUs and Operating Systems.  All
> >of those things are variables.
> >
> >This exercise probably doesn't have much practical value.  I'm okay with 
> >that.  I've got plenty of practical things to do.  :)  My brain gets tired 
> >of practical things sometimes and I just like to "think" a bit to see 
> >what else occurs to me.
> 
> It may have a practical value.  Both Posix threads and processes have 
> overhead that it may be desirable to avoid.

Okay...  now I think we're on the same page.  I think it's practical to 
understand these models and the plusses and minuses of their various 
aspects.

> An asynchronous I/O model allows you to write code using a threaded 
> model, yet does not have the overhead of either the process model or the 
> Posix thread model.
> 
> Both the Posix Thread model and the asynchronous I/O model have the 
> security issue with SAMBA as SAMBA has to have the security context of 
> the sending process.

Right, but Samba is the practical piece I'm not really considering.  
Tridge did some wonderful stuff in Samba 4 making it possible to run it 
under either a (posix) threaded or process model.  Empirically, he showed 
that Samba 4 is much faster under the process model.  His analysis of 
_why_ that's the case seems quite sound to me.

> OpenVMS Alpha and IA64 for version 7.3 and later allow setting the 
> security context on a per thread basis.

Geez it's been a long time since I wrote code for VMS...  :)

> >I'm working under the assumption that there are alternate configurations
> >in which using threads would be faster than using processes.  I'm trying
> >to figure out what would need to change to make that the case.  Possibly
> >the definition of a process, possibly the definition of a thread, possibly
> >the underlying assumptions built into the OS, and possibly the kinds of
> >things you would do with threads when writing an application.  Maybe some
> >of each.  There's really no harm in considering the problem...
> 
> Look at the AST model in OpenVMS.  An AST is like a software interrupt.
> 
> With an AST behaving like a software interrupt, it solves some of the 
> "synchronization" problems, as only one of them can be active for a 
> process at once.  It also means that you do not want any AST to be in a 
> wait state.
> 
> When an asynchronous I/O is posted on OpenVMS through the native I/O 
> system, it also has the address of a completion AST, which it passes an 
> argument for so that the completion function knows the context.
> 
> By stringing these ASTs along from an origination point such as a task 
> dispatcher you can build a threaded application without the overhead or 
> the stack limitation of Posix threads.
> 
> It still does not solve all of the security issues that have been 
> mentioned, but it has improved scaling over both the Posix Thread model 
> and the per process model.

See... that's the kind of thing I was curious about.  What other ways are 
there of looking at and solving some of these problems?
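
The chaining described above can be sketched in a portable way, assuming
Python 3.  This is not the OpenVMS API: Dispatcher, post_io, on_read_done,
on_write_done and serve are all hypothetical names, and the "I/O" is
simulated with plain function calls.  What it does show is the shape of
the idea: each completion routine receives a context argument, runs to
completion without waiting, and posts the next operation, so only one
routine is ever active and no per-task stack is needed.

```python
from collections import deque


class Dispatcher:
    """Delivers queued completion routines one at a time, never concurrently."""

    def __init__(self):
        self._pending = deque()
        self.trace = []  # order in which completion routines ran

    def post_io(self, operation, completion, context):
        # Hypothetical stand-in for posting an asynchronous I/O request
        # together with a completion routine and its context argument.
        self._pending.append((operation, completion, context))

    def run(self):
        while self._pending:
            operation, completion, context = self._pending.popleft()
            result = operation()                # pretend the I/O finished
            completion(self, context, result)   # must not block or wait


def on_read_done(dispatcher, ctx, data):
    dispatcher.trace.append((ctx["client"], "read"))
    ctx["request"] = data
    # Chain the next step: build the reply, then complete the "write".
    dispatcher.post_io(lambda: data.upper(), on_write_done, ctx)


def on_write_done(dispatcher, ctx, reply):
    dispatcher.trace.append((ctx["client"], "write"))
    ctx["reply"] = reply


def serve(requests):
    """Drive every (client, payload) pair through the read -> write chain."""
    dispatcher = Dispatcher()
    contexts = []
    for client, payload in requests:
        ctx = {"client": client}
        contexts.append(ctx)
        dispatcher.post_io(lambda p=payload: p, on_read_done, ctx)
    dispatcher.run()
    return dispatcher.trace, contexts


trace, contexts = serve([("a", "hello"), ("b", "world")])
print(trace)
print([c["reply"] for c in contexts])
```

The two clients interleave in the trace even though there is only one
flow of control, and all per-client state lives in the context rather
than on a stack.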

> Such a solution generally does not result in portable code.  Finding a 
> way to implement it portably on the various UNIX platforms could be a 
> challenge.

...and this is what I wasn't looking at.  This is the practical side of 
the equation and, while it's very important (particularly to the work 
Tridge has done) I ruled it out as a consideration specifically because it 
was a practical consideration.

Thanks!

Chris -)-----

-- 
"Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X
Samba Team -- http://www.samba.org/     -)-----   Christopher R. Hertel
jCIFS Team -- http://jcifs.samba.org/   -)-----   ubiqx development, uninq.
ubiqx Team -- http://www.ubiqx.org/     -)-----   crh at ubiqx.mn.org
OnLineBook -- http://ubiqx.org/cifs/    -)-----   crh at ubiqx.org