How to move storage OEMs to Samba 4.0 ?

Wed Jun 6 18:12:48 MDT 2012

Hi Jeremy,

I apologize up front for the length of this mail.
But I have been sitting all night over it, elaborating some
thoughts...

Jeremy Allison wrote:
> On Wed, Jun 06, 2012 at 11:49:05AM +0200, Volker Lendecke wrote:
> > 
> > It is the sum of changes that went in. gensec,
> > security=share gone, security=ads requires a more advanced
> > environment, merged loadparm, possibly ntdb as a new
> > default. This is enough to scare people away from s4 for the
> > OEM case for at least 12 months if not much more from now.
> > I would say that we can't leave those in the cold for so
> > long. We have a backlog of features that we don't accept
> > right now, and this will only grow over the next year or
> > two.
> 
> This is the truth of it. Seriously. Volker and I have
> a lot of experience working with OEMs and you really
> need to trust us on this.

Well. I don't see what is new here. What is so different
in this release step compared to the previous release
upgrades?

Let me try an analysis:

The situation for the OEMs that you have described seems
to be pretty much the same for this release as for previous ones:

* OEMs love our stable releases for their stability.
* OEMs yearn for using the new release for the features.
* But OEMs fear the new release for its potentially missing stability.
* Therefore OEMs stick to the stable release for >= 1 year
  after the new release has been made and backport some of
  the features to the stable release they use.

It has been like this before, for many releases.
And it will be like this for 4.0.

But now it is suddenly a more severe problem?

Is it really that OEMs themselves have an especially increased
fear of the 4.0 release? Maybe because of the major version bump?
Or because of some technical insight into what has gone on
in source3 since 3.6?

Or is it rather that you yourself are more afraid of the next
release 4.0 than you have been of previous releases?
The things that Volker has listed above as changes
are all changes that none of you (or me) have authored.
That would be not so bad an explanation of increased fear. :-)

But this would not be a fear that I'd expect the OEMs to share.

For users (like the OEMs) that have a main interest in the smbd
file server, this release will appear different most visibly
in that there is a new mode of building the server. Then there
will be (if we manage to get it in in time) a whole new set of
SMB2 features (durable handles (of 2.0), dynamic reauthentication
(of 2.1) and the basic 3.0 support) - not in yet. Finally there
are many smaller or bigger changes, performance
improvements, vfs modules and whatnot. (If I missed something
here, I apologize; please fill it in.) A couple of legacy
features are about to be dropped. -- So I am convinced that from
this perspective, the release must look very similar to earlier
releases.

Detour: view from inside samba

But from inside samba, for those who have worked on the samba3
file server with complete dedication for many years, and
virtually had the code under their control, those not so publicly
visible changes that are a byproduct of the merge-samba3-and-samba4
efforts, may be able to create more fear than any ever so big
changes they have done themselves. This is quite understandable.
And apart from all the useful things that the code-merge (imho)
creates, there is a real danger lurking:
This danger is called nested event loops.

I hope to quote Volker correctly in trying to summarize it
briefly: if an event handler calls an event loop on the same
event context, the world has changed for that handler by the
time the inner event loop returns. This can lead to arbitrary
behaviour, from infinite nesting (i.e. stack exhaustion) to
unexpected socket state and so on. This pattern is
therefore banned in samba3 code. In samba4 code, the pattern is
used in several central places. (Maybe the chances to hit this
problem are higher in the central event loop of a file server
that is under real pressure than in the AD server, at least the
samba3-devs have been more strict about this point.)
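
A toy model of the problem, purely for illustration. This is not the
tevent API; ToyEventContext, loop_once, conn and both handlers are
hypothetical names made up for this sketch. It only shows how a handler
that re-enters the loop on the same context finds its state changed
when the inner loop returns:

```python
from collections import deque


class ToyEventContext:
    """Hypothetical stand-in for an event context; not the tevent API."""

    def __init__(self):
        self.pending = deque()

    def add(self, handler):
        self.pending.append(handler)

    def loop_once(self):
        """Run exactly one pending handler; return False if none was queued."""
        if not self.pending:
            return False
        handler = self.pending.popleft()
        handler(self)
        return True


# Shared state that both handlers touch, e.g. a connection record.
conn = {"open": True}
log = []


def close_handler(ev):
    conn["open"] = False
    log.append("close_handler: connection closed")


def request_handler(ev):
    assert conn["open"]  # precondition holds on entry
    ev.loop_once()       # nested loop on the SAME context
    # close_handler ran inside the nested call, so the precondition
    # checked above no longer holds.
    log.append(f"request_handler: conn open after nested loop = {conn['open']}")


ev = ToyEventContext()
ev.add(request_handler)
ev.add(close_handler)
while ev.loop_once():
    pass

for line in log:
    print(line)
```

In the sketch, request_handler checks its precondition, re-enters the
loop, and comes back to a closed connection, although nothing in its
own code closed it.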

The use of nested event loops in samba4 is a design that some agree
should be changed. But it will not happen very soon because it
will be non-trivial to do and require a lot of effort. So when the
merge project creates subsystems in the base directory these
subsystems often contain the potential to create nested event
loops, when they originate from source4/.  By using new merged
subsystems in source3/, the possibility is created to get nested
event loops in through the back door.  S3 code is currently
protected against it in that it panics when a nested event loop
is called. And there are a few more or less implicit mechanisms
to control the behaviour of the base subsystems in question.
But this is fragile. If someone misuses it, it will still panic,
but it would be best to instead have an obvious and foolproof
protection mechanism that would forbid s3 from using the base
libraries in a way that allows nested event loops.
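
The "panic on nesting" behaviour can be sketched in a few lines. This
is again a toy model under assumed names (GuardedEventContext,
NestedEventLoopError and bad_handler are hypothetical), not how
source3/ actually implements it. An exception stands in for the panic:

```python
class NestedEventLoopError(RuntimeError):
    """Stands in for the panic described in the text."""


class GuardedEventContext:
    """Hypothetical toy event context that refuses nested loops."""

    def __init__(self):
        self.pending = []
        self._depth = 0  # > 0 while a handler is running

    def add(self, handler):
        self.pending.append(handler)

    def loop_once(self):
        # Refuse to re-enter the loop from inside a running handler.
        if self._depth > 0:
            raise NestedEventLoopError("nested event loop on same context")
        if not self.pending:
            return False
        handler = self.pending.pop(0)
        self._depth += 1
        try:
            handler(self)
        finally:
            self._depth -= 1
        return True


def bad_handler(ev):
    # Misuse: a handler (or a library it calls) re-enters the loop.
    ev.loop_once()


ev = GuardedEventContext()
ev.add(bad_handler)
try:
    ev.loop_once()
    caught = False
except NestedEventLoopError:
    caught = True

print("nested loop rejected:", caught)
```

A guard like this catches the misuse only at run time, when the bad
path is actually hit. That is the fragility described above: it does
not stop anyone from writing the nested call in the first place.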

I think that this is the biggest internal concern about the
state of the smbd file server in the 4.0 release.
Summing up, I think the state of the file server is in fact good!
In order to keep it safe we might think about better
protection against the misuse of base libraries in source3/ .

Back to OEMs:

This was a detour about the internal perception of the file
server in the 4.0 release. Coming back to the perception of
our OEMs and what we should tell them and how we can help them.

As (my thesis is that) the state of the file server in 4.0 is
generally quite good, I think we should encourage the users
(OEMs/distributors) to use the file server from the new release
for new features in the same way that they have done in previous
releases. Fileserver-wise this will be a normal release. Not a
small one, but also not a complete overhaul.

The OEMs have created and maintained backport patchsets for their
product in the past. And they will do so for 3.6 when 4.0 is out.
The question raised is how to best support them in doing this.

I doubt that maintaining a release that contains backported
features from various OEM patchsets will be the right way to do
this. Kai has nailed some downsides of this approach in his
several mails. Let me repeat some and add some thoughts:

* Each OEM maintains his own set of patches that the OEM itself
  tests carefully.

  If we blend subsets of the various OEMs' patchsets into a
  release, it is by no means certain that the result will
  be useful or stable for all of them, because each OEM has
  only QA'd his own patchset, not the blend.

  More importantly, it is uncertain that this release will
  be what most OEMs want! (feature-wise)

  Put more pointedly: We also don't have a round table of OEMs
  that discusses the common denominator of all OEMs' patchsets.

  I concede that it can be expected, though, that many of the OEMs
  would switch to such a feature-backported version if the patches
  had been chosen carefully.

* Some or most(?) of the OEMs also have patches that they would not
  want to go upstream. So we won't be able to take the need
  to maintain a patchset on top of the release from the OEMs
  completely. The patchset just might get a little smaller.

* Doing more of such backports for a release will create
  additional load. If you really want to do this as a service
  for the OEMs we should probably do it for 3.5.X as well,
  since this is what to my knowledge most of them are still
  using. So even more load. And we are already short of
  development resources.

* I can also imagine that adding features to the 3.6.X releases
  (and not to earlier releases, say 3.5.Y) right before 4.0
  comes out, will spread FUD about the 4.0 release and
  will deter users from switching rather than encouraging them.

* The rule to not add features to the stable release branch
  is there to protect the stability of the release and to
  protect our resources, so that we can concentrate more
  on the newer release.

  Of course, backporting features does not necessarily
  mean that the branch gets utterly destabilized, but frequently
  a new feature will introduce new bugs that need subsequent
  bugfixes to be backported as well, and so on. I.e., by
  introducing new bigger hunks of new or massively changed
  code, we usually also introduce new bugs. This is some
  form of destabilization.

* Also the notion of what is a reasonable and useful feature
  to backport is completely arbitrary.  In saying you want to
  "relax the rules on functionality additions to 3.6.x so that we
  can add *some* limited new functionality", what do you imagine
  more concretely? This sounds so vague that it will certainly
  lead to dissensus and create room for sneaking stuff in.  Where
  to draw the line? This is difficult.  The current rule is
  simple (but cruel). :-)

To sum up:

* I don't think the OEMs will perceive the 4.0 release as
  especially threatening -- file-server wise.
  So why is it more important now to backport features into
  the stable release than it has been before?

* Creating a release with backports will not remove the need
  to maintain individual patchsets from our OEMs completely.

* It may not even be what they want.
  What *do* the OEMs themselves want from us?

* Opening up patch policy for the release does create the
  danger of destabilization. Of course this is not intended
  but the danger is there.

* I could agree to allow backports of isolated new code like vfs
  modules or command line tools.

* Can we start out with offering a loose collection of useful
  backport patches from OEMs and from elsewhere in a different
  place than a release?  It could be bugzilla or a webpage or
  whatever. No guarantees no release process...

I don't have my final conclusion yet.

Good night - Michael
