[Fwd: Re: [PATCH] Fix Name mangling in HEAD]

Tue Mar 26 02:50:02 GMT 2002

Simo Sorce wrote:
> 
> On Tue, 2002-03-26 at 10:12, Andrew Bartlett wrote:
> > Let's try to get this to the list this time...
> >
> > -------- Original Message --------
> > Subject: Re: [PATCH] Fix Name mangling in HEAD
> > Date: Mon, 25 Mar 2002 15:47:27 -0800
> > From: abartlet at samba.org
> > To: Simo Sorce <idra at samba.org>
> > CC: Andrew Bartlett <abartlet at pcug.org.au>, samba-techincal at samba.org
> > References: <3C9EE2AD.76905AE7 at bartlett.house>
> > <1017053527.4161.12.camel at berserker> <3C9F8A60.A70D2DD1 at bartlett.house>
> > <1017098751.1813.25.camel at berserker>
> >
> > On Tue, Mar 26, 2002 at 12:25:50AM +0100, Simo Sorce wrote:
> > > On Mon, 2002-03-25 at 21:36, Andrew Bartlett wrote:
> > > > Simo Sorce wrote:
> > > > >
> > > > > Sorry Andrew, but reverting back to the 2.2 code is not the way to go in
> > > > > my opinion. I wrote that code to solve many problems and bugs that
> > > > > appeared in the 2.2 code. The way to go is what tridge and I discussed
> > > > > on IRC, e.g.:
> > > >
> > > > I'm increasingly of the opinion that the 2.2 approach is the only valid
> > > > way to do this.  We need an approach that scales with the size of the
> > > > server - and your mangling TDB *DOES NOT*.
> > >
> > > I know it currently *DOES NOT SCALE*. I have been aware of the problem
> > > since the beginning, and I made this implementation as a proof of concept
> > > to understand what the benefits and the problems could be.
> >
> > If something is a 'proof of concept', why is it the only option available?
> 
> Please Andrew, HEAD is an alpha version; if we do not experiment there,
> where should we? If I had provided an option (one was there in the first
> commits), who would have used it?

No, and I would have been much gladder for it.  This 'feature' has caused
me much pain over the last few weeks, which is why I have been forced to
remove it.

One of the things I really like about Samba development is that we don't
intentionally break things - particularly when we expect people to
use/test/debug it.

> > You really should have allowed a switch, particularly given the *known*
> > limitations.  The 2.2 implementation may not have been perfect, but it
> > generally works!
> 
> Of course, as 2.2 is the production release!

The problem is that some sites don't have that option - and I know of a
number of NAS vendors that are developing/shipping HEAD based code, due
to the need for AD compatibility.

> > > > The results are quite painful - and I know the code was completely
> > > > *untested* (It caused quite a few problems at my site - hence this and
> > > > previous patches).  Even moving to a larger hash presents *major* issues
> > > > in scalability - your mangling DB would have to be the same order of
> > > > magnitude as the whole filesystem's combined metadata!
> > >
> > > It was tested within the limits of my free time; we spotted some bugs and
> > > a hard limit.
> >
> > The problem is that you didn't address the hard limit - and people are
> > now attempting to use systems based on this.
> 
> We can always revert back. I'm not defending my code as it currently is,
> but the 2.2 code is not that good, especially if you want to use it on
> large directories on NAS appliances.

NAS appliances are where it proves to be the most unsuitable, and I
agree, the current code is indefensible.

> > > > Worse still, a mangling TDB does not reflect changes in the filesystem -
> > > > we keep stale entries around *for ever*.
> > >
> > > Yes, unless we put the code into the VFS. With the code there we will
> > > have hooks into the unlink call and can easily remove entries (of course
> > > we will not be able to do so for files deleted on the Unix side, but we
> > > can easily afford to create a tool, run by cron at midnight, that will
> > > walk the fs and clean the tdb).
> >
> > Won't work.  Multiple files in the filesystem have the same name and the
> > same mangling.  So you need a reference count - but that will be *really
> > dodgy* when you don't control the whole FS.
> 
> We will never have a perfect solution without fs support! But the
> use of a hash will bring us closer to it.
> 
> > > We _need_ the tdb, or every smbd reload/crash/shutdown (or cache limits)
> > > could easily end up changing the mangled name, and you can be sure
> > > Windows apps will not be happy about that!
> >
> > The mangled name doesn't change.  It's a hash, and the function remains
> > *constant*.  Because of the way we match filenames, even if it isn't in
> > the cache it works.  The only problem occurs when we have hash
> > collisions or when we are doing ops that don't allow us to check the
> > dir (like copy).
> 
> Yes, if you use ONLY the hash, but we can always use hash+counter as
> Windows 2000 already does (and this requires the tdb).

Why do we need 'hash + counter' BTW?  Because Windows does it?  If you
need the filename to look like a 'hash + counter' it would be much
better to simply use the 'counter' as part of the hash.  No reason it
has to start at 1.

The number of places and style of hash can be made smb.conf parameters
without much pain at all.
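
To make the idea concrete, here is a rough sketch (in Python, for brevity)
of a purely hash-based mangler whose output depends only on the long name,
so nothing has to be stored in a TDB and the result survives an smbd
restart.  This is NOT the actual Samba mangling code: the function name,
the use of CRC32, and the `digits`, `base` and `seed` knobs are all
assumptions made up for illustration, standing in for whatever smb.conf
parameters might be chosen.  The `seed` is the 'counter' folded into the
hash input rather than kept as separate state.

```python
import zlib

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def mangle(name, digits=5, base=36, seed=0):
    """Return an 8.3-style name derived only from `name` (hypothetical scheme)."""
    if not 2 <= base <= 36 or not 1 <= digits <= 6:
        raise ValueError("base must be 2..36 and digits 1..6")

    # Fold the seed (the would-be "counter") into the hash input, so the
    # function stays constant and needs no per-file stored state.
    h = zlib.crc32(f"{seed}:{name}".encode("utf-8")) % (base ** digits)

    # Render the hash as exactly `digits` characters in the chosen base.
    suffix = ""
    for _ in range(digits):
        h, r = divmod(h, base)
        suffix = ALPHABET[r] + suffix

    # Split off the extension at the last dot, if there is one.
    stem, dot, ext = name.rpartition(".")
    if not dot:
        stem, ext = name, ""

    # The 8-character base name is prefix + "~" + hash suffix.
    keep = 8 - 1 - digits
    prefix = "".join(c for c in stem.upper() if c in ALPHABET)[:keep] or "_"
    ext = "".join(c for c in ext.upper() if c in ALPHABET)[:3]
    return f"{prefix}~{suffix}" + (f".{ext}" if ext else "")
```

Calling mangle("a very long filename.html") twice, in the same or in
different processes, gives the same 8.3 name, which is the property argued
for above.  Collisions between different long names are still possible and
would have to be resolved by checking the directory, as discussed earlier
in the thread.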

Andrew Bartlett

-- 
Andrew Bartlett                                 abartlet at pcug.org.au
Manager, Authentication Subsystems, Samba Team  abartlet at samba.org
Student Network Administrator, Hawker College   abartlet at hawkerc.net
http://samba.org     http://build.samba.org     http://hawkerc.net