[Samba] Re: Looking for a set of definitive answers (long)

Avery Payne apayne at pcfruit.com
Thu May 22 19:48:14 GMT 2008


On Wed, 21 May 2008 15:31:48 -0500, John H Terpstra wrote:

> Avery,
> 
> OK - I'll respond too.  I see Jeremy has beaten me to it.
> 
> Let me tell you up front, if you want the documentation to be improved
> the best thing you can do is contribute changes and updates.  Making us
> aware of documentation problems is a good start, but please take this a
> step further - send us your updates and changes.

More than happy to, as soon as I can find the time. :)

> 
> One other thing, before I get too far into answering or commenting, is this:
> The Official Samba3 HOWTO and Reference Guide (TOSHARG) is a document
> (book) that sets out how specific parts of Samba function.  It was never
> intended to provide a working template or a scripted recipe.

Understood.  I am using it as a tech reference.

> I did write the Samba3-ByExample book with the specific objective to
> provide detailed step-by-step, fully worked, examples of real working
> networks, did you consult that document at any time?

I didn't even know it existed.  The majority of web queries resulted in 
the online version of TOSHARG being displayed.  Thanks for pointing that 
out, I'll look for it.

> Are you offering to improve its value and utility by contributing your 
> experiences and recommendations?

Yes, as time permits.

> 
> On Wednesday 21 May 2008 01:47:52 pm Avery Payne wrote:
>> Question:
>>
>> The goal was to create a file server that had excellent performance
>> while providing Volume Management, but we felt that something like
>> Veritas was overkill for our needs.
> 
> A noble goal that can be achieved.

I think we're 99.9% there, it's that 0.1% that's holding up the works.  
Overall, everyone is pleased.

> [lots 'o stuff snipped] 
>> The proposed solution was a Samba file server running on a pair of
>> redundant servers, with one connected to an eSATA raid box, with LVM
>> and Ext3 providing volume management and journaling.
> 
> I would not architect the solution this way.  There are way too many
> pitfalls with this solution.  You have identified one already - the SID
> <=> UID/GID mapping challenge.

The solution is that there are nightly backups of all the tdb's to a 
known LVM volume.  The idea is that in the event of manual failover, the 
volume would be mounted, the tdb's copied into place, some minor settings 
changed, and the service started.

Originally I was aiming for a "clustered" approach but it appears that 
the software (both the OS and Samba) was not ready for this - yet Samba 
4 may still surprise me. :)

> 
> I would have used a RAID5 array in each server with rsync to synchronize
> from the master to the slave.

There is no master-slave, the other machine is a cold-standby solution.  
The RAID 10 array contains 16 drives on an eSATA box that has redundant 
power, redundant connects, etc.  A manual failover was chosen by 
management due to cost and software constraints.  The downtime involved 
was deemed acceptable - 5 to 10 minutes.  Downtime exceeding 15 minutes 
however would start creeping costs into the red.

> 
>> Our transition was a
>> bit rough, but in the end it has been very stable and fast.  We have
>> been really pleased with the performance of the hardware/software
>> combo, seeing sustained throughput of about 250Mbyte/sec with peaks as
>> high as 300Mbyte/sec.  But along the way, we encountered some oddities,
>> and I have some remaining questions.
> 
> What lab work did you do in a test environment before rolling this live?
> Proper pre-rollout evaluation can save a lot of head-banging later.

3 months.  This is an epic story for another time. :)

> 
>> - File permissions do not behave as expected (from the viewpoint of
>> other staff working with the server).
>>
>> [snip]
> 
> Samba is an engine that sits on top of a host OS. That host OS is NOT
> Windows. Samba has to go along with the rules imposed by the host OS. 
> The TOSHARG chapter on "File, Directory, and Share Access Controls"
> should be the red flag that underlying file system semantics are exerted
> by Samba.  Windows admins need to be trained to understand that Samba is
> not Windows NT/2Kx, etc.

Point appreciated.  As a Linux admin (since '98) and a Windows NT admin 
(since '97), I can appreciate the semantic differences between the two, 
and the efforts involved by the Samba devs to make things "work".  I did 
read those sections (repeatedly).  Sometimes it's easy to miss things 
when the world is at your door screaming for blood - especially when it's 
your blood.

As for the admin training side, my co-worker is an MCSE coming from 20 
years of VAX/PDP experience, and the department head (my direct boss) is 
an OS X fan who uses 10.5 on a daily basis.  Which makes for interesting 
conversations between us all. :)

> 
> Jeremy's notes about the VFS modular work aimed at providing better
> Windows ACLs emulation may provide the solution you are looking for.

I have read about some of the VFS features and will be using the audit 
feature in the near future.  However, some of what I face may be a 
versioning issue due to the vendor (or not).
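
A minimal sketch of what enabling auditing on one share could look like, assuming the full_audit VFS module is the audit feature meant here (the thread does not say which audit module). The share name and path are hypothetical placeholders, and the operation names accepted by full_audit differ between Samba versions, so check the vfs_full_audit man page for the installed release.

```ini
; Hypothetical share; name and path are placeholders, not from this thread.
[projects]
    path = /srv/samba/projects
    vfs objects = full_audit
    ; Tag each log line with the user name and client IP address.
    full_audit:prefix = %u|%I
    ; Log successful opens, deletes and renames; do not log failures.
    ; Operation names here are the Samba 3 ones and may differ on other versions.
    full_audit:success = open unlink rename
    full_audit:failure = none
    ; Send the records to syslog under this facility and priority.
    full_audit:facility = LOCAL7
    full_audit:priority = NOTICE
```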

> 
>> After explaining that there
>> would always be three settings no matter what, that they could never be
>> deleted, and that they represented actual filesystem-level bits that
>> wouldn't go away, it was accepted.  I didn't notice if this was in the
>> docs or not, but I certainly didn't find it.
> 
> It would help me to understand your problem if you can point out how you
> went about searching for answers.  What questions did you frame mentally
> in your search?

Q: Why can't I delete these entries inside of the directory/file property 
pane?  They keep reappearing no matter what I do. (the question phrased 
as given to me by both my coworker and my boss) 

The issue is that the frame of reference - "this is how we set file 
permissions and this is the expected behavior of the vendor-provided GUI 
tool" - is somewhat different from the material that is being presented.  
I have had to do a lot of verbal footwork in translating things.  Perhaps 
I can help here with some new material...

> Where and how did you look?

"The Official Samba-3 HOWTO and Reference Guide", Second Edition, 
(c) 2006 John H. Terpstra, printed by Prentice-Hall, Professional 
Technical Reference.

The quickest solution to the issue was found with deductive reasoning 
coupled with empirical experimentation.  Good old-fashioned 'sit down and 
work it out'. :)

> Did you use a hard-copy of the book?

Yes.  See above book reference.

> Search online in the HTML web
> pages?

Yes, both through site searches at samba.org and through google.  
Unfortunately, they ended up being web versions of the book.

> Or did you download the PDF of the book and use the hotlinked
> pages in the subject and topic indexes?

Yes.

> 
>> [snip]
>>  We've found that this method has the closest behavior to a
>> "real" Windows server and has satisfied everyone.
> 
> Please write this up in a step-by-step form that can be added to one of
> the books.

More than happy to, as time permits. :)

> 
> 
>> - Permissions don't propagate through the filesystem.
>>
>> [dumb issue that I could have avoided snipped]
> 
> You can also add to the smb.conf share definition stanza the following
> parameters that are documented in the smb.conf man page:
> 
> 	inherit owner
> 	inherit acls
> 	inherit permissions

Thanks!  The reason they were not used initially is that the initial 
permissions model didn't account for that (because I didn't know the 
settings were there), and on top of that, it changed during the 
transition.  Some of the filesystem is using the original model - where 
domain groups are mapped to a GID on the linux box and the directory/
files are then updated with that GID - while other directories are 
following the ACL approach.  Throw in some complete confusion by the 
vendor's muddled guidance, and it gets messy.  As time permits, 
the entire filesystem will be migrated to the ACL approach.
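
A minimal sketch of a share stanza using the three parameters quoted above. The share name and path are hypothetical placeholders; the parameter names and yes/no values are as documented in the smb.conf man page. For "inherit acls" to have any effect, the underlying filesystem has to support POSIX ACLs and the parent directories need default ACLs set.

```ini
; Hypothetical share; name and path are placeholders, not from this thread.
[projects]
    path = /srv/samba/projects
    ; New files and directories take the owner of the parent directory.
    inherit owner = yes
    ; Default ACLs on the parent directory are honored for new files and subdirectories.
    inherit acls = yes
    ; New directories take the parent's mode; new files take its read/write bits.
    inherit permissions = yes
```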

> 
> Did you consider these?

No (see above).  It will be remedied.  Lesson: when in doubt, stick with 
your gut, because the vendor usually doesn't get it right on the first 
try.

> If so, what problems did they cause you?

N/A (see above)

> 
>> This has also
>> caused a bit of grousing as we have several nested directories with a
>> hierarchy of permissions; getting one parent directory wrong means
>> rebuilding permissions for several child directories as well.  I have
>> never been able to get a satisfactory answer as to how to resolve this
>> issue, other than the process I described above (which I had to resolve
>> for myself without documentation).
> 
> OK - I understand the problem.  What did you do to bring about better
> documentation?

I repeatedly hit myself on the head with a baseball bat. Just kidding.

The documentation was created in-house at my employer's expense.  I will 
attempt to re-write the docs I created on my own time and provide them to 
you later.

>  Did you consider contributing some guidance
> documentation and then circulating it to get positive feedback from other
> Samba users?

Yes, but the time factor involved was prohibitive.  Hard to do the right 
thing when you're holding off end-users with a pointy stick.  See my 
other post to Jeremy re: users screaming for blood.  That, and it's hard 
to think straight after working a 38 hour shift with no sleep, no food, 
no breaks, and no rest.

> 
> Better documentation is always welcome.  Contributions are continually
> sought.

More on that later.  I have some ideas that I think would be worthwhile, 
and I'm willing to type them up.

>> - To oplock or not to oplock: that is the question
>>
>> The documentation is not entirely clear about when you should and
>> shouldn't use oplocks on shared files. [frustrations snipped]
> 
> I have not seen a single installation where DBF files and MDB file
> sharing works acceptably with more than 3-5 concurrent users.

In our prior version of the software we had 50 workstations, of which 
half were concurrently active, updating/reading DBF files every day.  
Locking is kept to a minimum, which contributes to a very low latency.  
The general "gut feel" we have here is that the limit of our system is 
probably around 50 concurrent users; on busier days with more than 30 
workstations active, the system started to visibly slow, although 
performance degraded gracefully.  The setup (at the time) was a Win2k 
Server with 50 Win2k Workstations on switched 100BaseT, with gigabit 
going into the server.

> 
> Best advice is do not use file sharing for multiple concurrent access
> database files.  Instead, use a SQL backend database.

The ERP package in use allows us to do so for a very prohibitive price.  
It would actually be cheaper to employ me for an entire year to develop 
an object wrapper to intercept all calls to the DB API and send them to a 
PostgreSQL or MySQL backend, working full time on this project and doing 
none of my other duties (networking, sysadmin, security, email services, 
data warehouse, end-user support, customization of the accounting 
package, etc.), while struggling to emulate all of the features with only 
the API interface as a document to guide me (there is no source 
available).  Of course, that isn't going to happen. :( 

> 
> 
>> - Office file locking workaround(s) were not immediately obvious
>>
>> [snip]
> 
> Where in the book would you prefer to see this documented? What changes
> would you make to the documentation to make this more prominent and more
> readily capable of being found?

The largest issue encountered revolved around locking semantics and how 
the system would behave.  This I will fully address later when I write up 
my docs and send them to you.

>> - What?  You want me to unlock that file?
>>
>> We have had recurring instances where a workstation on the network has
>> seized a DBF file and held onto it, not allowing any other workstation
>> or server to perform writes to the file.  This locking issue shows up
>> in random intervals and always requires that we have the person quit
>> the program we are using and log back in.  It is not an application
>> issue that we can determine - the rest of the system continues to
>> function, it just prevents one of our servers (or anyone else) from
>> locking the file.
> 
> This is a client-side caching issue. Samba does not know that the file
> has been released until the client notifies Samba that it has released
> the file. Windows clients can go down without ever releasing the file. 
> There are Microsoft KB articles on how to disable client-side caching. 
> Should this be more vigorously documented in the HOWTO?

Yes, although I would prefer the term "slightly expanded article". :)

> If so, where should it be documented in TOSHARGs?

The viewpoint comes from a small-to-medium sized LAN installation, using 
shared-file databases (dBase, FoxPro, Paradox, Access).  This may not be 
very common in larger installations, but I can attest (from personal 
experience) that it is frequently used in smaller setups.

I think the real issue is that there is a juncture between file locking 
and file permissions that is critical to having those environments work 
correctly.  A small paragraph about shared-file databases would probably 
go a long way towards providing clarity to the implementor as to how 
things will work, etc.  Again, I'll try to get something to you later.
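
On the server side, a common companion to the client registry changes is to turn oplocks off for the share that holds the shared-file databases. A minimal sketch, assuming a hypothetical share named erpdata with a placeholder path; whether this is appropriate depends on what else lives in the share, since it disables client caching for every file in it.

```ini
; Hypothetical share holding the DBF/MDB files; name and path are placeholders.
[erpdata]
    path = /srv/samba/erpdata
    ; Do not grant exclusive oplocks, so clients cannot cache writes locally.
    oplocks = no
    ; Do not grant read-only (level II) oplocks either.
    level2 oplocks = no
```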

>> - Speaking of which - just WHO does have that file lock?
>>
>> [snip]
> 
> What tool are you using to explore this?

Computer Management Snap-In.

> 
> 
>> - You sure you still have that file open?  It says you don't even have
>> it!
>>
>> The computer management tool also has an issue with data appearing to
>> be stale.  Workstations that have been powered down still show a file
>> open.
> 
> See comment to previous question about this.
> 
>> Or in some cases, the workstation is working with the file, but no file
>> handle appears in the tool! [snip]
> 
> This sounds like a bug.  How can we reproduce this?

See my posting to Jeremy on this.  I'll try to develop a specific test, 
but the description I gave is the best I can do for now until I can 
categorize the bug further.

> 
> 
>> Now, the remaining question(s):
>>
>> - The vendor initially set up our authentication via tdb files and
>> Winbind.  We have been using this combination successfully for some
>> time, but in the Official Samba Guide it talks about regular
>> maintenance of the tdb files via tdbbackup.
> 
> Unless I am sadly mistaken, the TOSHARG docs are correct.  You really
> should use tdbbackup on a regular basis in every large installation.

Jeremy has confirmed this.  Thanks for the 2nd confirmation. :) And I am 
backing up all tdb's daily to an LVM volume on the RAID array, so when 
the 2nd system goes live, I can simply restore them with tdbbackup.



> 
>> [snip]
> You need to set your client
> registry correctly to stop client-side caching for MDB and DBF files.  I
> do believe that is documented somewhere in TOSHARG.

I am using both global and share-level "veto oplock files" settings, with a 
very detailed match string. :)
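
The smb.conf parameter behind this is spelled "veto oplock files". A minimal sketch of such a match string, assuming only DBF and MDB extensions matter; the actual match string in use is not shown in this thread, so the pattern list below is purely illustrative.

```ini
[global]
    ; Slash-separated wildcard patterns; matching files are never granted oplocks.
    ; Both cases are listed so the match does not depend on case-sensitivity settings.
    veto oplock files = /*.dbf/*.DBF/*.mdb/*.MDB/
```

The same line can be repeated inside an individual share stanza to override the global list for that share.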

> 
> 
>> A parting thought as well:
>>
>> It would have been nice to have had a reasonably generic template or
>> example somewhere that pointed this out [snip]
> 
> Where would you search for this information? Which chapter? Which book?
> How should it be documented?  Are you willing to write this up in a
> usable form?

1) it should be part of a section on implementing replacement file 
services for Win2k(3) boxen.
2) undetermined.  I will have to look at that.  I am already almost over 
my time limits on this reply today (forgoing lunch to reply).
3) again, undetermined.
4) Yes.


> Everyone can be mistaken; fortunately not everyone is always mistaken.

:(

Part of my job is to pull magic rabbits out of the hat.  I fear the day 
when I reach in the hat and come up empty-handed.



