Samba improvements needed (was: WinXP<->Linux samba server test)

Thu Sep 23 14:58:57 GMT 2004

On Tue, 2004-09-21 at 08:32 -0500, John E. Malmberg wrote:
> > Tcpdump, which is available for TCPIP 5.4, can read it.  For example:
> >
> > $ tcpdump=="$sys$system:tcpip$tcpdump"
> > $ tcpdump -vr test.cap
> 
> I do not have a LINUX system running at this time.  Three years in my
> new house, and I have not had time to organize that section of the basement
> to get one of the now ancient 100 MHz/90 MHz systems wired up to the LAN,
> let alone running LINUX.

Well, in the meantime, do try the OpenVMS implementation of tcpdump, if
you have TCPIP 5.4.

> It may be a difference between Samba 2.2.8 and later versions.

A quick way of setting up a Samba 2.2.8 server would be to find a "live
CD" version of Linux with Samba 2.2.8 on it.  There are plenty of live
CD Linux ISOs out there to burn.  I favour Knoppix, but that may be too
recent, since I believe they start with Debian/unstable & then tailor it
with the Knoppix-specific stuff.

> And it is not preallocation that SAMBA is doing as noted below.

Oh?  It sure looked like it ...

> The client does tell the SAMBA server how big to make the file, but not
> when VMS can do it efficiently.
> 
> First it has the server open the file for write access, and then it uses
> ftruncate() or other means to extend it.  Most of these move the highwater
> mark of the file, unlike just allocating space.  And that means that the
> now empty file must be totally written to disk.

What's the distinction between this and preallocation?  Is it that the
client does the file extending writes in small increments, whereas
preallocation would do it all at once?
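
To make the question concrete, here is a minimal Python sketch of the two
ways a file can be grown to a known final size.  It runs on an ordinary
POSIX-style system, not OpenVMS, and the function names are made up for
illustration.  It only shows that both approaches end up with the same file
length while issuing very different numbers of requests; it says nothing
about how ODS-2 treats the highwater mark in either case.

```python
import os
import tempfile


def extend_at_once(path, size):
    """Set the final length with a single ftruncate()-style request."""
    with open(path, "wb") as f:
        f.truncate(size)


def extend_in_steps(path, size, step=1024):
    """Grow the file by writing one byte at the end of each step.

    Returns the number of write requests that were needed.
    """
    calls = 0
    with open(path, "wb") as f:
        for end in range(step, size + 1, step):
            f.seek(end - 1)      # last byte of this step
            f.write(b"\0")       # single-byte extending write
            calls += 1
        if size % step:          # final partial step, if any
            f.seek(size - 1)
            f.write(b"\0")
            calls += 1
    return calls


def demo(size=8192, step=1024):
    """Return (size after one request, size after stepping, request count)."""
    with tempfile.TemporaryDirectory() as d:
        once = os.path.join(d, "once.bin")
        steps = os.path.join(d, "steps.bin")
        extend_at_once(once, size)
        calls = extend_in_steps(steps, size, step)
        return os.path.getsize(once), os.path.getsize(steps), calls
```

With the defaults, both files come out at 8192 bytes, but the stepped
version needs 8 requests where the other needs 1.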

It does seem that even in the WinXP -> Linux case where all of the
extending single-byte writes were done up front, there were way too many
of them.  I could well imagine that this performs poorly on OpenVMS
given my (admittedly limited) understanding now of the hit we take for
each preallocation.  The worst run of them in my capture log started at
offsets 1058815, 1059839, 1060863, etc. (i.e. 1024 byte increments) all
of the way up to 2269183 before writing actual data blocks again, taking
a total elapsed time of 0.94 sec.  If this strategy had been used on
OpenVMS, I gather the elapsed time would have been much worse.
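
For reference, the arithmetic behind that run can be checked with a few
lines of Python.  The offsets, the increment and the elapsed time are the
ones quoted above; the function name is just for illustration.

```python
def summarize_run(first, last, step, elapsed):
    """Return (number of writes, average seconds per write) for a run of
    evenly spaced single-byte writes from offset first to offset last."""
    count = (last - first) // step + 1
    return count, elapsed / count


count, per_write = summarize_run(1058815, 2269183, 1024, 0.94)
print(count, "writes, about", round(per_write * 1000, 2), "ms each")
```

That works out to 1183 single-byte writes at a little under 0.8 ms apiece,
each landing on the last byte of a 1024-byte block.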

> From looking at the current structure of the UNIX SAMBA code, it looks like
> the way to improve performance is to write VFS modules specific to ODS-2
> and ODS-5.  The ODS-5 module would not need any filename mangling.

For the record, we use ODS-2, and switching to ODS-5 would be a major
ordeal, as we have all of our clients to consider, not just our own
systems.

> In that module, when an open with write access is done to create a new
> file, the VFS would delay actually opening the file until either the first
> data is actually written, or the client (as per usual observed practice)
> sends down the actual size of the file.

Not being very familiar with SMB or the Samba implementation of it, I'm
not sure what specific implications that has for the tests I've been
performing.  What do I look for in my packet capture logs to see the
actual size of the file being communicated by the client to the server?
Does that correspond to the 1-byte writes?  And if that's the case, how
does the server know that write is any different from the actual data
write of 1024 bytes?
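
As a way of sorting through a capture log, one heuristic is sketched below
in Python.  It is purely an assumption for reading the logs, not a
description of how Samba or any server actually decides: a write is
labelled as "extend" if it is exactly one byte long and lands beyond the
highest offset written so far, and as "data" otherwise.  The input is a
hypothetical list of (offset, length) pairs pulled out of the capture by
hand or by script.

```python
def classify_writes(writes):
    """Label each (offset, length) write as 'extend' or 'data'.

    Heuristic only: a 1-byte write past the current end of file is
    assumed to be an extending write rather than real data.
    """
    eof = 0                      # highest offset written so far
    labels = []
    for offset, length in writes:
        if length == 1 and offset > eof:
            labels.append("extend")
        else:
            labels.append("data")
        eof = max(eof, offset + length)
    return labels


# Hypothetical sequence: one data block, two extending writes,
# then a data block filling in behind them.
sample = [(0, 1024), (2047, 1), (3071, 1), (1024, 1024)]
labels = classify_writes(sample)
```

A one-byte write that happens to be genuine data at the very end of a file
would be mislabelled by this, so it is only useful for spotting long runs
like the one described above.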

> More buffering may also help.  The VFS approach may allow more efficient
> application cache management and tuning.

I'll leave that up to you expert filesystem guys to look into.

But really, what I'm after for now is anything that might help *without*
coding changes, if that is at all possible.

Thanks for all of your help so far.

Ben