Samba improvements needed

Thu Sep 23 17:40:54 GMT 2004

In article <1095951537.3981.64.camel at bgpc.dymaxion.ca>,
  BG - Ben Armstrong <BArmstrong at dymaxion.ca> writes:
> On Tue, 2004-09-21 at 08:32 -0500, John E. Malmberg wrote:

>> And it is not preallocation that SAMBA is doing as noted below.
>
> Oh?  It sure looked like it ...

If you step through the file create sequence in the SMBD program, it does
look that way.

>> The client does tell the SAMBA server how big to make the file, but not
>> when VMS can do it efficiently.
>>
>> First it has the server open the file for write access, and then it uses
>> ftruncate() or other means to extend it.  Most of these move the highwater
>> mark of the file, unlike just allocating space.  And that means that the
>> now empty file must be totally written to disk.
>
> What's the distinction between this and preallocation?  Is it that the
> client does the file extending writes in small increments, whereas
> preallocation would do it all at once?

The issue is that the way VMS does it now incurs slightly more overhead
than should be needed.

First, the open/write creates an empty file.

Second, SAMBA requests that the empty file be extended to the size
the client says the final file will have.

This is done in one of three ways, and I am not sure which method
SAMBA 2.2.8 is using.  Method A moves the high-water mark and allocates the
space.  Method B writes an empty file of that size.  Method C ignores the
request.

Third, the data is written to the file.

If the open is delayed until there is actual data to write or the client has
specified the resulting size, Method A can be used.

The other methods may show a performance hit, but this should not be
reflected in the negotiated transfer protocol.
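
To make the difference between Method A and Method B concrete, here is a
minimal sketch in Python for a POSIX-style system.  It is not the SAMBA code
path and it says nothing about how VMS handles the high-water mark; the
function names are invented for illustration.  The first function sets the
file length with a single ftruncate() call, and the second physically writes
zeros until the file reaches the requested size.

```python
import os


def extend_by_truncate(path, size):
    """Loose analogue of Method A: set the file length in one call."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)   # no data is written by the caller
    finally:
        os.close(fd)


def extend_by_writing(path, size, chunk=64 * 1024):
    """Loose analogue of Method B: write `size` zero bytes to the file."""
    zeros = bytes(chunk)
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
```

Both calls leave a file of the same length that reads back as zeros.  What
differs is how much work the file system has to do to get there, and that
cost depends on the platform.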

> It does seem that even in the WinXP -> Linux case where all of the
> extending single-byte writes were done up front, there were way too many
> of them.  I could well imagine that this performs poorly on OpenVMS
> given my (admittedly limited) understanding now of the hit we take for
> each preallocation.  The worst run of them in my capture log started at
> offsets 1058815, 1059839, 1060863, etc. (i.e. 1024 byte increments) all
> of the way up to 2269183 before writing actual data blocks again, taking
> a total elapsed time of 0.94 sec.  If this strategy had been used on
> OpenVMS, I gather the elapsed time would have been much worse.

There may be a difference between a file transferred by a copy and one written
by an application doing an open/write.  Still, the amount of data transferred
with each packet is a function of the protocol, not the VMS file system tuning.
Samba has no read.
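
For a sense of scale, the figures quoted above from the capture log can be
turned into a rough per-write cost.  This is only arithmetic on the quoted
numbers.  It assumes the run hit every 1024-byte step between the first and
last offsets and that the 0.94 sec covers only those writes; the function name
is made up for the example.

```python
def extending_write_stats(first, last, step, elapsed_sec):
    """Return (number of writes, milliseconds per write) for one run."""
    writes = (last - first) // step + 1
    per_write_ms = elapsed_sec / writes * 1000
    return writes, per_write_ms


writes, per_write_ms = extending_write_stats(1058815, 2269183, 1024, 0.94)
print(writes, round(per_write_ms, 2))
```

Under those assumptions the run is 1183 single-byte writes at a little under
0.8 ms each on the Linux server.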

>> From looking at the current structure of the UNIX SAMBA code, it looks like
>> the way to improve performance is to write VFS modules specific to ODS-2
>> and ODS-5.  The ODS-5 module would not need any filename mangling.
>
> For the record, we use ODS-2, and switching to ODS-5 would be a major
> ordeal, as we have all of our clients to consider, not just our own
> systems.

I understand that ODS-2 will be in use for quite a while.  Supporting the
Pathworks naming convention does incur significant overhead.  Right now,
one of the major contributors to that overhead is simply determining
whether the disk is ODS-5 or ODS-2.  Getting that information requires a
disk hit.

Splitting them into separate VFS modules will give a performance boost
to directory listings for ODS-5 file systems that should be very noticeable.
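
The idea of paying for the ODS-2 / ODS-5 check once instead of on every
operation can be sketched as follows.  This is a toy model in Python, not
SAMBA's VFS interface: FakeDisk, Share, and their methods are all invented
for the example, and the "probe" just increments a counter where the real
check would be a disk hit.

```python
class FakeDisk:
    """Invented stand-in for a volume; counts how often it is queried."""

    def __init__(self, structure):
        self.structure = structure   # "ODS-2" or "ODS-5"
        self.probes = 0

    def probe_structure(self):
        self.probes += 1             # each call models one disk hit
        return self.structure


class Share:
    """Hypothetical share that picks its behavior once, at setup time."""

    def __init__(self, disk):
        self.disk = disk
        # Query the on-disk structure a single time.
        self.structure = disk.probe_structure()
        # Only the ODS-2 case needs filename mangling.
        self.needs_mangling = self.structure == "ODS-2"

    def list_directory(self, names):
        # No further probe here; the decision was made at setup.
        return [(name, self.needs_mangling) for name in names]
```

However many listings are requested afterwards, the probe counter stays at
one per share, which is the effect separate modules would have.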

> Not being very familiar with SMB or the Samba implementation of it, I'm
> not sure what specific implications that has for the tests I've been
> performing.  What do I look for in my packet capture logs to see the
> actual size of the file being communicated by the client to the server?

That information has expired from my memory cache, as it has been over three
years since I looked at it.  And I looked at the numbers after SAMBA had
extracted them from the packet, so I may never have actually seen it on the
wire.

And since you are using an application instead of a file copy, I do not know
if the application actually knows the total size of the resulting file to
send.

> But really, what I'm after for now is anything that might help *without*
> coding changes, if that is at all possible.

If Richard Sharpe can find something that can be changed in the smb.conf, then
it is possible.  Otherwise, I suspect that a code change will be needed.

Myself, I am still just getting back on this bicycle, and do not yet have
my VMS 8.2 minor enhancements working.

-John
wb8tyw at qsl.network
Personal Opinion Only