Returning the size of the file to Clients

Green, Paul Paul.Green at stratus.com
Fri Dec 20 14:45:01 GMT 2002


John E. Malmberg [mailto:wb8tyw at qsl.net] wrote:
> Samba makes calls on behalf of the client to return a file size.

Samba also makes POSIX stat() calls on its own behalf, only some of which
actually care about the file size (some of them are checking access, for
example).

> The problem for this on OpenVMS, is that some of the text file
> sizes include the record information.

I have the same problem on Stratus VOS. We natively implement COBOL file
types (sequential, relative, and fixed), each of which contains record length
information in addition to the data. When these files hold text, the records
typically do not include the trailing newline character.

> When these files are sent to the client they are converted to a
> byte stream format like UNIX uses.
> 
> But this results in a file that is a slightly different size than the
> physical size of the file, usually smaller.

Agreed.
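
To make the size mismatch concrete, here is a minimal sketch in C. It
assumes a purely hypothetical on-disk layout (a 2-byte little-endian length
followed by the record data, with no trailing newline); this is not the
real VOS or OpenVMS record format, and the name stream_size is invented for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only: each record is a 2-byte
 * little-endian length followed by that many data bytes, with no
 * trailing newline. This is NOT the real VOS or OpenVMS format. */
#define REC_HDR_LEN 2

/* Walk the record headers in buf and return the size the file would
 * have as a UNIX-style byte stream (data plus one '\n' per record).
 * Returns (size_t)-1 if the buffer ends in the middle of a record. */
size_t stream_size(const uint8_t *buf, size_t physical_size)
{
    size_t pos = 0, total = 0;

    while (pos < physical_size) {
        if (physical_size - pos < REC_HDR_LEN)
            return (size_t)-1;
        size_t len = (size_t)buf[pos] | ((size_t)buf[pos + 1] << 8);
        pos += REC_HDR_LEN;
        if (physical_size - pos < len)
            return (size_t)-1;
        pos += len;          /* skip the data without copying it */
        total += len + 1;    /* +1 for the newline added on conversion */
    }
    return total;
}
```

Under this assumed layout, two records holding "hello" and "hi" occupy 11
bytes on disk but only 9 bytes as a byte stream, which matches the
"usually smaller" observation above.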

> Only some applications, such as wordpad seem to be sensitive to
> this, as others use the amount of data transferred.  It has
> been reported that wordpad adds garbage bytes to the end of
> the buffer for the difference.

WordPad was my Achilles heel as well. I ran a trace and found that Samba
2.0.7 called stat() 48 times on a small test file when Windows NT4 was
listing a directory and opening a file with WordPad. I don't know how many
of these calls were due to Samba and how many were due to WordPad.

> The 2.2.4 port of Samba to OpenVMS solves this by reading the
> entire file in order to give the correct size.  This of
> course creates a big performance hit when displaying a directory.

I did the same thing in our implementation of (f)stat.  I was at least able
to use an OS call that merely returned the size of the next record, without
transferring the data, but it was still a big hit.  I contemplated creating
a "file size cache" in our POSIX runtime to at least reduce the number of
times we had to do this.  But in the end, we decided that this was a general
problem that the VOS kernel had to solve, so we taught the kernel to keep
track of the "POSIX" size of a file in parallel with the historic,
proprietary size.  And then we taught the POSIX runtime to use this value
when it was available.  Huge improvement.
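
The idea of tracking a "POSIX" size in parallel with the native size can be
sketched roughly as follows. This is only an illustration under stated
assumptions: the structure, field names, function names, and the 2-byte
record overhead are all hypothetical and are not taken from the VOS kernel
or its POSIX runtime.

```c
#include <stdbool.h>
#include <stddef.h>

#define REC_HDR_LEN 2   /* hypothetical per-record overhead in bytes */

/* Hypothetical per-file metadata; names are illustrative only. */
struct file_meta {
    size_t physical_size;    /* historic size: record headers plus data */
    size_t posix_size;       /* size of the same file as a byte stream */
    bool   posix_size_valid; /* false for files that were never tracked */
};

/* Update both sizes when a text record of data_len bytes is appended. */
void append_record(struct file_meta *m, size_t data_len)
{
    m->physical_size += REC_HDR_LEN + data_len;
    if (m->posix_size_valid)
        m->posix_size += data_len + 1;   /* data plus implied newline */
}

/* Fast path for an (f)stat implementation: store the tracked size in
 * *out and return true when it is available. When this returns false
 * the caller has to fall back to scanning every record. */
bool fast_size(const struct file_meta *m, size_t *out)
{
    if (!m->posix_size_valid)
        return false;
    *out = m->posix_size;
    return true;
}
```

The point of the design is that the cost moves from every stat() call to
each write, where the record length is already known, so a directory
listing no longer has to read any file contents.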

> Is there any way to differentiate for when the Client is opening a file
> for an application, and when a directory is being listed?

I never found one.  But I wasn't familiar with the SMB protocol, either, so
all I had to go on was the Samba source code.  For a while, I had patches to
Samba to call two different versions of (f)stat: one that would return the
estimated size, and one that would calculate and return the exact size. The
problem is that there are a lot of calls to stat within Samba, and as you
discovered, some of them come from the client at the other end of the wire,
and have to return the right value.  I couldn't really get my scheme to
work, and so gave it up and changed VOS.

> I am also going to look to see if there is a more optimal way to
> calculate the size of these text files.

Well, we didn't patent our method, so you are free to use it... :-).

Thanks
PG
--
Paul Green, Senior Technical Consultant, Stratus Technologies.
Voice: +1 978-461-7557; FAX: +1 978-461-3610; Video on request.



