Moving 8.3 filenames into VFS - WAS Re: meeting with SUGJ

Tue Jul 24 21:45:19 GMT 2001

On Thu, 19 Jul 2001, Christopher R. Hertel wrote:
<snip>

> I have seen some vendors write their own filesystems.  Other vendors have
> or are considering writing OS-level VFS filesystem modules on top of
> existing filesystems.  I believe that Linux lets you build a virtual
> filesystem on top of ext2fs, for example, thus saving a lot of time.  The
> BSD flavors also make this quite easy.  In these cases, you would mount
> the virtual filesystem rather than the real filesystem that the OS-VFS
> layer uses. 
> 
> My point was that the name mangling code in Samba should be in the default
> Samba-VFS layer code so that it can be easily replaced with a different
> Samba-VFS layer that would deal with OS filesystems that can do their own
> mangle-management. 
> 
> Does that make any more sense?

A bit.  It looks like a good place for me to put operating system
specific / file system specific code.

The emulation of the DOS mode bits should also be in that layer.  Instead
of mapping them to the UNIX mode bits, a file system that supports ACLs
can better handle their emulation.
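
For reference, the mapping onto UNIX mode bits that such a layer would
replace looks roughly like the sketch below.  It assumes the conventional
scheme (no owner write bit means read-only; owner, group and other execute
bits stand in for archive, system and hidden); the function name is made up
for illustration and is not Samba's.

```python
import stat

# DOS attribute flags (values as stored in FAT directory entries).
DOS_READONLY = 0x01
DOS_HIDDEN = 0x02
DOS_SYSTEM = 0x04
DOS_ARCHIVE = 0x20


def dos_attrs_from_mode(mode):
    """Derive DOS attribute bits from a UNIX permission mode (hypothetical helper)."""
    attrs = 0
    if not mode & stat.S_IWUSR:   # owner cannot write -> read-only
        attrs |= DOS_READONLY
    if mode & stat.S_IXUSR:       # owner execute stands in for archive
        attrs |= DOS_ARCHIVE
    if mode & stat.S_IXGRP:       # group execute stands in for system
        attrs |= DOS_SYSTEM
    if mode & stat.S_IXOTH:       # other execute stands in for hidden
        attrs |= DOS_HIDDEN
    return attrs
```

A file system with real ACLs or attribute storage could answer the same
question without borrowing the execute bits.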

The SAMBA VFS layer though can be much more than just wrapper calls to the
underlying VFS.

For me, it would be nice to know how big the resulting file is going to be
at the time the creat() call is passed.

If SAMBA knows that from negotiating the file transfer, it could pass that
information along to the VFS creat() wrapper.  Those file systems
that can preallocate more efficiently than they can extend a file would
get a performance boost.
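
As a rough sketch of what such a wrapper might do, here is the idea in
Python.  The size_hint argument and the function name are hypothetical;
posix_fallocate() is used only as one example of a preallocation call, and
the wrapper quietly carries on where it is missing or refused.

```python
import os


def creat_with_size_hint(path, size_hint=None, mode=0o644):
    """Create or truncate a file, preallocating space if the size is known.

    size_hint is a hypothetical extra argument; plain creat() has no such
    parameter.  Returns an open file descriptor.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    if size_hint and hasattr(os, "posix_fallocate"):
        try:
            # Ask the file system to reserve the space up front.
            os.posix_fallocate(fd, 0, size_hint)
        except OSError:
            # Preallocation is only an optimisation; continue without it.
            pass
    return fd
```

A caller that does not know the final size just leaves size_hint out and
gets the ordinary creat() behaviour.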

But name mangling is a special pain if you do not have a filesystem that
supports it.  The 8.3 name must uniquely identify a file in a directory.
So unless some special method is found, all files in the directory need to
be searched, once for a real 8.3 name, and then for conflict of the
mangled name, before the mangled 8.3 name can be persistently associated
with the real name as an older program would expect.

For the 8.3 names, the simplest scheme I can think of to avoid conflicts
or stale matches is ugly to look at.  It would involve encoding the inode
number in base 48 and replacing the last few characters of the filename
with the resulting string, plus a prefix character like ~ to flag it.
When the VFS sees that pattern, it knows to convert it back to an inode.

Since I have a 48 bit inode, base 48 allows me an estimated 7 characters
and a flag character.  When I bother doing the math for real, I may find
I can get by with only 6 characters.  If I can find more characters that
are legal in DOS filenames to encode with, it would also help.  (The
characters do not have to be legal in the host operating system, just in
MS-DOS filenames.)

A 32 bit inode should fit in 5 characters plus a flag character.  A 64 bit
inode may have a problem keeping the 8.3 name guaranteed unique.  Too bad
MS-DOS is not case sensitive on filenames; that would give me 26 more
characters to play with.  (I do not know if it is safe to play with the
8 bit characters.  Someone might have to enter the 8.3 name from the
keyboard.)
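
Doing the math for real is a few lines.  The sketch below counts how many
base 48 digits are needed to hold every possible inode value of a given
width; the function name is made up for illustration.

```python
def digits_needed(bits, base=48):
    """Smallest number of base-`base` digits that can hold any `bits`-bit value."""
    limit = 1 << bits          # number of distinct inode values
    digits, capacity = 0, 1
    while capacity < limit:
        capacity *= base
        digits += 1
    return digits


for bits in (32, 48, 64):
    print(bits, digits_needed(bits))
```

By this count a 32 bit inode needs 6 digits, a 48 bit inode 9, and a 64 bit
inode 12, each before the flag character is added.  That is more than the
rough estimates above, so the three extension characters would probably
have to carry part of the encoding, and 64 bits does not fit in the 11
characters of an 8.3 name at all.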

This eliminates having to search potentially the entire directory when
doing name mangling.  I have reports of some people attempting to serve
thousands of files from a single directory.  If the first 7 characters are
not sufficiently unique, generating mangled names on the fly would be
resource intensive.  Coordinating them in a separate file or database
could also be resource intensive.

It also means that given a mangled name, you can always find the real
filename.  If the file has moved to a different directory, then the access
can be treated as a file not found.
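
Once the mangled name has been decoded to an inode number, the lookup
could be sketched as below.  A file system that can open by inode directly
would not need the loop; the scan is only a stand-in for that, since a
generic POSIX system has no such call, and the function name is made up
for illustration.

```python
import os


def find_by_inode(directory, inode):
    """Return the name in `directory` whose inode matches, or None."""
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.stat(follow_symlinks=False).st_ino == inode:
                return entry.name
    return None  # not in this directory: treat as file not found
```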

Since a mangled name on a write should only result from an older utility
that cannot handle the long filenames, it may be acceptable to just let
mangled names that do not map to an existing file die.

> Imagine writing an OS-level VFS that supports ACLs, long names, and 8.3 
> names.  You would then mount that VFS layer when the OS boots.  In 
> addition, you would write a Samba-VFS layer that knows how to talk to 
> your OS VFS.  Samba would not do any name mangling, because the 
> underlying FS would do it for you.

Of course there is the real file system path to the file and the VFS path
to the file.  Any accesses that did not go through the VFS would have the
potential to invalidate the attributes set by the accesses through the
VFS.

-John
wb8tyw at qsl.network
Personal Opinion Only.