UFS Directory speed (was: Samba Speed Question)
David.Collier-Brown at canada.sun.com
Tue Oct 10 12:34:59 GMT 2000
Dave Dezinski wrote:
> What still doesn't make sense to me is that I saw no noticeable
> improvement in speed using the ReiserFS with Samba vs ext2,
> but did notice a difference in speed when cp'ing the files in the
> ReiserFS vs ext2 filesystem.
Hmmn: let's expand on that.
I'd expect speed improvements when copying and creating
files, and I'd expect them to show up when the client is
copying a large number of (small) files.
I wouldn't expect to see much speed difference in
everyday use **unless** the load on the disk is high,
at which time the metadata-writing cost of ufs starts
eating the available bandwidth and dragging the system
towards a halt (;-))
> I guess without having a filesystem that does some sort of directory
> entry caching, there is no other way around this.
I admit to being puzzled by the slowness of this: all
the Unixes cache directory blocks and inodes for the
directories they read, so a re-read should actually be
pretty fast! Yet it's unimpressive.
> I really wish there
> was a better way to handle this, I understand that having 10000 or
> more files in a directory is not that efficient, but since it wasn't a
> problem in NT or in the previous Netware servers we had I couldn't
> see why it would be a problem here.
The common case with NT and Netware is that
searching a directory takes only one or two reads
of the b-tree indices, plus one read of the final
location to see if the file's there. This means
that creation doesn't require a linear search of
the whole directory just to ensure the file
isn't already there. This is such a common
operation that those filesystems gain speed
**disproportionately** from it.
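To put rough numbers on that, here is a toy model (not
real filesystem code) that counts how many directory
entries get examined while creating n files. The function
names are made up for the illustration, a Python list
stands in for a linear ufs-style directory, and a set
stands in for the index. A set is a hash rather than a
b-tree, but the point about avoiding the full scan is the
same.

```python
def create_linear(directory, name):
    """Create `name` in a list-based directory.

    Returns the number of existing entries examined to prove
    the name is not already present (the linear search).
    """
    examined = 0
    for entry in directory:
        examined += 1
        if entry == name:
            return examined  # already exists, nothing created
    directory.append(name)
    return examined


def create_indexed(index, name):
    """Create `name` in a set-based directory.

    Modelled as a single lookup regardless of directory size.
    """
    if name not in index:
        index.add(name)
    return 1


def total_cost_linear(n):
    """Total entries examined while creating n distinct files."""
    directory = []
    return sum(create_linear(directory, "file%05d" % i) for i in range(n))


def total_cost_indexed(n):
    """Total lookups while creating n distinct files."""
    index = set()
    return sum(create_indexed(index, "file%05d" % i) for i in range(n))
```

In this model, creating n files in an initially empty
linear directory examines n*(n-1)/2 entries in total, so
10000 files cost about 50 million entry comparisons against
10000 indexed lookups. Real directories are cached in
blocks, so the actual ratio will be much less dramatic,
but the growth pattern is the one described above.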
It isn't perfect, as I said:
> > NT will run out of speed on very deep b-tree
> > structures, which fortunately are rare, although
> > programs generating filenames sometimes stumble into
> > the "bad" part of the namespace. So would hashing
> > filesystems, if anyone built them.
Alas, I'm not a good enough hacker to build a
Solaris kernel with hashed directories, so
I can't give anything but guesses about how
much improvement we would actually see. I'm
a performance guy, as it happens, and I lack
the data to estimate credibly. In-credibly I'd
estimate another 30% or so (;-))
> Thanks for the reply, looks like I'll have to look into changing some
> of our applications to handle the splitting up of these files into multiple
> directories if Samba is the way I'm going. :-)
Ok, send me a description of the naming system
in email and I'll see if I can suggest some rules
for the list(s) and some code templates to make
it reasonably easy...
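
As a rough illustration only (the real rules will depend
on the naming system), one common way to split a big
directory is to bucket files by a hash of the name. The
function names `bucket_for` and `bucketed_path`, the CRC32
hash, and the 256-bucket layout are all assumptions made
for this sketch.

```python
import os
import zlib


def bucket_for(filename, buckets=256):
    """Return a two-hex-digit subdirectory name for `filename`.

    Uses CRC32 of the UTF-8 name modulo `buckets` (at most 256,
    so the result always fits in two hex digits).
    """
    if not 1 <= buckets <= 256:
        raise ValueError("buckets must be between 1 and 256")
    return "%02x" % (zlib.crc32(filename.encode("utf-8")) % buckets)


def bucketed_path(root, filename, buckets=256):
    """Return root/<bucket>/filename without touching the disk."""
    return os.path.join(root, bucket_for(filename, buckets), filename)
```

With 256 buckets, 10000 files come out at about 40 per
subdirectory on average if the names hash evenly, which
keeps each linear search short. The application has to
compute the same bucket on every open, so the client sees
paths like root/xx/name rather than root/name.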
--dave (the latter task is part of my real job) c-b
David Collier-Brown, | Always do right. This will gratify some people
185 Ellerslie Ave., | and astonish the rest. -- Mark Twain
Willowdale, Ontario | //www.oreilly.com/catalog/samba/author.html
Work: (905) 415-2849 Home: (416) 223-8968 Email: davecb at canada.sun.com
David Collier-Brown, | Cherish your enemies. They're harder to
Performance & Eng. | come by than friends and more motivated.
Sun Canada (ACE) | davecb at canada.sun.com
(905) 415-2849 | http://elsbeth.canada.sun.com/~davecb