UFS Directory speed (was: Samba Speed Question)

David Collier-Brown David.Collier-Brown at canada.sun.com
Tue Oct 10 12:34:59 GMT 2000

Dave Dezinski wrote:
> What still doesn't make sense to me is that I saw no noticeable
> improvement in speed using the ReiserFS with Samba vs ext2,
> but did notice a difference in speed when cp'ing the files in the
> ReiserFS vs ext2 filesystem.

	Hmmn: let's expand on that.
	I'd expect speed improvements when copying and creating
	files, and I'd expect them to show up when the client is
	copying a large number of (small) files.

	I wouldn't expect to see much speed difference in
	everyday use **unless** the load on the disk is high,
	at which time the metadata-writing cost of ufs starts
	eating the available bandwidth and dragging the system
	towards a halt (;-))

> I guess without having a filesystem that does some sort of directory
> entry caching, there is no other way around this. 

	I admit to being puzzled by the slowness of this: all
	the Unixes cache directory blocks and inodes for the
	directories they read, so a re-read should actually be
	pretty fast! Yet it's unimpressive.

>					 I really wish there
> was a better way to handle this, I understand that having 10000 or
> more files in a directory is not that efficient, but since it wasn't a
> problem in NT or in the previous Netware servers we had I couldn't
> see why it would be a problem here.

	The common case with NT and Netware is that to
	search a directory takes only one to two reads
	of the b-tree indices, and one read of the final
	location to see if the file's there.  This means
	that creation doesn't require a linear search of
	the whole directory, just to ensure the file
	isn't already there.  This is such a common
	operation that those filesystems gain speed
	**disproportionately** from it.
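
	To make the create-time cost concrete, here is a rough
	sketch that counts name comparisons for one file creation
	in a 10000-entry directory. The sorted list is only a
	stand-in for a b-tree index, the function names are made
	up for illustration, and comparisons are a loose proxy for
	the block reads a real filesystem would do:

```python
def create_linear(entries, name):
    """Unsorted directory: scan every entry to prove the name is absent."""
    probes = 0
    for existing in entries:
        probes += 1
        if existing == name:
            return False, probes      # already exists, creation refused
    entries.append(name)
    return True, probes


def create_indexed(index, name):
    """Sorted index (stand-in for a b-tree): binary search, then insert."""
    probes = 0
    lo, hi = 0, len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if index[mid] < name:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(index) and index[lo] == name:
        return False, probes          # already exists, creation refused
    index.insert(lo, name)
    return True, probes


# Hypothetical directory contents: 10000 generated filenames.
names = [f"file{i:05d}.dat" for i in range(10000)]
flat = list(names)       # unindexed directory
tree = sorted(names)     # indexed directory

_, linear_probes = create_linear(flat, "newfile.dat")
_, indexed_probes = create_indexed(tree, "newfile.dat")
print("linear scan comparisons:", linear_probes)
print("indexed comparisons:    ", indexed_probes)
```

	The linear case has to touch every existing entry before
	it can create the file; the indexed case needs on the
	order of log2(n) comparisons.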

	It isn't perfect, as I said:
> > NT will run out of speed on very deep b-tree
> > structures, which fortunately are rare, although
> > programs generating filenames sometimes stumble into
> > the "bad" part of the namespace. So would hashing
> > filesystems, if anyone built them.

	Alas, I'm not a good enough hacker to build a 
	Solaris kernel with hashed directories, so
	I can't give anything but guesses about how
	much improvement we would actually see. I'm
	a performance guy, as it happens, and I lack
	the data to estimate credibly.  In-credibly I'd
	estimate another 30% or so (;-))

> Thanks for the reply, looks like I'll have to look into changing some
> of our applications to handle the splitting up of these files into multiple
> directories if Samba is the way I'm going. :-)

	Ok, send me a description of the naming system
	in email and I'll see if I can suggest some rules
	for the list(s) and some code templates to make
	it reasonably easy...
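
	As a starting point only, and without knowing the actual
	naming system, one generic way to split a large directory
	is to hash each filename into a fixed number of
	subdirectories. The function names, the bucket count of
	256 and the example root path below are all assumptions
	for illustration, not rules derived from the real
	application:

```python
import hashlib
import os


def bucketed_path(root, filename, buckets=256):
    """Map a filename to root/<bucket>/<filename> using a stable hash."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % buckets
    return os.path.join(root, f"{bucket:03d}", filename)


def store(root, filename, data):
    """Write data under its bucket directory, creating it if needed."""
    path = bucketed_path(root, filename)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(data)
    return path


# Hypothetical example: where would this file live under /srv/share?
print(bucketed_path("/srv/share", "invoice-000123.txt"))
```

	With 10000 files and 256 buckets each directory holds
	roughly 40 entries, so a linear directory search stays
	short. The cost is that applications must compute the
	same hash to find a file again, so a rule based on the
	existing names (a date or a numeric prefix, say) may be
	easier for people to browse.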

--dave (the latter task is part of my real job) c-b
David Collier-Brown,  | Always do right. This will gratify some people
185 Ellerslie Ave.,   | and astonish the rest.        -- Mark Twain
Willowdale, Ontario   | //www.oreilly.com/catalog/samba/author.html
Work: (905) 415-2849 Home: (416) 223-8968 Email: davecb at canada.sun.com
David Collier-Brown,  | Cherish your enemies.  They're harder to
Performance & Eng.    | come by than friends and more motivated.
Sun Canada (ACE)      | davecb at canada.sun.com
(905) 415-2849        | http://elsbeth.canada.sun.com/~davecb
