[Samba] suggestions for a "fast" fileserver - 1G / 10G

Emmanuel Florac eflorac at intellique.com
Tue Mar 25 02:56:00 MDT 2014


On Mon, 24 Mar 2014 21:55:57 -0700, you wrote:


> -----
> Note, I found your info excellent.  But had some Q's
> ----
> Well a partial quote from one of the fs experts on the xfs
> list:
> 
>  From the above, I don't see how RAID6 could be faster than RAID0
> unless you are exceeding the card capacity (3.0 Gb/s, 6.0 Gb/s or
> 12 Gb/s depending on SAS generation).

I suppose you mean "faster than RAID-10" here; RAID-6 is of course
never faster than RAID-0, at best just as fast (usually slightly
slower). RAID-10 write throughput is half the total write throughput
of your disks, i.e. if you have 12 drives capable of 100 MB/s each,
your top sequential write speed is 600 MB/s. The same goes for top
sequential read, though theoretically it should be possible to read
from alternate disks of each mirror and reach full speed (1.2 GB/s);
however, that doesn't seem to be the case in most RAID-10
implementations I've looked at.

On the other hand, a RAID-6 built from the exact same 12 drives can do
1 GB/s writes (10 x 100 MB/s) and 1.2 GB/s reads, provided the
controller can compute parity fast enough.
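
Back-of-the-envelope, in Python, if you want to play with the numbers
(these are only the theoretical figures discussed above -- 12 drives at
100 MB/s each, ideal striping, controller never the bottleneck):

# Theoretical best-case sequential throughput for the example above.
def sequential_throughput(n_drives, mb_per_s, level):
    """Best-case sequential (write, read) MB/s for a few RAID levels."""
    if level == "raid0":
        return n_drives * mb_per_s, n_drives * mb_per_s
    if level == "raid10":
        # Every write hits both halves of a mirror; most implementations
        # also read from only one side, hence the same figure for reads
        # (1.2 GB/s is possible in theory, as noted above).
        return n_drives // 2 * mb_per_s, n_drives // 2 * mb_per_s
    if level == "raid6":
        # Two drives' worth of each stripe is parity, so writes see n-2
        # spindles; reads stream from all of them (the 1.2 GB/s above).
        return (n_drives - 2) * mb_per_s, n_drives * mb_per_s
    raise ValueError(level)

for level in ("raid0", "raid10", "raid6"):
    write, read = sequential_throughput(12, 100, level)
    print(f"{level:6s}  write {write:5d} MB/s   read {read:5d} MB/s")
# raid0   write  1200 MB/s   read  1200 MB/s
# raid10  write   600 MB/s   read   600 MB/s
# raid6   write  1000 MB/s   read  1200 MB/s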

In the case you're mentioning, it seems that the problem is with
rewrites. As I wrote in my previous message, rewriting a single block
(not a full stripe) in RAID-6 actually requires 3 single-disk reads and
3 writes (the old data block and both parity blocks are read, then the
new data block and both recomputed parity blocks are written back),
severely impacting performance.
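
To make the 3 + 3 figure concrete, a rough sketch of the per-request
I/O count (this assumes the classic read-modify-write path; a
reconstruct-write or a controller with a large write-back cache changes
the picture):

# Rough I/O count for RAID-6 updates, assuming read-modify-write
# for anything smaller than a full stripe.
def raid6_small_update_ios():
    reads  = 3          # old data chunk, old P parity, old Q parity
    writes = 3          # new data chunk, new P parity, new Q parity
    return reads, writes

def raid6_full_stripe_write_ios(n_drives):
    # A full-stripe write needs no reads: parity is computed entirely
    # from the data being written, then the whole stripe goes out.
    return 0, n_drives  # n-2 data chunks + P + Q

print(raid6_small_update_ios())          # (3, 3)
print(raid6_full_stripe_write_ios(12))   # (0, 12)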

> 
> Theoretically, RAID1 can be faster than RAID0 if the controller keeps
> track of rotational position; i.e. one copy may be closer to read than
> the other.  Cheap cards won't, but some of the LSI cards might given
> their rigorous drive requirements (I got an order of Hitachi Deskstars
> by mistake once instead of Ultrastars -- 75% wouldn't pass
> initialization; basically their rotational rates were up to 20% off
> from the rated 7200 RPM)

Yes, generally RAID controllers actually use both members of a mirror
for random reads and multiple concurrent reads; however, those that
I've tested (I haven't tested the latest LSI yet) don't use them to
accelerate sequential reads.

> When you say "8 drives" do you mean 8 data drives, or 8 total drives?
> Cuz if you mean 8 total drives, then I would easily agree, but if you
> are looking at 8-data drives in a raid6 v. raid10,  i can't see how
> a raid6 would beat a raid10 -- at best it would tie, no...?

Sorry for not being clear, I meant 8 total drives. Naturally an
8-data-drive RAID-10 (16 drives total) should always be faster than an
8-data-drive RAID-6 (10 drives total).

> 
> If you only serve file.. but with samba, I have profiles and all
> data / content on the server where it gets modified regularly.  Only
> thing I keep on windows are the programs.  Things that get modified
> regularly get put on the server -- so by definition, it gets a lot of
> writes.

Usually on a file server, "modifying a file" actually rewrites the
whole file (I mean application files such as Word documents and the
like), and files are big enough nowadays to span a whole stripe or
more, so the impact of read-modify-write is actually minimal.
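
As a rough illustration, with made-up but typical numbers (64 KiB
chunks across the 10 data drives of a 12-drive RAID-6, a 2 MiB
document, contiguous and stripe-aligned allocation):

# How much of a typical file lands on full stripes? Illustrative
# numbers only, assuming contiguous, stripe-aligned allocation.
data_drives = 10
chunk_kib   = 64
stripe_kib  = data_drives * chunk_kib   # 640 KiB per full stripe
file_kib    = 2 * 1024                  # a 2 MiB document

full_stripes = file_kib // stripe_kib   # 3 full stripes: no reads needed
tail_kib     = file_kib % stripe_kib    # 128 KiB tail: the only part that
                                        # can trigger read-modify-write
print(full_stripes, tail_kib)           # 3 128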

Some of my customers serve several hundred home directories (a whole
university department) from a single 16- or 24-drive RAID-6 array
without any problem.
 
> I've used XFS for about 14-15 years on my linux boxen and have rarely
> had major problems -- but I also have backups.  Additionally, I have
> my boxes on UPS's, so uptime for them when I'm not keeping up
> w/current kernel has been as long as 90-120 days...

I don't pretend XFS is the one and only, best for all usages. However,
so far it has been the best bet for file-serving patterns, in my
opinion.
Of course I'm a bit biased towards XFS out of familiarity; I've been
using it for 20 years, on IRIX and then Linux :) We could start the XFS
old farts club :)

> Someone mentioned Redhat is going with XFS for their servers -- so is
> Novell/Suse.  If you need more reliability on XFS, you can get it by
> reducing the caching and writeback delays -- until you have it down to
> the performance of ext3/4.  But if you have a reliable system and
> power (UPS), XFS is well worth the trade off.

Yes, that's exactly right. XFS isn't meant for cheap hardware and never
was. I include low-end Dell and HP servers (and particularly the
horrible, data-eating HP Smart Array cards) in "cheap hardware".
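
For reference, the "caching and writeback delays" mentioned above are
mostly governed by the generic Linux VM writeback sysctls. A small
Python snippet to inspect the current values (the sysctl names are
standard; the one-line descriptions are mine, and what to set them to
is a trade-off between throughput and the size of the data-loss
window):

# Print the current Linux VM writeback settings from /proc/sys.
from pathlib import Path

knobs = {
    "vm/dirty_expire_centisecs":    "age at which dirty data must be flushed",
    "vm/dirty_writeback_centisecs": "how often the flusher threads wake up",
    "vm/dirty_background_ratio":    "dirty % of memory that starts background writeback",
}
for name, meaning in knobs.items():
    value = Path("/proc/sys", name).read_text().strip()
    print(f"{name:30s} = {value:6s}  ({meaning})")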

>  But if you have heavy
> I/O inflight all the time as with a heavily used MTA, it might be
> better to go solid-state anyway, not to mention MTA's aren't XFS's
> forte -- it was built to support heavy I/O of uncompressed video and
> sound recording and production -- it was built for speed for large
> files (large back then, when it was  initially designed, in the early
> 1990's, was in the MB-GB range)... mail messages.. are generally
> smaller and wouldn't benefit nearly as much as other loads...

I set up large storage servers for customers with large-file needs, so
yes, video, audio, or scientific data most of the time. Actually, for
small files XFS has gone from abysmal to decent in the last few years.
I even tested a directory with 1 billion files and it worked like a
charm (though you don't want to try "ls" in it :)

Regarding MTAs, I've had interesting conversations with the free.fr MTA
administrator (19 million accounts). Entirely different world: RAID-10
all the way :)

> 
> As you can see from my forwarded message, I have been toying w/getting
> a RAID1+0 setup... the expense is a bit icky though... disks haven't
> dropped according to historical trends over the past 3-5 years... or
> rather, they probably have -- but the dollar dropped a lot due to heavy
> devaluation in order to pay for the bailout back then ;-(.

Maybe an SSD caching layer (bcache) could help you there?
 
> Will definitely keep your tuning notes and play w/them.  Thanks much
> for sharing your experience.  It is appreciated...
> 

You're welcome.

-- 
------------------------------------------------------------------------
Emmanuel Florac     |   Direction technique
                    |   Intellique
                    |	<eflorac at intellique.com>
                    |   +33 1 78 94 84 02
------------------------------------------------------------------------

