Where's Jeremy ?

Jeremy Allison jra at samba.org
Tue Oct 16 02:04:41 MDT 2012


On Mon, Oct 15, 2012 at 09:21:47PM +0200, Volker Lendecke wrote:
> On Mon, Oct 15, 2012 at 01:54:11PM -0500, Christopher R. Hertel wrote:
> > Jeremy,
> > 
> > Have a great conference.
> > 
> > ...and if you get a moment, can you:
> > 
> > a) Give me a brief overview of your opinions concerning Linux AIO?
> > b) Point me at any docs, write-ups, blog-posts, or e'mail messages
> >    that will allow me to catch up on the subject a bit?
> > 
> > I know that you've been dealing with these issues.  The Gluster team
> > is also looking at Linux AIO and I'd like to know where the
> > dragons--if any--be.
> 
> Quick summary: Doesn't work.
> 
> More explanation: At SDC I talked to Christoph Hellwig, who
> explained what goes on: For pwrite, AIO is pointless. The
> buffer cache makes it async anyway. If the buffer cache is
> full, the kernel is very smart about feeding as much data as
> possible to disk, blocking the pwrite calls. For pread,
> Linux AIO only works for O_DIRECT, and probably always will.
> O_DIRECT brings alignment restrictions. If you use the
> libaio API without O_DIRECT, it does its pread/pwrite
> business correctly, but it won't go async. The reason why
> pread can't be made async with the buffer cache and other
> layers involved is that there's a million places where the
> kernel could block on the way from disk to user space.
> Handling them all correctly and keeping that correct over
> releases is too difficult. So the best we can do is do the
> aio_pthread code.
> 
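
For anyone catching up: the thread-offload idea boils down to running an
ordinary blocking pread on a worker thread and collecting the result later.
Below is a minimal sketch of that idea in Python, not the Samba code itself.
The helper names (submit_pread, demo) and the pool size are made up for
illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


# Hypothetical helper name; not taken from Samba.
def submit_pread(pool, fd, length, offset):
    """Queue a blocking os.pread() on a worker thread and return a Future."""
    return pool.submit(os.pread, fd, length, offset)


def demo():
    """Issue two positional reads in parallel and wait for both."""
    with tempfile.TemporaryFile() as f:
        f.write(b"hello, samba-technical")
        f.flush()
        fd = f.fileno()
        with ThreadPoolExecutor(max_workers=4) as pool:
            first = submit_pread(pool, fd, 5, 0)
            second = submit_pread(pool, fd, 5, 7)
            # The caller is free to do other work until it asks for results.
            return first.result(), second.result()
```

Calling demo() returns the two byte strings read at offsets 0 and 7. The
price is one thread hand-off per request, even when the data is already
sitting in the buffer cache.
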
> Christoph has sent me a very simple patch that would make
> pread on a non-blocking fd return EWOULDBLOCK if the data is
> not in the buffer cache. This way we might get the best of
> both worlds: Only pay the context switch price if there is a
> chance that we would block. I still need to test that, but
> other stuff is more pressing these days.
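
The patch itself is not in this thread, so the following is only a sketch of
the "try the cache first, hop to a thread only if we might block" pattern.
It assumes a platform where Python exposes os.preadv and os.RWF_NOWAIT
(a later kernel interface, not the patch Volker mentions) and quietly falls
back to the worker thread everywhere else. The names try_pread_nowait,
pread_cheap and demo are made up for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def try_pread_nowait(fd, length, offset):
    """Return the data if it can be read without blocking, else None."""
    nowait = getattr(os, "RWF_NOWAIT", None)
    if nowait is None or not hasattr(os, "preadv"):
        return None  # platform has no non-blocking positional read
    buf = bytearray(length)
    try:
        got = os.preadv(fd, [buf], offset, nowait)
    except OSError:
        # EAGAIN (data not cached), EOPNOTSUPP (filesystem says no), etc.
        return None
    # Treat a short read like a miss and let the worker thread handle it.
    return bytes(buf) if got == length else None


def pread_cheap(pool, fd, length, offset):
    """Read from the cache if possible; otherwise pay for a thread hop."""
    data = try_pread_nowait(fd, length, offset)
    if data is not None:
        return data
    # Slow path: a plain blocking pread on a worker thread.
    return pool.submit(os.pread, fd, length, offset).result()


def demo():
    with tempfile.TemporaryFile() as f:
        f.write(b"where be the dragons")
        f.flush()
        with ThreadPoolExecutor(max_workers=2) as pool:
            return pread_cheap(pool, f.fileno(), 7, 13)
```

demo() reads 7 bytes at offset 13 and gets the same answer on either path.
A real server would hand the Future to its event loop rather than call
.result() inline; that is left out to keep the sketch short.
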

TL;DR... What Volker said :-).

But please note that Volker here is talking about the Linux
kernel AIO, not the AIO implemented using pthreads inside
glibc (which has a completely different set of problems,
but also doesn't work :-).

Jeremy.

