posix_fallocate/fallocate (commit 716ea734e4cd83a2030ca2cac10056bdaab1a021)

Jeremy Allison jra at samba.org
Sat Dec 18 11:01:59 MST 2010


On Sat, Dec 18, 2010 at 01:32:21PM +0100, Björn JACKE wrote:
> Hi Jeremy,
> 
> I wrote up this comment for posix_fallocate. For fallocate this is not correct
> as it is a Linux-only call and returns just 0 or -1. So this change is a bit
> misleading now...:

No, you misunderstand the need for the change.

I'm not calling the Linux-only call (yet). The Samba VFS fallocate()
call isn't the same as the Linux fallocate call. It just has a similar
API.

> --- a/source3/modules/vfs_default.c
> +++ b/source3/modules/vfs_default.c
> @@ -848,12 +848,13 @@ static int strict_allocate_ftruncate(vfs_handle_struct *handle, files_struct *fs
>  
>         space_to_write = len - pst->st_ex_size;
>  
> -       /* for allocation try posix_fallocate first. This can fail on some
> +       /* for allocation try fallocate first. This can fail on some
>            platforms e.g. when the filesystem doesn't support it and no
>            emulation is being done by the libc (like on AIX with JFS1). In that
> -          case we do our own emulation. posix_fallocate implementations can
> +          case we do our own emulation. fallocate implementations can
>            return ENOTSUP or EINVAL in cases like that. */
> -       ret = SMB_VFS_POSIX_FALLOCATE(fsp, pst->st_ex_size, space_to_write);
> +       ret = SMB_VFS_FALLOCATE(fsp, VFS_FALLOCATE_EXTEND_SIZE,
> +                               pst->st_ex_size, space_to_write);
>         if (ret == ENOSPC) {
>                 errno = ENOSPC;
>                 return -1;
> 
> Maybe it would be better to leave posix_fallocate and fallocate in separate
> VFS calls. This would also make each of them easier to replace in VFS
> modules. I would actually prefer that very much, what do you think?

No, that would be the wrong VFS API. I thought a great deal about
this, and it is why I really need this change to get into 3.6.0:
we don't want to change the VFS API within a 3.x.x series.

fallocate is a *superset* of posix_fallocate. You can get
the posix_fallocate behavior by setting the VFS_FALLOCATE_EXTEND_SIZE
mode. You can't get the VFS_FALLOCATE_KEEP_SIZE behavior by
using posix_fallocate.

So we only want one fallocate VFS operation, and it should be
the one that is a superset, not a subset. That way we can get
both kinds of operation with only one additional VFS call.
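
A minimal sketch of the superset idea, not the Samba code. The names
sketch_fallocate, SKETCH_FALLOCATE_* and sketch_demo are hypothetical,
and the sketch assumes a Linux/glibc build where fallocate() and
FALLOC_FL_KEEP_SIZE are available. It follows the posix_fallocate
convention of returning 0 or an errno value:

```c
#define _GNU_SOURCE /* for fallocate() and FALLOC_FL_KEEP_SIZE on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical mode names, modelled on the VFS_FALLOCATE_* modes above. */
enum sketch_fallocate_mode {
    SKETCH_FALLOCATE_EXTEND_SIZE = 0, /* posix_fallocate behavior */
    SKETCH_FALLOCATE_KEEP_SIZE = 1    /* reserve space, leave st_size alone */
};

/* One entry point for both behaviors. Returns 0 or an errno value. */
static int sketch_fallocate(int fd, enum sketch_fallocate_mode mode,
                            off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);

    if (mode == SKETCH_FALLOCATE_EXTEND_SIZE) {
        /* posix_fallocate already returns the error number directly. */
        return posix_fallocate(fd, offset, len);
    }
#ifdef FALLOC_FL_KEEP_SIZE
    /* Linux fallocate returns -1 and sets errno on failure. */
    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, offset, len) == 0) {
        return 0;
    }
    return errno;
#else
    /* No native support: the caller has to emulate this mode. */
    return ENOTSUP;
#endif
}

/* Usage example: returns 0 if both modes behaved as described. */
static int sketch_demo(void)
{
    char path[] = "/tmp/sketch_fallocXXXXXX";
    struct stat st;
    int ok = 1;
    int fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }

    /* KEEP_SIZE: blocks may be reserved, but st_size must stay 0.
       A filesystem without support is tolerated here. */
    int ret = sketch_fallocate(fd, SKETCH_FALLOCATE_KEEP_SIZE, 0, 4096);
    if (ret != 0 && ret != ENOTSUP && ret != EOPNOTSUPP) ok = 0;
    if (fstat(fd, &st) != 0 || st.st_size != 0) ok = 0;

    /* EXTEND_SIZE: st_size grows to offset + len. */
    ret = sketch_fallocate(fd, SKETCH_FALLOCATE_EXTEND_SIZE, 0, 4096);
    if (ret != 0) ok = 0;
    if (fstat(fd, &st) != 0 || st.st_size != 4096) ok = 0;

    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

The point of the single entry point is visible in the mode switch: the
posix_fallocate path is one branch of it, while the keep-size path has
no posix_fallocate equivalent.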

For platforms that don't have the second behavior we simply
fake it the same way we do now by checking available file
size.

Indeed, we cope with platforms that have neither behavior 
(and no posix_fallocate call) by falling back inside smbd.

The reason it needs to be the way I've currently coded it
is that I need to make posix opens be sparse by default,
and I need to allow the trans2 ALLOCATE_FILE operation to
be independent of the sparse state of the file. This allows
good emulation of the Linux fallocate behavior from the
CIFSVFS client to both Samba and Windows servers without
having to add another posix extension.

As you see, this is something I've gone through in detail,
so please - no changes without discussing it with Volker
and me first.

Jeremy.

