[Samba] vfs_shadow_copy2: unmount snapshot while user is restoring from it

Alex Lyakas alex at zadarastorage.com
Wed Feb 10 10:56:31 UTC 2016


Hi Jeremy,

The use case is automatic snapshot rotation. For example, we keep 24
hourly snapshots: every hour we create a new snapshot and unmount the
oldest one. But customers may be restoring from the oldest snapshot at
that very moment, so we cannot predict when the unmount will succeed.
At that point we therefore want all active restore operations to abort
with an error, and then we unmount. This problem should be common to
all storage vendors using the shadow copy module with mounted snapshots.
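
To make the policy concrete, here is a rough sketch in C of the rotation
bookkeeping. All names in it (snapshot_ring, ring_add, ring_drop_oldest,
snapshot_check) are hypothetical and are not part of Samba or of
vfs_shadow_copy2. The check function only stands in for a per-chunk call
such as the fstat seen during Restore, failing with -1 and errno=ENOENT
once the snapshot is marked for retirement.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define KEEP_SNAPSHOTS 24   /* hourly snapshots to retain */

/* Hypothetical bookkeeping for one mounted snapshot (not a Samba type). */
struct snapshot {
    char label[32];         /* e.g. "@GMT-2016.02.08-11.48.00" */
    bool retiring;          /* true once we intend to unmount it */
};

/* Mounted snapshots, oldest first; one spare slot for the new arrival. */
struct snapshot_ring {
    struct snapshot snaps[KEEP_SNAPSHOTS + 1];
    size_t count;
};

/*
 * Register a newly created snapshot.
 * Returns 0 if nothing needs to be unmounted, 1 if the oldest snapshot
 * (snaps[0]) was just marked as retiring, or -1 if the previous oldest
 * snapshot has not been dropped yet and there is no free slot.
 */
int ring_add(struct snapshot_ring *r, const char *label)
{
    if (r->count > KEEP_SNAPSHOTS)
        return -1;

    struct snapshot *s = &r->snaps[r->count++];
    strncpy(s->label, label, sizeof(s->label) - 1);
    s->label[sizeof(s->label) - 1] = '\0';
    s->retiring = false;

    if (r->count > KEEP_SNAPSHOTS) {
        r->snaps[0].retiring = true;
        return 1;
    }
    return 0;
}

/* Forget the oldest snapshot; call this after its unmount succeeded. */
void ring_drop_oldest(struct snapshot_ring *r)
{
    assert(r->count > 0);
    memmove(&r->snaps[0], &r->snaps[1],
            (r->count - 1) * sizeof(r->snaps[0]));
    r->count--;
}

/*
 * Stand-in for a per-chunk check made while a restore is running.
 * Returns 0 normally, or -1 with errno = ENOENT once the snapshot is
 * retiring, so that an in-flight restore aborts with an error.
 */
int snapshot_check(const struct snapshot *s)
{
    if (s->retiring) {
        errno = ENOENT;
        return -1;
    }
    return 0;
}
```

The intended order would be: ring_add() returns 1, restores against
snaps[0] start failing in snapshot_check(), the unmount is retried until
it succeeds, and only then ring_drop_oldest() is called. How quickly the
open file descriptors actually get closed is exactly the open question.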

Thanks,
Alex.


On Wed, Feb 10, 2016 at 2:46 AM, Jeremy Allison <jra at samba.org> wrote:
> On Mon, Feb 08, 2016 at 06:33:25PM +0200, Alex Lyakas wrote:
>> Greetings,
>>
>> I am trying to use vfs_shadow_copy2 with samba-4.2.7.
>>
>> I have a share exported at /export/smb400/. A snapshot of the share is on a
>> separate block device, which is mounted at
>> /export_shadows/volume-00000001/@GMT-2016.02.08-11.48.00/. Samba
>> configuration for the share is:
>>
>> vfs objects = shadow_copy2
>> shadow:snapdir = /export_shadows/volume-00000001
>> shadow:fixinodes = yes
>>
>> (Note: I had to pull two patches by Uri Simchoni, [1] and [2], to make
>> snapshots mounted outside of the share work.)
>>
>> Everything is working fine, and on Windows the "Previous version" tab allows
>> restoring files from the snapshot.
>>
>> However, when a large file (say 10GB) is being restored, during that time it
>> is not possible to unmount the snapshot block device, because Samba is
>> holding open file descriptors on its mount point. Question: is there a way
>> to forcefully unmount the snapshot block device, such that all ongoing
>> Restore operations will fail?
>>
>> I did some debugging and saw that during the Restore process
>> shadow_copy2_fstat() is being called a lot (by
>> smb_vfs_call_copy_chunk_send). So I tried to return a failure in this
>> function (-1 with errno=ENOENT). I saw that it indeed helps: the Restore
>> operation hits an error. But it still takes about 10 seconds until the
>> unmount of the snapshot finally succeeds.
>>
>> My question is: is there any other operation in “vfs_fn_pointers” that I can
>> implement or fail to make vfs_shadow_copy2 quickly close all open file
>> descriptors, such that unmount of the snapshot succeeds?
>>
>> Thanks,
>> Alex.
>
> Hmmm. Isn't this the equivalent of cancelling an async
> op in the middle of the operation ? That's notoriously
> hard to do...
>
> Wouldn't this also cause the user to lose data ?
>
> What is the exact use case for this ?


