Locking, notify collisions using CTDB on non-clustered share?

Christopher R. Hertel crh at samba.org
Fri Apr 13 16:48:28 UTC 2018

On 04/13/2018 05:02 AM, Volker Lendecke wrote:
> Hi, Chris!

Hi, Volker!

Answers to your questions, and more details, below.  Thanks!

> On Thu, Apr 12, 2018 at 11:53:30PM -0500, Christopher R. Hertel via samba-technical wrote:
>> I did some digging in Bugzilla but came up empty.  I'm wondering if this is
>> a known issue:
>> - I have a clustered file system.
>> - I have Samba running on <n> nodes of the cluster.
>> - I have one share on each of the <n> nodes that points to an EXT4
>>   filesystem.
>> I'm running the same smbtorture tests against the various EXT4 shares, and
>> getting errors like:
>> - Others that represent different forms of access collisions.
>> CTDB is handling locking and other aspects of shared access, so the collisions
>> are probably occurring at the database level rather than the filesystem
>> level.  Further precautions taken:
>> - The share names differ from node to node; no duplicates.
>> - The underlying pathnames also differ.  They're in the form:
>>   /path/to/nodename/share
>> Of course, when clustering is disabled (clustering=no) the errors are no
>> longer produced.  Again, an indication that the collisions are occurring
>> at the DB layer.
> Hmm. I think your setup needs some explanation. Why do you have ctdb
> on top of separate ext4's? ctdb was initially designed to take care of
> smb level locking on top of a cluster file system.

In this scenario, we have shares mapped to the clustered file system, but we
also have shares mapped to local file systems.

The 'clustering = ' parameter is a global.  I don't know of a way to
indicate to Samba that some shares should be clustered while other shares
are independent.

I see this as a real-world scenario; some shares are high availability, but
others are not, and the non-HA shares are scattered across the multiple
Samba nodes.  DFS may even be involved just to make it fun.

In my particular case, I am using this configuration to test the underlying
cluster file system.  I run the same smbtorture sub-test against both an
EXT4 share and against a clustered FS share.  Then I compare the output
looking for meaningful differences.
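
A rough sketch of that kind of comparison, assuming the only expected
difference between the two runs is the share name itself.  The function
names, the share name "CLUSTERFS", and the sample output lines in the usage
note are hypothetical; they are not real smbtorture output.

```python
import difflib


def normalize(lines, share_name):
    """Replace the share name with a placeholder so it is not reported as a difference."""
    return [line.replace(share_name, "<SHARE>") for line in lines]


def meaningful_diff(out_a, share_a, out_b, share_b):
    """Return only the lines that differ between two test outputs.

    Lines prefixed "- " appear only in the first output and lines
    prefixed "+ " appear only in the second.
    """
    a = normalize(out_a.splitlines(), share_a)
    b = normalize(out_b.splitlines(), share_b)
    # ndiff also emits "  " (common) and "? " (hint) lines; drop those.
    return [l for l in difflib.ndiff(a, b) if l.startswith(("- ", "+ "))]
```

For example, meaningful_diff("test on EXT4-1: ok", "EXT4-1",
"test on CLUSTERFS: ok", "CLUSTERFS") returns an empty list, because the two
outputs match once the share names are masked.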

This testing all worked fine when I was targeting a single node in the
cluster.  To speed things up, each node in the cluster exposed an EXT4 share
and the tests were run in parallel across the cluster.  That's when we
started getting various random errors that all indicated collisions of some
sort, but these errors only occurred over the EXT4 shares.

We were careful to give the shares different names:
| Node  | Sharename |
| Foo-1 | [EXT4-1]  |
| Foo-2 | [EXT4-2]  |
| Foo-3 | [EXT4-3]  |

...but the errors occurred anyway.

> What I could imagine is that you have collisions in the inode space on
> the different "nodes", and this might lead to problems.

Yes, I am assuming so.

We also tried changing the path mapped to each share, like so:
| Node  | Sharename | Local Path       |
| Foo-1 | [EXT4-1]  | /mnt/ext4/ext4-1 |
| Foo-2 | [EXT4-2]  | /mnt/ext4/ext4-2 |
| Foo-3 | [EXT4-3]  | /mnt/ext4/ext4-3 |

My thought was that, perhaps, the full pathname was being used as a key
somewhere.  In fact, yes.  When we made this change one set of errors went
away completely.  The smb2.notify.dir tests now pass without generating
collision errors (which they had done previously).  So that tells me a lot.

Unfortunately, other tests still fail over EXT4.  Collisions still show up.

You are probably exactly right about the inode issue.  I know that, in our
test rig, the (virtual) drives with EXT4 all have the same device number.

I am planning on adding more drives so that the shares that I expose can
have different device numbers.  Another test would be to put all of the EXT4
shares onto just one machine, under different subdirectories, so that the
inode numbers wouldn't have a chance to collide.
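
To make the device/inode concern concrete, here is a minimal sketch.  It
assumes that a key somewhere is derived from the (device, inode) pair that
stat() reports; that is an assumption for illustration, not something this
thread has confirmed.  The helper names file_key and find_collisions are
hypothetical.

```python
import os
import tempfile


def file_key(path):
    """Return the (device, inode) pair that stat() reports for a path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def find_collisions(keys_by_node):
    """Given {node: {path: (dev, ino)}}, return the keys seen on more than one node."""
    seen = {}
    for node, keys in keys_by_node.items():
        for path, key in keys.items():
            seen.setdefault(key, []).append((node, path))
    # A key only counts as a collision if at least two different nodes report it.
    return {k: v for k, v in seen.items() if len({n for n, _ in v}) > 1}


if __name__ == "__main__":
    # Two files on the same filesystem always get distinct (dev, ino) pairs.
    with tempfile.TemporaryDirectory() as d:
        paths = [os.path.join(d, name) for name in ("a", "b")]
        for p in paths:
            open(p, "w").close()
        print([file_key(p) for p in paths])
```

Collecting file_key() for the test files on each node and passing the
results to find_collisions() would show whether files on different nodes
report the same pair.  Within a single filesystem the pairs are unique, which
is why the one-machine test should rule out this kind of collision.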

> Do you come to SambaXP? There we could have a quick chat about that :-)

I plan on being there.

My thought on this problem is that we should be able to support this kind of
mixed-mode cluster without generating collisions for non-clustered shares.
I imagine, however, that adding parameters or changing the keys to identify
non-clustered shares could be a big lift.

Chris -)-----
