Fwd: file operation is interrupted when using ctdb+nfs
martin at meltin.net
Fri Jan 5 02:13:46 UTC 2018
On Fri, 5 Jan 2018 08:28:52 +1000, ronnie sahlberg
<ronniesahlberg at gmail.com> wrote:
> On Fri, Jan 5, 2018 at 8:00 AM, Martin Schwenke via samba-technical
> <samba-technical at lists.samba.org> wrote:
> > On Thu, 4 Jan 2018 18:32:26 +0800 (CST), <zhu.shangzhong at zte.com.cn>
> > wrote:
> >> There are 3 CTDB nodes and 3 nfs-ganesha servers.
> >> Their IP addresses are: 192.168.1.10, 192.168.1.11, 192.168.1.12.
> >> The CTDB public IP address is: 192.168.1.30, 192.168.1.31, 192.168.1.32.
> >> The client IP is 192.168.1.20. The NFS export directory is mounted
> >> to the client with public IP 192.168.1.30.
> >> I checked the CTDB logs: the public IP 192.168.1.30 was moved to
> >> another node (IP: 192.168.1.32)
> >> when the nfs-server (IP: 192.168.1.10) process was killed.
> > OK, that seems good. :-)
> > * When do you see the "stale file handle" message? Immediately when
> > the NFS Ganesha server is killed or after the failover?
> > If it happens immediately when the server is killed then CTDB is not
> > involved and you need to understand what is happening at the NFS
> > level.
> > * Are you able to repeat the test against a single NFS Ganesha server
> > on a single node?
> > This would involve killing the server, seeing what happens to the cp
> > command on the client, checking if the file still exists in the
> > server filesystem, and then restarting the server.
> > If killing the NFS Ganesha server causes the incomplete copy of the
> > file to be deleted without communicating a failure to the client
> > then this could explain the "stale file handle" message.
> > If this can't be made to work then it probably also isn't possible
> > by adding more complexity with CTDB.
> > By the way, if you are able to reply inline instead of "top-posting"
> > then it is easier to respond to each part of your reply. :-)
> Hitless NFS failover requires that the NFS filehandles remain
> invariant across the nodes in the cluster.
> I.e. regardless of which node you point to, the same file will always map
> to the exact same filehandle.
> (Stale filehandle just means : "I don't know which file this refers
> to" and it would either be caused by the NFS server (Ganesha) losing
> the inode<->filehandle mapping state when Ganesha is restarted
> or it could mean that the underlying filesystem does not have the
> capability to make this possible from the server.)
> GPFS/SpectrumScale does guarantee this for knfs.ko (and Ganesha) as
> long as you are careful and ensure that the fsid for the backend
> filesystem is the same across all the nodes.
> You would have to check if this is even possible to do with cephfs
> since in order to get this guarantee you will need support from the
> backing filesystem.
> There is likely not anything that CTDB can do here since it is an
> interaction between Ganesha and cephfs.
> One way to test for this would be to just do an NFSv3 LOOKUP of the
> same file from several Ganesha nodes in the cluster and verify with
> wireshark that the filehandles are identical regardless of which node
> you use to access the file.
> With a little bit of effort, you can even automate this fully if you
> want to add this as a check for automatic testing.
> The way to do this would be to use libnfs, since it can expose the
> underlying nfs filehandle.
> You could write a small test program using libnfs that would connect
> to multiple different ip's/nodes in the cluster, then
> use nfs_open() to fetch a filehandle for the same file on different
> nodes and then just compare the underlying filehandle in the
> libnfs filehandle.
> I don't remember if dereferencing this structure is part of the public
> API or not, and I am too lazy to check right now, so you might
> need to include libnfs-private.h if not.
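Short of writing the libnfs program, the comparison step itself can be
sketched in Python. This is only a sketch under assumptions: the
filehandles are copied by hand as hex strings (for example from the
LOOKUP replies shown in wireshark), and the function names and sample
values are made up for illustration. Nothing here is part of libnfs or
NFS Ganesha.

```python
def normalize_fh(text):
    """Convert a hex dump such as '01:00:07' or '0x010007' into bytes."""
    cleaned = text.strip().lower()
    if cleaned.startswith("0x"):
        cleaned = cleaned[2:]
    # Drop the separators that packet dissectors commonly print.
    cleaned = "".join(ch for ch in cleaned if ch not in ": \t\n")
    return bytes.fromhex(cleaned)


def group_by_filehandle(handles_by_node):
    """Map each distinct filehandle to the list of nodes that returned it."""
    groups = {}
    for node, fh_text in handles_by_node.items():
        groups.setdefault(normalize_fh(fh_text), []).append(node)
    return groups


def filehandles_consistent(handles_by_node):
    """True only if at least one node was given and all handles match."""
    return len(group_by_filehandle(handles_by_node)) == 1


if __name__ == "__main__":
    # Hypothetical handles for the same file, one per Ganesha node.
    sample = {
        "192.168.1.10": "01:00:07:00:aa:bb",
        "192.168.1.11": "010007 00aabb",
        "192.168.1.12": "01:00:07:00:aa:cc",
    }
    for fh, nodes in group_by_filehandle(sample).items():
        print(fh.hex(), sorted(nodes))
    print("consistent:", filehandles_consistent(sample))
```

If more than one group is printed, the nodes disagree about the
filehandle and a client that fails over between them can expect
"stale file handle" errors.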
Nice summary. Thanks, Ronnie!
... and you can check device#/inode# consistency in the cluster
filesystem like this:
# onnode all stat -c '%d:%i' /clusterfs/data/foo
>> NODE: 10.0.0.31 <<
>> NODE: 10.0.0.32 <<
>> NODE: 10.0.0.33 <<
While Samba provides a way of dealing with inconsistent device#s,
I'm not sure if NFS Ganesha also has something like that.
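If you want to automate the device#/inode# check, something like the
following Python sketch could compare the per-node output. It assumes
the output of the onnode command above has been saved as text, with
each ">> NODE: ... <<" header followed by that node's stat output. The
function names and the device:inode values in the example are invented
for illustration.

```python
import re

# Header line that onnode prints before each node's output.
NODE_HEADER = re.compile(r"^>> NODE: (\S+) <<$")


def parse_onnode_output(text):
    """Return {node: [output lines]} from saved 'onnode all ...' output."""
    results = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = NODE_HEADER.match(line)
        if match:
            current = match.group(1)
            results[current] = []
        elif current is not None:
            results[current].append(line)
    return results


def nodes_agree(text):
    """True if every node printed exactly the same output lines."""
    results = parse_onnode_output(text)
    distinct = {tuple(lines) for lines in results.values()}
    return len(results) > 0 and len(distinct) == 1


if __name__ == "__main__":
    # Hypothetical output; real device:inode values will differ.
    sample = """
>> NODE: 10.0.0.31 <<
38:12345
>> NODE: 10.0.0.32 <<
38:12345
>> NODE: 10.0.0.33 <<
39:12345
"""
    print(parse_onnode_output(sample))
    print("consistent:", nodes_agree(sample))
```

A node with a different device# or inode# makes nodes_agree() return
False, and so does a node that printed nothing at all.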
peace & happiness,