registry config not being updated on one of my clusters ... how do I troubleshoot this

ronnie sahlberg ronniesahlberg at gmail.com
Wed Dec 10 11:48:53 MST 2014


On Wed, Dec 10, 2014 at 1:45 PM, Richard Sharpe
<realrichardsharpe at gmail.com> wrote:
> On Wed, Dec 10, 2014 at 10:35 AM, Richard Sharpe
> <realrichardsharpe at gmail.com> wrote:
>> On Wed, Dec 10, 2014 at 10:06 AM, ronnie sahlberg
>> <ronniesahlberg at gmail.com> wrote:
>>> On Wed, Dec 10, 2014 at 12:27 PM, Richard Sharpe
>>> <realrichardsharpe at gmail.com> wrote:
>>>> On Tue, Dec 9, 2014 at 10:11 PM, Michael Adam <obnox at samba.org> wrote:
>>>>> On 2014-12-10 at 07:09 +0100, Volker Lendecke wrote:
>>>>>> On Wed, Dec 10, 2014 at 07:01:01AM +0100, Michael Adam wrote:
>>>>>> > On 2014-12-10 at 06:55 +0100, Volker Lendecke wrote:
>>>>>> > > On Tue, Dec 09, 2014 at 12:35:34PM -0800, Richard Sharpe wrote:
>>>>>> > > >
>>>>>> > > > Curious.
>>>>>> > > >
>>>>>> > > > When I do ctdb cattdb registry.tdb on both nodes, there is no
>>>>>> > > > difference, but node 1 shows nothing when I do net conf list.
>>>>>> > >
>>>>>> > > Missing include=registry in smb.conf on one node?
>>>>>> >
>>>>>> > That would not affect "net conf list".
>>>>>> >
>>>>>> > "net conf" is just a special path to part
>>>>>> > of the registry irrespective of whether
>>>>>> > that part is actually used as configuration.
>>>>>> > (Much like cat and other tools would be used
>>>>>> > to operate on a smb.conf.)
>>>>>>
>>>>>> But a missing "clustering=yes" would affect net conf list...
>>>>>
>>>>> It would, because "net conf list" lists the contents of the
>>>>> smbconf key from registry.tdb. And "clustering = yes" changes
>>>>> how registry is accessed - as local tdb file or as clustered
>>>>> database via ctdb.
>>>>
>>>> However, this is the content of the smb.conf on both nodes, and
>>>> certainly on the one where the listing is not happening:
>>>>
>>>> [global]
>>>>         clustering = yes
>>>>         config backend = registry
>>>>
>>>
>>> On the broken node,
>>> check in /proc/<smbd>/fd and /proc/<ctdbd>/fd
>>> that they both point to the same file for the tdb files
>>
>> Hmmm, how do I know which one points to which tdbs?
>
> On the node that I am having problems I see the following:
>
> # ctdb status
> Number of nodes:2
> pnn:0 172.16.170.120   OK
> pnn:1 172.16.170.121   OK (THIS NODE)
> Generation:1188776196
> Size:2
> hash:0 lmaster:0
> hash:1 lmaster:1
> Recovery mode:NORMAL (0)
> Recovery master:0
> root at localhost:~
> # ps ax | grep ctdb
>  6634 ?        SLs    0:00 /usr/sbin/ctdbd
> --reclock=/hf-smb/ctdb/reclock --pidfile=/var/run/ctdb/ctdbd.pid
> --logfile=/var/log/log.ctdb --nlist=/etc/ctdb/nodes
> --public-addresses=/etc/ctdb/public_addresses -d NOTICE
>  6816 ?        S      0:00 /usr/sbin/ctdbd
> --reclock=/hf-smb/ctdb/reclock --pidfile=/var/run/ctdb/ctdbd.pid
> --logfile=/var/log/log.ctdb --nlist=/etc/ctdb/nodes
> --public-addresses=/etc/ctdb/public_addresses -d NOTICE
> 12569 pts/0    S+     0:00 grep ctdb
> root at localhost:~
> # lsof | grep ctdb | grep registry
> ctdbd      6634    root  mem       REG              253,3  1310720
> 2360023 /var/lib/ctdb/persistent/registry.tdb.1
> ctdbd      6634    root   19u      REG              253,3  1310720
> 2360023 /var/lib/ctdb/persistent/registry.tdb.1
> ctdb_reco  6816    root  mem       REG              253,3  1310720
> 2360023 /var/lib/ctdb/persistent/registry.tdb.1
> ctdb_reco  6816    root   19u      REG              253,3  1310720
> 2360023 /var/lib/ctdb/persistent/registry.tdb.1
> root at localhost:~
>
>
> Why is the process ctdb_reco holding on to the registry tdb?

It is the recovery daemon. It keeps all tdbs open in case it needs to
read or write them during a recovery.

Pick one of the smbd processes and check with
ls -l /proc/<smbd-pid>/fd
that it too has /var/lib/ctdb/persistent/registry.tdb.1 open
(and that it does not have a different registry.tdb file open)
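
A rough sketch of that check as a shell function, in case it helps.
The name open_tdbs and the optional proc_root argument are made up
here for illustration (proc_root only exists so the function can be
tried against a fake directory tree); it assumes readlink is
available and that tdb files are named *.tdb or *.tdb.<number>, as
in the lsof output above.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct tdb files a process has open,
# by resolving the symlinks under <proc_root>/<pid>/fd.
# proc_root defaults to /proc.
open_tdbs() {
    pid=$1
    proc_root=${2:-/proc}
    for fd in "$proc_root/$pid/fd"/*; do
        # Every entry in an fd directory is a symlink; skip anything else
        # (including the unexpanded glob when the directory is empty).
        [ -L "$fd" ] || continue
        target=$(readlink "$fd") || continue
        case $target in
            # Matches foo.tdb as well as ctdb's per-node foo.tdb.<pnn> files.
            *.tdb|*.tdb.[0-9]*) printf '%s\n' "$target" ;;
        esac
    done | sort -u
}

# Example use on the broken node (run as root, smbd pid taken from ps):
#   open_tdbs <smbd-pid> | grep registry
```

If the smbd on the broken node prints some other registry.tdb path
than the one ctdbd has open, or none at all, that would point at smbd
not going through ctdb for the registry.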


>
> There is a log entry in /var/log/log.ctdb about a traverse on that file
> starting and ending and an entry about attaching to that database.
>
>
> --
> Regards,
> Richard Sharpe
> (何以解憂?唯有杜康。--曹操)

