registry config not being updated on one of my clusters ... how do I troubleshoot this

Richard Sharpe realrichardsharpe at gmail.com
Wed Dec 10 11:45:37 MST 2014


On Wed, Dec 10, 2014 at 10:35 AM, Richard Sharpe
<realrichardsharpe at gmail.com> wrote:
> On Wed, Dec 10, 2014 at 10:06 AM, ronnie sahlberg
> <ronniesahlberg at gmail.com> wrote:
>> On Wed, Dec 10, 2014 at 12:27 PM, Richard Sharpe
>> <realrichardsharpe at gmail.com> wrote:
>>> On Tue, Dec 9, 2014 at 10:11 PM, Michael Adam <obnox at samba.org> wrote:
>>>> On 2014-12-10 at 07:09 +0100, Volker Lendecke wrote:
>>>>> On Wed, Dec 10, 2014 at 07:01:01AM +0100, Michael Adam wrote:
>>>>> > On 2014-12-10 at 06:55 +0100, Volker Lendecke wrote:
>>>>> > > On Tue, Dec 09, 2014 at 12:35:34PM -0800, Richard Sharpe wrote:
>>>>> > > >
>>>>> > > > Curious.
>>>>> > > >
>>>>> > > > When I do ctdb cattdb registry.tdb on both nodes, there is no
>>>>> > > > difference, but node 1 shows nothing when I do net conf list.
>>>>> > >
>>>>> > > Missing include=registry in smb.conf on one node?
>>>>> >
>>>>> > That would not affect "net conf list".
>>>>> >
>>>>> > "net conf" is just a special path to part
>>>>> > of the registry irrespective of whether
>>>>> > that part is actually used as configuration.
>>>>> > (Much like cat and other tools would be used
>>>>> > to operate on a smb.conf.)
>>>>>
>>>>> But a missing "clustering=yes" would affect net conf list...
>>>>
>>>> It would, because "net conf list" lists the contents of the
>>>> smbconf key from registry.tdb. And "clustering = yes" changes
>>>> how registry is accessed - as local tdb file or as clustered
>>>> database via ctdb.
>>>
>>> However, this is the content of the smb.conf on both nodes, and
>>> certainly on the one where the listing is not happening:
>>>
>>> [global]
>>>         clustering = yes
>>>         config backend = registry
>>>
>>
>> On the broken node,
>> check in /proc/<smbd>/fd and /proc/<ctdbd>/fd
>> that they both point to the same file for the tdb files
>
> Hmmm, how do I know which one points to which tdbs?
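
One way to see which descriptor points at which tdb is to read the symlinks under /proc/<pid>/fd directly. The following is only a minimal sketch in Python, not anything shipped with Samba or ctdb: the function name open_tdb_files and the proc_root parameter are hypothetical, and it assumes a Linux /proc layout and enough privilege (typically root) to read another process's fd directory.

```python
import os


def open_tdb_files(pid, proc_root="/proc"):
    """Return {fd_number: target_path} for descriptors of `pid` that mention '.tdb'.

    `proc_root` is a hypothetical parameter so the function can be pointed
    at a fake directory tree; on a real system leave it at "/proc".
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    result = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry in /proc/<pid>/fd is a symlink to the open file.
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if ".tdb" in target:
            result[int(name)] = target
    return result
```

Calling this once with the smbd PID and once with the ctdbd PID and comparing the returned paths shows whether both daemons have the same registry tdb file open.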

On the node where I am having problems, I see the following:

# ctdb status
Number of nodes:2
pnn:0 172.16.170.120   OK
pnn:1 172.16.170.121   OK (THIS NODE)
Generation:1188776196
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:NORMAL (0)
Recovery master:0
root@localhost:~
# ps ax | grep ctdb
 6634 ?        SLs    0:00 /usr/sbin/ctdbd
--reclock=/hf-smb/ctdb/reclock --pidfile=/var/run/ctdb/ctdbd.pid
--logfile=/var/log/log.ctdb --nlist=/etc/ctdb/nodes
--public-addresses=/etc/ctdb/public_addresses -d NOTICE
 6816 ?        S      0:00 /usr/sbin/ctdbd
--reclock=/hf-smb/ctdb/reclock --pidfile=/var/run/ctdb/ctdbd.pid
--logfile=/var/log/log.ctdb --nlist=/etc/ctdb/nodes
--public-addresses=/etc/ctdb/public_addresses -d NOTICE
12569 pts/0    S+     0:00 grep ctdb
root@localhost:~
# lsof | grep ctdb | grep registry
ctdbd      6634    root  mem       REG              253,3  1310720 2360023 /var/lib/ctdb/persistent/registry.tdb.1
ctdbd      6634    root   19u      REG              253,3  1310720 2360023 /var/lib/ctdb/persistent/registry.tdb.1
ctdb_reco  6816    root  mem       REG              253,3  1310720 2360023 /var/lib/ctdb/persistent/registry.tdb.1
ctdb_reco  6816    root   19u      REG              253,3  1310720 2360023 /var/lib/ctdb/persistent/registry.tdb.1
root@localhost:~
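
To check that two processes really have the same file open, it is the device and inode columns of the lsof output that matter, not just the path. Below is a rough Python sketch that groups lsof rows by (device, inode, name). The helper name group_by_file is made up for illustration, and it assumes the nine-column layout seen in the output above (COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME), each row on a single line; other lsof versions or options may print different columns.

```python
from collections import defaultdict

# Sample rows taken from the lsof output above, one row per line.
SAMPLE = """\
ctdbd      6634    root  mem       REG              253,3  1310720 2360023 /var/lib/ctdb/persistent/registry.tdb.1
ctdbd      6634    root   19u      REG              253,3  1310720 2360023 /var/lib/ctdb/persistent/registry.tdb.1
ctdb_reco  6816    root  mem       REG              253,3  1310720 2360023 /var/lib/ctdb/persistent/registry.tdb.1
ctdb_reco  6816    root   19u      REG              253,3  1310720 2360023 /var/lib/ctdb/persistent/registry.tdb.1
"""


def group_by_file(lsof_text):
    """Map (device, inode, name) -> set of (command, pid) that have it open."""
    files = defaultdict(set)
    for line in lsof_text.splitlines():
        fields = line.split()
        # Skip short lines and the header row (whose PID column is not numeric).
        if len(fields) < 9 or not fields[1].isdigit():
            continue
        command, pid = fields[0], int(fields[1])
        device, inode = fields[5], fields[7]
        name = " ".join(fields[8:])
        files[(device, inode, name)].add((command, pid))
    return dict(files)


if __name__ == "__main__":
    for key, holders in group_by_file(SAMPLE).items():
        print(key, sorted(holders))
```

For the rows above this yields a single key, device 253,3 and inode 2360023, held by both PID 6634 and PID 6816, so on this node both processes have the same registry.tdb.1 open.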


Why is the process ctdb_reco holding on to the registry tdb?

There are log entries in /var/log/log.ctdb about a traverse on that file
starting and ending, and an entry about attaching to that database.


-- 
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)

