[Samba] Maximum monitor timeout count 20 reached. Making node unhealthy

Isaac Stone isaac.stone at som.com
Wed Apr 7 03:31:33 UTC 2021


We are running clustered Samba + CTDB and ran into this issue while pushing
our new system from dev to prod. We never saw it in dev or staging in six
months of testing, and I have no idea what it means.

We are running a cluster of only one node while we transfer the production
data to the new system, so the box complaining is the only box that exists
as far as ctdb knows (the only entry in the nodes file is itself)

It was down for an hour and a half today, logging this every ~45 seconds:

"Maximum monitor timeout count 20 reached. Making node unhealthy"

Then it recovered.

No idea what happened, and Google turns up nothing.

Samba version 4.13.7-SerNet-RedHat-11.el8
CTDB version 4.13.7-SerNet-RedHat-11.el8

ctdb statistics

CTDB version 1
Current time of statistics  :                Wed Apr  7 03:29:46 2021
Statistics collected since  : (000 09:38:34) Tue Apr  6 17:51:12 2021
 num_clients                       22
 frozen                             0
 recovering                         0
 num_recoveries                     1
 client_packets_sent          1713374
 client_packets_recv          3294796
 node_packets_sent            2582766
 node_packets_recv                  0
 keepalive_packets_sent             0
 keepalive_packets_recv             0
 node
     req_call                       0
     reply_call                     0
     req_dmaster                    0
     reply_dmaster                  0
     reply_error                    0
     req_message                34655
     req_control              2078103
     reply_control             470008
     req_tunnel                     0
 client
     req_call                 1208635
     req_message                34661
     req_control              2051500
     req_tunnel                     0
 timeouts
     call                           0
     control                        0
     traverse                       0
 locks
     num_calls                      8
     num_current                    0
     num_pending                    0
     num_failed                     0
 total_calls                  1208635
 pending_calls                      0
 childwrite_calls                   3
 pending_childwrite_calls             0
 memory_used                  1454200
 max_hop_count                      0
 total_ro_delegations               0
 total_ro_revokes                   0
 hop_count_buckets: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 lock_buckets: 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 locks_latency      MIN/AVG/MAX     0.002119/0.002257/0.002505 sec out of 8
 reclock_ctdbd      MIN/AVG/MAX     0.000000/0.000000/0.000000 sec out of 0
 reclock_recd       MIN/AVG/MAX     4.100998/4.100998/4.100998 sec out of 1
 call_latency       MIN/AVG/MAX     0.000004/0.000013/0.007420 sec out of 1208635
 childwrite_latency MIN/AVG/MAX     0.000869/0.001203/0.001572 sec out of 3
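
The only latency row above that looks unusual to me is reclock_recd at about
4.1 seconds (one sample). I don't know whether that is related to the monitor
timeouts. For comparing these rows across saved `ctdb statistics` dumps, here
is a minimal sketch that parses the MIN/AVG/MAX lines. The function names and
the 1-second threshold are my own assumptions, not anything defined by ctdb.

```python
import re

# Matches rows such as:
#  reclock_recd       MIN/AVG/MAX     4.100998/4.100998/4.100998 sec out of 1
LATENCY_ROW = re.compile(
    r"^\s*(\w+)\s+MIN/AVG/MAX\s+([\d.]+)/([\d.]+)/([\d.]+)\s+sec out of\s+(\d+)\s*$"
)


def parse_latencies(text):
    """Return {name: (min, avg, max, samples)} for each latency row in text."""
    rows = {}
    for line in text.splitlines():
        m = LATENCY_ROW.match(line)
        if m:
            name, lo, avg, hi, count = m.groups()
            rows[name] = (float(lo), float(avg), float(hi), int(count))
    return rows


def slow_rows(rows, threshold_sec=1.0):
    """Names with at least one sample whose MAX exceeds threshold_sec.

    The threshold is an arbitrary assumption chosen for illustration.
    """
    return sorted(
        name
        for name, (_lo, _avg, hi, samples) in rows.items()
        if samples > 0 and hi > threshold_sec
    )


# Latency rows copied from the statistics output in this post.
SAMPLE = """\
 locks_latency      MIN/AVG/MAX     0.002119/0.002257/0.002505 sec out of 8
 reclock_ctdbd      MIN/AVG/MAX     0.000000/0.000000/0.000000 sec out of 0
 reclock_recd       MIN/AVG/MAX     4.100998/4.100998/4.100998 sec out of 1
 call_latency       MIN/AVG/MAX     0.000004/0.000013/0.007420 sec out of 1208635
 childwrite_latency MIN/AVG/MAX     0.000869/0.001203/0.001572 sec out of 3
"""

if __name__ == "__main__":
    print(slow_rows(parse_latencies(SAMPLE)))  # prints ['reclock_recd']
```

The parser expects each latency row on a single line, so a row wrapped by a
mail client has to be rejoined before it will match.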

Any ideas?

