[Samba] Homes shares randomly disappear on AD DCs

Achim Gottinger achim at ag-web.biz
Wed Jul 23 02:46:15 MDT 2014


On 15.07.2014 09:18, Achim Gottinger wrote:
> On 10.07.2014 12:13, Achim Gottinger wrote:
>> On 09.07.2014 12:58, Achim Gottinger wrote:
>>> On 09.07.2014 11:29, Achim Gottinger wrote:
>>>> On 09.07.2014 11:08, Jonathan Buzzard wrote:
>>>>> On Wed, 2014-07-09 at 10:42 +0200, Achim Gottinger wrote:
>>>>>
>>>>> [SNIP]
>>>>>
>>>>>>   I use unscd for caching, restarted it but it did not help.
>>>>> I take it that you missed the big warnings not to use nscd in
>>>>> combination with winbind? You are aware that winbind does its own
>>>>> caching?
>>>>>
>>>>> I would suggest your first port of call is to disable unscd and
>>>>> see if the problem goes away.
>>>>>
>>>>> JAB.
>>>>>
>>>> Thank you for the tip, I disabled it at all four locations. I also
>>>> used unscd on the main site, which always ran rock solid.
>>>>
>>>> Restarting samba on the branches with winbind/nss issues fixed the
>>>> wbinfo/getent passwd tests for a few minutes, but now they do not
>>>> resolve again. I will have to watch it with unscd disabled now.
>>>> I am thinking about downgrading to 4.1.4, which also had the issues,
>>>> but they appeared only once a week and not every few hours.
>>>>
>>>> achim~
>>>>
>>> I had to restart samba a few more times in the meantime. I was able
>>> to make it fail by running wbinfo -u a few times. Since the servers
>>> in the branches are all VMs with 1 GB, I increased the memory to
>>> 3 GB, and since then I have not been able to make samba fail with
>>> wbinfo -u. I hope that did the trick.
>>>
>> So far no more [homes] dropouts with 3 GB memory assigned. Also,
>> wbinfo -u and getent passwd work flawlessly. I skimmed through the
>> saved log files from yesterday looking for anything memory related,
>> but I cannot find anything. There are also no signs such as OOM kills
>> in syslog in that timeframe.
>> The VMs had 4 GB swap space assigned, which showed usage in the range
>> of a few MB.
>> I would have expected slowdowns due to swapping, but not a silent
>> dropping of shares, if a server runs out of memory.
>>
>> achim
> After it worked on Friday, Saturday and Monday, this morning they
> disappeared at our main site for the first time. This VM has 6 GB
> memory and 4 CPU cores assigned, and it is the first time the [homes]
> share stopped working there. Even after restarting samba, wbinfo -u
> and wbinfo -g sometimes take up to 30 seconds to enumerate
> users/groups.
>
> achim~
>
So far the issue has reappeared on our main site last Friday at around
9am and again multiple times today since 9:15am. It has not appeared on
the branches since I increased memory to 3 GB.
People start calling because their home directories are no longer
accessible. Not all accounts seem to be affected, and others can
continue to work for a while.

wbinfo -u reports "Error looking up domain users".

Reloading the samba services does not help; I have to restart them. It's
difficult to track down the issue because the server is in production
and must get back into a working state asap.

I also noticed that wbinfo -u sometimes takes a long time to report
results. This is a snippet of an strace showing a few timeouts while
trying to access /var/run/samba/winbindd/pipe.

Any suggestions on how I can track the issue down are welcome.
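
Since the server has to be restarted quickly when the problem hits, one way to narrow down when lookups start to slow down or fail is to run a small probe periodically (for example from cron) and keep its output. The following is only a minimal sketch: the function names `probe` and `log_line` are made up for illustration, and it assumes that wbinfo exits with a non-zero status when it reports "Error looking up domain users".

```python
import subprocess
import time


def probe(cmd, timeout=60.0):
    """Run cmd once and return (status, seconds, detail)."""
    start = time.monotonic()
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout)
    except subprocess.TimeoutExpired:
        # The command did not finish within `timeout` seconds.
        return ("timeout", time.monotonic() - start, "")
    except FileNotFoundError:
        # The binary is not installed on this host.
        return ("missing", time.monotonic() - start, cmd[0])
    elapsed = time.monotonic() - start
    if result.returncode != 0:
        # Assumption: a failed lookup gives a non-zero exit status. The
        # message may be on stderr or stdout, so keep whichever is set.
        detail = (result.stderr or result.stdout).strip()
        return ("error", elapsed, detail)
    return ("ok", elapsed, "%d lines" % len(result.stdout.splitlines()))


def log_line(name, status, elapsed, detail):
    """Format one timestamped result line."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return "%s %s %s %.2fs %s" % (stamp, name, status, elapsed, detail)


if __name__ == "__main__":
    # The same three checks used manually in this thread.
    checks = {
        "wbinfo-u": ["wbinfo", "-u"],
        "wbinfo-g": ["wbinfo", "-g"],
        "getent-passwd": ["getent", "passwd"],
    }
    for name, cmd in checks.items():
        print(log_line(name, *probe(cmd)))
```

Appending the output to a file every minute or so would give a timeline of response times that can be compared with the samba logs around the moment the [homes] share disappears.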

Thanks in advance,
achim~

connect(3, {sa_family=AF_FILE, path="/var/run/samba/winbindd/pipe"}, 110) = 0

poll([{fd=3, events=POLLIN|POLLOUT|POLLHUP}], 1, -1) = 1 ([{fd=3, revents=POLLOUT}])

write(3, "0\10\0\0\0\0\0\0\0\0\0\0\306|\0\0\0\10\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2096) = 2096

poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])

read(3, "\250\r\0\0\2\0\0\0\33\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3496) = 3496

poll([{fd=3, events=POLLIN|POLLOUT|POLLHUP}], 1, -1) = 1 ([{fd=3, revents=POLLOUT}])

write(3, "0\10\0\0/\0\0\0\0\0\0\0\306|\0\0\0\10\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2096) = 2096

poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])

read(3, "\313\r\0\0\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3496) = 3496

poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])

read(3, "/var/lib/samba/winbindd_privileg"..., 35) = 35

lstat("/var/lib/samba/winbindd_privileged", {st_mode=S_IFDIR|0750, st_size=4096, ...}) = 0

lstat("/var/lib/samba/winbindd_privileged/pipe", {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0

socket(PF_FILE, SOCK_STREAM, 0)         = 4

fcntl(4, F_GETFL)                       = 0x2 (flags O_RDWR)

fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK)    = 0

fcntl(4, F_GETFD)                       = 0

fcntl(4, F_SETFD, FD_CLOEXEC)           = 0

connect(4, {sa_family=AF_FILE, path="/var/lib/samba/winbindd_privileged/pipe"}, 110) = 0

close(3)                                = 0

poll([{fd=4, events=POLLIN|POLLOUT|POLLHUP}], 1, -1) = 1 ([{fd=4, revents=POLLOUT}])

write(4, "0\10\0\0\22\0\0\0\0\0\0\0\306|\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2096) = 2096

poll([{fd=4, events=POLLIN|POLLHUP}], 1, 5000) = 0 (Timeout)

poll([{fd=4, events=POLLIN|POLLHUP}], 1, 5000) = 0 (Timeout)

poll([{fd=4, events=POLLIN|POLLHUP}], 1, 5000) = 0 (Timeout)

poll([{fd=4, events=POLLIN|POLLHUP}], 1, 5000) = 0 (Timeout)

poll([{fd=4, events=POLLIN|POLLHUP}], 1, 5000) = 0 (Timeout)

poll([{fd=4, events=POLLIN|POLLHUP}], 1, 5000) = 1 ([{fd=4, revents=POLLIN}])

read(4, "\24\20\0\0\2\0\0\0\236\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3496) = 3496

poll([{fd=4, events=POLLIN|POLLHUP}], 1, 5000) = 1 ([{fd=4, revents=POLLIN}])
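
In the trace above, the consecutive 5000 ms poll timeouts all occur on fd 4, which is connected to /var/lib/samba/winbindd_privileged/pipe, while the earlier exchange on fd 3 (/var/run/samba/winbindd/pipe) is answered promptly. For longer traces, counting these by hand gets tedious. The following is a minimal sketch of a helper that sums the poll timeouts per file descriptor; the name `summarize` and the regular expressions are illustrative and only assume strace lines in the format shown above.

```python
import re
import sys

# Matches lines such as:
#   poll([{fd=4, events=POLLIN|POLLHUP}], 1, 5000) = 0 (Timeout)
POLL_RE = re.compile(r"^poll\(\[\{fd=(\d+),.*?\}\], \d+, (-?\d+)\)\s+= (\d+)")
# Matches lines such as:
#   connect(4, {sa_family=AF_FILE, path="/var/.../pipe"}, 110) = 0
CONNECT_RE = re.compile(r'^connect\((\d+), \{.*?path="([^"]+)"')


def summarize(trace_lines):
    """Return {fd: {"path": str, "timeouts": int, "waited_ms": int}}."""
    stats = {}
    for line in trace_lines:
        line = line.strip()
        m = CONNECT_RE.match(line)
        if m:
            # Remember which socket path each descriptor is connected to.
            stats[int(m.group(1))] = {
                "path": m.group(2), "timeouts": 0, "waited_ms": 0}
            continue
        m = POLL_RE.match(line)
        # poll() returning 0 means the timeout expired with no event.
        if m and int(m.group(3)) == 0:
            fd, timeout_ms = int(m.group(1)), int(m.group(2))
            entry = stats.setdefault(
                fd, {"path": "?", "timeouts": 0, "waited_ms": 0})
            entry["timeouts"] += 1
            entry["waited_ms"] += max(timeout_ms, 0)
    return stats


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 this_script.py saved_strace_output.txt
    with open(sys.argv[1]) as handle:
        for fd, info in sorted(summarize(handle).items()):
            print("fd=%d %s: %d timeouts, %.1fs waited" % (
                fd, info["path"], info["timeouts"],
                info["waited_ms"] / 1000.0))
```

Applied to the snippet above, this counts five timeouts on fd 4, i.e. 25 seconds spent waiting for a reply on the privileged pipe, which is in line with the delays of up to 30 seconds seen with wbinfo -u.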



