Possible samba4 / winbind memory leak?
aw-sambalists at silverstream.net.nz
Mon Apr 1 18:45:33 MDT 2013
I've got five samba4 deployments across three sites, all now on release 4.0.4.
Two were set up while samba4 was still in alpha, before it gained integrated DNS and smbd filesharing, and each controls its own AD domain. These two are running samba4 and named on one IP, and samba3 listening on another IP (Franky-inspired) without issue - all happy.
The other three are using samba4 with its own internal DNS server and samba3 fileserver both enabled, and with Samba4's libnss_winbind.so.2 copied to /lib64/. All three are on the same domain, one as a DC and two as RODCs (on different sites linked by VPN), and each directly serves only about 3-6 users for now. The DC (and to a lesser extent the RODCs) seems to have memory-leak-induced issues. I'll focus on the DC as it is suffering the worst.
Basically, over the course of several days one or two samba processes will grow and eat up most or all of the 4GB RAM and 4GB swap.
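One way to record this kind of growth over several days is to log per-process memory from /proc at regular intervals. The following is a minimal sketch, not part of the original report: it assumes a Linux /proc filesystem, a kernel that exposes the VmSwap field in /proc/<pid>/status, and that the processes of interest are named "samba". The function names are made up for illustration.

```python
import glob
import re
import time


def parse_status(text):
    """Extract Name, VmRSS and VmSwap (sizes in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(Name|VmRSS|VmSwap):\s+(\S+)", line)
        if m:
            key, value = m.groups()
            fields[key] = value if key == "Name" else int(value)
    return fields


def snapshot(name="samba"):
    """Return (pid, rss_kb, swap_kb) for every process whose Name matches."""
    rows = []
    for path in glob.glob("/proc/[0-9]*/status"):
        try:
            with open(path) as fh:
                info = parse_status(fh.read())
        except OSError:
            continue  # process exited between the glob and the open
        if info.get("Name") == name:
            pid = int(path.split("/")[2])
            # Fields may be absent (older kernels, kernel threads): default to 0
            rows.append((pid, info.get("VmRSS", 0), info.get("VmSwap", 0)))
    return sorted(rows)


if __name__ == "__main__":
    # One snapshot per invocation; run it from cron and append to a log file.
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    for pid, rss_kb, swap_kb in snapshot():
        print(f"{stamp} pid={pid} rss_kb={rss_kb} swap_kb={swap_kb}")
```

Comparing the logged values for one PID across days should show whether a single process is growing steadily or whether the growth comes in steps.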
This week I've run it with --leak-report-full, but when I killall samba on the DC the biggest memory eating process doesn't die.
This is the top output showing the process that won't die, taken about two or three minutes after the killall samba:
top - 10:04:28 up 10 days, 25 min, 2 users, load average: 0.00, 0.99, 2.29
Tasks: 118 total, 1 running, 117 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 0.7%sy, 0.0%ni, 92.3%id, 6.9%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3922892k total, 2575808k used, 1347084k free, 7488k buffers
Swap: 4194296k total, 2053216k used, 2141080k free, 419996k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3268 root 20 0 4122m 1.6g 2656 S 0.0 43.3 1087:51 samba
The high 15min load average is from before the killall samba when things were swapping around a fair bit.
You can see from the amount of free mem vs swap used that it was heavily loaded before the killall (sorry, I forgot to take a snapshot of top before doing the killall). Interestingly, this process is completely idle. Because it won't respond to a SIGTERM, I don't think I'm going to get a leak report from it.
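To see how a process that does not exit on SIGTERM is handling that signal, the SigBlk, SigIgn and SigCgt masks in /proc/<pid>/status can be decoded. This is a rough sketch under the assumption of a Linux /proc filesystem; the helper names are made up for illustration, and it only reports the signal disposition, not why a handler fails to finish.

```python
import signal
import sys


def signal_in_mask(mask_hex, signum):
    """True if signum is set in a hex mask from /proc/<pid>/status.

    Bit 0 of the mask corresponds to signal 1, so signal N is bit N-1.
    """
    return bool(int(mask_hex, 16) >> (signum - 1) & 1)


def sigterm_disposition(status_text):
    """Report whether SIGTERM is blocked, ignored or caught by the process."""
    result = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("SigBlk", "SigIgn", "SigCgt"):
            result[key] = signal_in_mask(value.strip(), signal.SIGTERM)
    return result


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python sigcheck.py <pid>   (the script name is hypothetical)
    with open(f"/proc/{sys.argv[1]}/status") as fh:
        print(sigterm_disposition(fh.read()))
```

If SigCgt is set for SIGTERM, the process has a handler installed and the signal is delivered to it, which would point at the handler never completing. If SigBlk or SigIgn is set, the signal is being held back or discarded.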
Once the offending samba process was the only one left running, this is the netstat -pln | grep samba output:
unix 2 [ ACC ] STREAM LISTENING 14610435 3268/samba /srv/adsrv/var/run/samba/winbindd/pipe
unix 2 [ ACC ] STREAM LISTENING 14610437 3268/samba /srv/adsrv/var/lib/samba/winbindd_privileged/pipe
This suggests to me there's some memory leak somewhere in the samba winbind code? Note the offending process is not listening on TCP at all.
The leak reports I *did* get contain some sensitive information. The text file comes in at 37MB which is difficult to scrub. Please email me if you want to see them.
Any ideas what might be happening here?