[PYTHON] Possible bugs revealed by socket_wrapper

Noel Power nopower at suse.com
Tue Nov 13 16:34:31 UTC 2018


On 13/11/2018 15:30, Andreas Schneider wrote:
> If it is really a socket_wrapper bug, why does it work on my opensuse install
> but not work on Ubuntu. And this happens only in python processes. I don't see
> that in any other process.

Hmm, you might as well ask why the bug only appears when it is raining
on Wednesdays at the same time as I scratch my ear ;-) Sometimes that's
just the way it is, and like I said, I can see something at least
similar to this on Leap 15.

On 13/11/2018 16:10, Andreas Schneider wrote:
> On Tuesday, 13 November 2018 15:50:10 CET Noel Power wrote:
>> Hi Andreas
>>
>> I think maybe the bug lies in swrap, looking at a strace of the make
>> testenv I see
>>
>> 25607 fcntl(5, F_GETFL) = 0x2 (flags O_RDWR)
>> 25607 fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>> 25607 brk(0x562ecc3e3000)               = 0x562ecc3e3000
>> 25607 getpid()                          = 25607
>> 25607 ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
>> 25607 fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
>> 25607 readlink("/proc/self/fd/0", "/dev/pts/3", 4095) = 10
>> 25607 stat("/dev/pts/3", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
>> 25607 getpid()                          = 25607
>> 25607 write(2, "SWRAP_ERROR[tmux (25607)] - find"..., 132) = 132
>> 25607 close(2147483647)                 = -1 EBADF (Bad file descriptor)
>> 25607 fcntl(0, F_GETFL)                 = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
>> 25607 fcntl(0, F_SETFL, O_RDWR|O_APPEND|O_NONBLOCK|O_LARGEFILE) = 0
>> 25607 dup(0)                            = 6
>> 25607 getpid()                          = 25607
>>
>> The close "25607 close(2147483647)                 = -1 EBADF (Bad file
>> descriptor)" seems a bit suspicious and I don't see the high numbers of
>> fd(s) really opened when looking through the strace (I attach the strace
>> for you )
>>
>> strace was generated from
>>
>>        strace -f -o chgdcpass.strace make testenv SELFTEST_TESTENV=chgdcpass SCREEN=1
> How did you get strace working? Here it complains that it doesn't have the
> permissions to run it.

Just (with your branch, a simple ./configure.developer && make) then

a) tmux

b) strace -f -o chgdcpass.strace make testenv SELFTEST_TESTENV=chgdcpass SCREEN=1

c) in the tmux window (well, my shell with the green tmux status bar at
the bottom)

I get the chgdcpass environment set up (presumably that is what is
expected, I don't know; previously I have only ever used
SELFTEST_TESTENV=blah without SCREEN). See attached, taken just before
I hit return on exit.
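
To pick calls like that close(2147483647) out of a large strace file, a
small filter might help. This is only a rough sketch: the function name
suspicious_closes and the regular expression are made up for
illustration, the default threshold of 262140 is simply the limit printed
in the SWRAP_ERROR message, and the pattern assumes the
"pid syscall(args) = ret" layout that strace -f -o produces.

```python
import re

# Matches strace -f output lines such as:
#   25607 close(2147483647)                 = -1 EBADF (Bad file descriptor)
CLOSE_RE = re.compile(r"^(\d+)\s+close\((\d+)\)\s+=\s+(-?\d+)(?:\s+(\w+))?")


def suspicious_closes(lines, max_fd=262140):
    """Yield (pid, fd, errno_name) for close() calls on fds above max_fd.

    errno_name is None when the close() succeeded.
    """
    for line in lines:
        m = CLOSE_RE.match(line)
        if m is None:
            continue
        pid, fd, _ret, err = m.groups()
        if int(fd) > max_fd:
            yield int(pid), int(fd), err


# Three lines in the same shape as the trace quoted in this thread.
sample = [
    "25607 getpid()                          = 25607",
    "25607 close(2147483647)                 = -1 EBADF (Bad file descriptor)",
    "25607 dup(0)                            = 6",
]
hits = list(suspicious_closes(sample))
for pid, fd, err in hits:
    print(pid, fd, err)
```

For a real run, pass the open file object for chgdcpass.strace instead of
the sample list.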

>
> You should trace:
>
> python ./bin/samba-tool domainprovision
>   --configfile=/home/gitlab-runner/samba/st/chgdcpass/etc/smb.conf
>   --host-name=chgdcpass --host-ip=127.0.0.32 --quiet
>   --domain=CHDCDOMAIN --realm=CHGDCPASSWORD.SAMBA.EXAMPLE.COM
>   --domain-sid=S-1-5-21-1879926001-2232171230-2460190490
>   --adminpass=chgDCpass1 --krbtgtpass=krbtgtchgDCpass1
>   --machinepass=machinechgDCpass1 --root=gitlab-runner
>   --server-role=domain controller --function-level=2008
>   --dns-backend=BIND9_DLZ
>
Why? Simply exiting shows the error for me and generates the close of an
illegal maxint value (2147483647). If there is a simpler reproducer, then
surely that is the one to chase? I will try the python command above
though and see if I can generate a strace for you.
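
Incidentally, 2147483647 is INT_MAX for a 32-bit int, and EBADF is what
the kernel returns for any descriptor that is not open. A minimal sketch
to confirm that in isolation, assuming a plain Python on Linux with no
socket_wrapper preloaded (close_errno is a hypothetical helper, not
something from samba or socket_wrapper):

```python
import errno
import os

INT_MAX = 2**31 - 1  # 2147483647, the fd value seen in the strace


def close_errno(fd):
    """Try to close fd; return the errno name on failure, None on success."""
    try:
        os.close(fd)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


result = close_errno(INT_MAX)
print(INT_MAX, result)
```

This only shows that the close itself is harmless; it says nothing about
who is passing INT_MAX as an fd in the first place.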


Noel

>> On 13/11/2018 14:38, Noel Power via samba-technical wrote:
>>> hmm, I get something similar and believe I can reproduce this on leap 15
>>>
>>> using your branch
>>>
>>> tmux
>>>
>>> make testenv SELFTEST_TESTENV=chgdcpass SCREEN=1
>>>
>>> exit # exiting without running any test
>>>
>>> and then I get...
>>>
>>>
>>> [...]
>>>
>>> 127.0.0.32 CHGDCPASS<00>
>>> waiting for working LDAP and a RID Set to be allocated
>>> checking the NETLOGON for domain[CHDCDOMAIN] dc connection to
>>> "chgdcpass.chgdcpassword.samba.example.com" succeeded
>>> SAMBA LOG of: CHGDCPASS pid 25248
>>> SWRAP_ERROR[tmux (25261)] - find_socket_info_index: The max socket
>>> index limit of 262140 has been reached, trying to add 2147483647
>>> SWRAP_ERROR[tmux (25320)] - find_socket_info_index: The max socket
>>> index limit of 262140 has been reached, trying to add 2147483647
>>> teardown_env(chgdcpass)
>>> samba child process 25248 exited with value 0
>>>
>>>
>>> ALL OK (0 tests in 0 testsuites)
>>>
>>> A summary with detailed information can be found in:
>>>    ./st/summary
>>> TOP 10 slowest tests
>>> 'testonly' finished successfully (30.167s)
>>>
>>> On 12/11/2018 17:42, Andreas Schneider via samba-technical wrote:
>>>> Hi,
>>>>
>>>> we have socket_wrapper 1.2.0 with threading support ready. So I've
>>>> pushed the changes to the gitlab CI to see if everything is working.
>>>>
>>>> It doesn't work and revealed some bugs :-)
>>>>
>>>> New in socket_wrapper 1.2.0 is that we have a default limit of
>>>> handling 65535 open sockets! This can be increased if needed. However
>>>> if we try to setup the chgdcpass environment samba-tool wants to open
>>>> 1 million sockets!
>>>>
>>>> It is running:
>>>>
>>>> python ./bin/samba-tool domainprovision
>>>>   --configfile=/home/gitlab-runner/samba/st/chgdcpass/etc/smb.conf
>>>>   --host-name=chgdcpass --host-ip=127.0.0.32 --quiet
>>>>   --domain=CHDCDOMAIN --realm=CHGDCPASSWORD.SAMBA.EXAMPLE.COM
>>>>   --domain-sid=S-1-5-21-1879926001-2232171230-2460190490
>>>>   --adminpass=chgDCpass1 --krbtgtpass=krbtgtchgDCpass1
>>>>   --machinepass=machinechgDCpass1 --root=gitlab-runner
>>>>   --server-role=domain controller --function-level=2008
>>>>   --dns-backend=BIND9_DLZ
>>>>
>>>> Which after some time starts to report that it reached the limit!
>>>>
>>>> ...
>>>> SWRAP_ERROR[python (20259)] - find_socket_info_index: The max socket
>>>> index limit of 65535 has been reached, trying to add 65855
>>>> SWRAP_ERROR[python (20259)] - find_socket_info_index: The max socket
>>>> index limit of 65535 has been reached, trying to add 65856
>>>> ...
>>>>
>>>>
>>>> This is either a python bug in Ubuntu 14.04 leaking file descriptors
>>>> (sockets) or an issue with our code and that python version.
>>>>
>>>>
>>>>
>>>> I'm not able to reproduce this locally on a modern Linux system,
>>>> Fedora 29!
>>>>
>>>>
>>>>
>>>>
>>>> Reproducer:
>>>> docker run -ti registry.gitlab.com/samba-team/samba:latest /bin/bash
>>>>
>>>> sudo -i
>>>> apt-get install tmux
>>>> logout
>>>>
>>>> tmux
>>>> git clone git://git.samba.org/asn/samba.git
>>>> cd samba
>>>> git checkout master-swrap
>>>> ./configure --enable-developer
>>>> make -j8
>>>>
>>>> make -j8 testenv SELFTEST_TESTENV=chgdcpass SCREEN=1
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: strace.png
Type: image/png
Size: 141424 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20181113/022fcd8c/strace-0001.png>

