[PATCH] samba-tool throws error if there is an empty FSMO role

Rowland Penny repenny241155 at gmail.com
Mon May 2 19:41:38 UTC 2016


On 02/05/16 16:34, Rowland Penny wrote:
> On 08/04/16 22:01, Rowland Penny wrote:
>> On 08/04/16 21:30, Andrew Bartlett wrote:
>>>
>>>
>>> All of the above is entirely possible in 'make test'. Indeed, I missed
>>> it in my first search, much of that part is tested here:
>>> source4/torture/drs/python/fsmo.py
>>
>> I have had a look, it looks like something to build on.
>>
>>>
>>> However, all I'm asking for in this case is that you show that
>>> 'samba-tool fsmo show' produces non-faulting output in all of the
>>> environments we already have, and the output is reasonable. Bonus
>>> points if it checks the results are correct.
>>>
>>
>> I will try.
>>
>>>> Would you like to advise me just how I could do that without ending up
>>>> with a test that was several times the size of fsmo.py and would
>>>> take an excessive time to run?
>>> Tests are often the same size as the code they test, and I don't expect
>>> it to take more than trivial time to run.
>>>
>>> Have a look at timecmd.py and rodc.py.  See how they assert that the
>>> command ran successfully, and see how they check for strings in the out
>>> (output) and err (stderr) variables?
>>>
>>> That is what you need to do.  Then run that in the different
>>> environments by changing source4/selftest/tests.py to run that test in
>>> each of fl2000dc, fl2003dc, fl2008r2dc and vampire_dc
>>
>> I can only try.
>>
>>>
>>>> I personally think that testing fsmo.py in the way you suggest is a
>>>> waste of time, if everything is created correctly (by code that isn't
>>>> in fsmo.py) then fsmo.py will work without my changes, but it seems that
>>>> there are times when everything isn't created correctly and then
>>>> fsmo.py throws an error.
>>> Indeed.  This happens in the real world.  It is great that you patched
>>> it - I already needed to point a client at the patch when they hit this
>>> exact issue!
>>
>> Oh great, you will use my code, but won't put it into Samba :-D
>>
>>>> Tests are all well and good, but only for things that are created
>>>> automatically. The code in fsmo.py shows, transfers or seizes FSMO
>>>> roles, it doesn't (in the first instance) create the owners of these
>>>> roles, bearing this in mind, perhaps the tests you ask for should be
>>>> aimed at other code.
>>>>
>>>> Please bear in mind, whilst we are arguing about this, there is
>>>> faulty code in fsmo.py.
>>> I agree, and I would like it fixed.  I'm also trying to teach you how
>>> to write tests for your code, so that it stays that way.
>>
>> I am a mechanic by trade, and one of the first things you learn is: if it
>> isn't broken, you do not fix it. This is why I am finding it hard to
>> envisage the code to break something and then test for the breakage. :-)
>>
>>>
>>> I know I'm asking you to spend non-trivial additional time on this.  I
>>> realise that I'm stretching you and I know that is frustrating.
>>>
>>> Most of us who work on Samba find ourselves spending as much time on
>>> the tests as the original code, and that has thankfully become part of
>>> our culture, and what makes Samba as great as it is.
>>>
>>> My hope is that you can learn the art and habit of automated testing,
>>> because you have shown great ability to learn the Samba craft already,
>>> and it makes your patches much, much, easier to accept.
>>
>> OK, I understand where you are coming from, and let's see where I go
>> from here. I normally just want to fix things :-)
>>
>> Rowland
>
> So, I finally got time to look at this again. There are no DNS zones
> in my test domain, which was provisioned as 2000, and replication wasn't
> working, so I tried to demote the second DC, but it wouldn't. I then tried
> to demote it with '--remove-other-dead-server' and got this:
>
> root@dc2000a:~# samba-tool domain demote --remove-other-dead-server=dc2000b -Uadministrator
> Removing nTDSConnection: CN=155f1ecb-34bb-4a74-9f76-863282c3af28,CN=NTDS Settings,CN=DC2000A,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=samba,DC=test,DC=tld
> Removing nTDSDSA: CN=NTDS Settings,CN=DC2000B,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=samba,DC=test,DC=tld (and any children)
> Removing computer account: CN=DC2000B,OU=Domain Controllers,DC=samba,DC=test,DC=tld (and any child objects)
> ERROR(ldb): uncaught exception - ../source4/dsdb/samdb/ldb_modules/repl_meta_data.c:3381: Failed to remove backlink of serverReferenceBL when deleting CN=DC2000B,OU=Domain Controllers,DC=samba,DC=test,DC=tld
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/__init__.py", line 175, in _run
>     return self.run(*args, **kwargs)
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/domain.py", line 720, in run
>     remove_dc.remove_dc(samdb, logger, remove_other_dead_server)
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/remove_dc.py", line 423, in remove_dc
>     remove_dns_account=True)
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/remove_dc.py", line 351, in offline_remove_ntds_dc
>     remove_dns_account=remove_dns_account)
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/remove_dc.py", line 251, in offline_remove_server
>     samdb.delete(computer_dn, ["tree_delete:0"])
> A transaction is still active in ldb context [0x141be20] on tdb:///usr/local/samba/private/sam.ldb
>
> I will kill the two DCs and start again, after I upgrade from 4.4.0 to 
> 4.4.3
>
> Rowland
>

OK, I now have two test DCs running 4.4.3. The first was provisioned as
2000 with BIND9_FLATFILE and has no DNS zones in AD. The second has
been joined to the first with DNS NONE.

Replication is not working, for example:

CN=Configuration,DC=samba,DC=test,DC=tld
     Default-First-Site-Name\DC2000B via RPC
         DSA object GUID: f3c576af-a972-4b40-9a35-10bb21e3bf1e
         Last attempt @ Mon May  2 20:08:38 2016 BST failed, result 2 (WERR_BADFILE)
         489 consecutive failure(s).
         Last success @ NTTIME(0)

I am rapidly coming to the conclusion that you would have to be
completely stupid to provision Samba as a 2000 AD DC, it just doesn't work!

As for a test for my code, I have thought about this, surely if 
everything is correct then the code will work and if it doesn't work, it 
is because there is a fault that has nothing to do with the fsmo code.

How do I test if there is an FSMO roleowner? I run 'samba-tool fsmo
show' and with my code it will now tell me. Now that I know there isn't an
owner for an FSMO role, what do I do? I use 'samba-tool fsmo seize
<role> --force'; if it fails, the code will tell me why, e.g. there is no
DNS zone in AD.
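
As a rough sketch only, the "run 'fsmo show' and see which role has no owner" check could be written down something like this. The role labels and the "<label> owner: <DN>" line layout are an assumption about what 'samba-tool fsmo show' prints, and roles_without_owner is a hypothetical helper name, not anything in fsmo.py:

```python
import re

# ASSUMPTION: labels and "<label> owner: <DN>" layout reflect typical
# 'samba-tool fsmo show' output; they are not taken from the patch.
EXPECTED_ROLES = (
    "SchemaMasterRole",
    "InfrastructureMasterRole",
    "RidAllocationMasterRole",
    "PdcEmulationMasterRole",
    "DomainNamingMasterRole",
    "DomainDnsZonesMasterRole",
    "ForestDnsZonesMasterRole",
)

# A role counts as owned only if something non-blank follows "owner:".
OWNER_LINE = re.compile(r"^(\w+) owner:\s*(\S.*)$")


def roles_without_owner(show_output, expected=EXPECTED_ROLES):
    """Return the expected roles that have no 'owner:' line carrying a DN.

    Hypothetical helper: it only parses text that has already been
    captured, it does not run samba-tool itself.
    """
    owned = set()
    for line in show_output.splitlines():
        match = OWNER_LINE.match(line.strip())
        if match:
            owned.add(match.group(1))
    return [role for role in expected if role not in owned]
```

Fed the captured text of 'samba-tool fsmo show', it returns the roles that would then need a seize; an empty list means every expected role reported an owner.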

So Andrew, how do I write a test for something that tests itself every
time it runs? And if there are errors, these errors will probably
have nothing to do with my code.

Or to put it another way, I cannot think how to write a test for fsmo.py 
that doesn't replicate how I tested it before I sent my patch and there 
is no point in doing the same test over and over again, on the off 
chance it will develop an error.

Rowland


