[Samba] Samba fsmo/demote/unjoin trouble after crash

Giedrius giedrius+samba at su.lt
Tue May 28 01:44:49 MDT 2013


Fixed this mess.
In case anybody else needs this:
    1) Run samba_backup on a working, good DC :)
    2) On the bad server: rm -rfv private/* var/{lock,locks}/*.{t,l}db
    3) Rejoin with the same name *and* the same site it was on.
    4a) TRY to demote: with luck this will work... but it did not for me.
    4b) samba-tool dbcheck --cross-ncs --fix --yes
            Search for registered DCs:
                ldbsearch "(invocationid=*)" objectguid
            Search for entries of your bad DC:
                ldbsearch "(objectguid=<GUID_FROM_BAD_SERVER>)"
            Here I got only 1 entry, the NTDS Settings object (maybe
            there should be more?).
            Only after I deleted NTDS Settings *was* I *able* to delete
            the server from the database (with Windows DSA tools):
                ldbdel "CN=NTDS Settings,CN=<SERVER_NAME>,CN=Servers,CN=<SITE_NAME>,CN=Sites,CN=Configuration,<YOUR_DOMAIN>"
            (<YOUR_DOMAIN> in the form DC=DOMAIN,DC=EXAMPLE,DC=COM)
            You now *can* delete the server from Sites & Services AND
            Users & Computers.
            Run samba-tool dbcheck --cross-ncs --fix --yes again (it found
            nothing for me, but who knows).
    5) Rejoin your bad server again (if it *is* needed).
    6) Everything is working flawlessly now.
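
The long DN passed to ldbdel in step 4b is easy to mistype, so here is a
minimal sketch of building it from its parts. The function names
(domain_to_dn, ntds_settings_dn) are hypothetical helpers, not part of
Samba, and the sketch assumes the domain's base DN maps directly from its
DNS name (example.com -> DC=example,DC=com).

```shell
#!/bin/sh
# Hypothetical helpers (not part of Samba) for assembling the DN from step 4b.

# domain_to_dn DNS_DOMAIN
# Turns a DNS domain such as example.com into DC=example,DC=com.
# Assumption: the base DN follows the DNS name one label per DC= component.
domain_to_dn() {
    printf '%s\n' "$1" | sed -e 's/\./,DC=/g' -e 's/^/DC=/'
}

# ntds_settings_dn SERVER_NAME SITE_NAME DNS_DOMAIN
# Prints the DN of the NTDS Settings object below the server's site entry.
ntds_settings_dn() {
    printf 'CN=NTDS Settings,CN=%s,CN=Servers,CN=%s,CN=Sites,CN=Configuration,%s\n' \
        "$1" "$2" "$(domain_to_dn "$3")"
}

# Example (only prints the DN; review it before handing it to ldbdel):
#   ntds_settings_dn DC1 Default-First-Site-Name example.com
```

Printing the DN first and checking it by eye before running ldbdel seems
the safer order, since a delete on the wrong object is hard to undo.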

    Side note:
            ldbsearch / ldbedit / ldbdel DID NOT WORK for me with
            Kerberos (-k yes), though kinit is fine, so use them like this:
                ldbsearch -UAdministrator --password <your password> --cross-ncs -H ldap://localhost ..............
            All ldb* and dbcheck commands were run from a *running*,
            *good* DC.

            If dbcheck complains about a bad owner GUID on NTDS Settings,
            you might have duplicated msDS-hasMasterNCs values, and
            dbcheck is *NOT* fixing this.
            Just delete the duplicated lines (for me these were
            ForestDnsZones and DomainDnsZones) with ldbedit; otherwise
            Samba will keep crashing with SIGSEGV.
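
A rough way to spot such duplicates before opening ldbedit is to filter the
LDIF that ldbsearch prints. This is only a sketch: find_dup_master_ncs is a
hypothetical helper, and it assumes the input covers a single NTDS Settings
object, that values are not wrapped or base64-encoded, and that duplicates
match character for character.

```shell
#!/bin/sh
# Hypothetical helper (not part of Samba).
# Reads LDIF on stdin and prints every msDS-hasMasterNCs line that appears
# more than once. Assumes one object's attributes only; fed several NTDS
# Settings objects, the same NC would show up as a false "duplicate".
find_dup_master_ncs() {
    grep -i '^msDS-hasMasterNCs:' | sort | uniq -d
}

# Possible use against a running good DC (placeholders as in the post):
#   ldbsearch -H ldap://localhost -UAdministrator --cross-ncs \
#       -s base -b "<NTDS Settings DN>" "(objectClass=*)" msDS-hasMasterNCs \
#       | find_dup_master_ncs
```

Empty output means no exact duplicates were found in what was piped in; it
does not prove the object is healthy.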

            One of the DCs was not able to replicate after the first
            rejoin; a delete was needed.
            Double, triple, or even more *check the netbios name = line
            in your smb.conf*: this is how I got 2 DC names in the
            database (but only 1 join).
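
For that smb.conf check, a small sketch that prints the configured value so
it can be compared with the name the DC was joined under.
netbios_name_from is a hypothetical helper; it assumes the parameter is
written in lower case, and it does not strip trailing whitespace or
follow include directives.

```shell
#!/bin/sh
# Hypothetical helper (not part of Samba).
# netbios_name_from FILE
# Prints the value of the first "netbios name = ..." line in FILE.
netbios_name_from() {
    sed -n 's/^[[:space:]]*netbios[[:space:]]*name[[:space:]]*=[[:space:]]*//p' "$1" \
        | head -n 1
}

# Example (the path is an assumption; use your own smb.conf location):
#   netbios_name_from /usr/local/samba/etc/smb.conf
```

If nothing is printed, the file has no matching line and Samba falls back
to its default, which is worth knowing before a rejoin as well.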

            Demote *will not work* if your bad server has DNS zones
            configured (in Samba's LDAP).
            Demote complains about *2 roles still on server*, but does
            not list which ones (presumably ForestDnsZones and
            DomainDnsZones).

    Thanks, all, for the help.

2013.05.21 00:46, Andrew Bartlett wrote:
> On Wed, 2013-05-15 at 10:09 +0300, Giedrius wrote:
>> 2013.05.14 18:48, Denis Cardon wrote:
>>> Hi Giedrius,
>>>
>>>>      I've got an initial setup on DC1 (4.0.1)... all working well and
>>>> flawlessly.
>>>>      Added additional geographically distributed controllers (DC2, DC3,
>>>> DC4, DC5) with 4.0.5 - no problem.
>>>>      All PCs can connect to their own site/DC.
>>>>
>>>>      Transferred all FSMO roles to DC2 - transferred successfully (with
>>>> the seize "error" bug).
>>>>      DC1 crashed badly... During maintenance, Samba was updated to
>>>> 4.0.5 and the data restored from backup.
>>>>
>>>>      Now, the problem is:
>>>>          1) DC1 sees itself as owner of all FSMO roles, although
>>>> DC[2,3,4,5] see DC2 as owner of the FSMO roles
>>>>          3) DC1 is missing some users (created between backup and crash);
>>>> wbinfo for these users returns E_DOMAIN_NOT_FOUND
>>>>          4) Got "decrypt integrity check failed" errors, fixed with
>>>> chgtdcpass, which now results in "Failed to find HOST$#DOMAIN(kvno)"
>>>> (client reboot seems to fix this)
>>>>          4) any attempt to replicate missing information from DC2/DC3 to
>>>> DC1 (samba-tool drs replicate) results in errors afterwards (cannot find
>>>> own NTDS)
>>>>          5) impossible to demote / unjoin the server and provision from
>>>> scratch - some DRS errors
>>>>
>>>>      Question is:
>>>>          how can I change the FSMO owner (ldbedit?) on DC1 to be DC2 and
>>>> then:
>>>>               a) replicate missing users (and computer trust accounts)
>>>> to DC1
>>>>               b) force-remove DC1 from the domain for good (reinstall from
>>>> scratch)
>>>>
>>>>      Recreating the whole domain from scratch is sadly *not* an
>>>> option :(
>>> On https://wiki.samba.org/index.php/Backup_and_Recovery#General it is
>>> clearly stated that you shouldn't restore a DC from backup in a multi DC
>>> environment.
>> Ok, my bad.
>>
>>> The other DCs have evolved since you backed up your data, so the
>>> restored DC can no longer synchronise with them. It is not a Samba
>>> problem; it is by design, because of the multi-master replication
>>> between DCs.
>>>
>>> You should just re-install samba4 4.0.5 on your DC1 server, and then
>>> join it to the domain as a DC, it will synchronise and all will be back
>>> to normal.
>>>
>> But how do I force-remove the old server from the domain? (Windows tools
>> and Samba's net unjoin failed)
> Just re-join it with the same name, that does as much as we can do.  It
> isn't perfectly ideal, but it should be good enough. 
>
> Andrew Bartlett
>


