[Samba] Multi domain controller environment Ubuntu 12.04, replication and DNS updates broken

Chris Alavoine chrisa at acs-info.co.uk
Wed Oct 15 10:37:15 MDT 2014


Hi all,

Just wanted to update on this.

I now have a nicely working domain. I ended up shutting down and manually
removing my broken DC and then seizing the FSMO roles on another machine.
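
In case it helps anyone weighing up transfer versus seize, the role move
boils down to something like the sketch below. It assumes the stock
samba-tool fsmo subcommands (show, transfer, seize) with --role=all. The
function name fsmo_takeover_plan is made up for illustration, and it only
prints the commands rather than running them, so check
"samba-tool fsmo --help" on your version before running anything.

```shell
# Sketch only: prints the samba-tool commands for moving the FSMO roles,
# it does not run them. The printed commands are meant to be run on the
# DC that should end up holding the roles.
#   transfer - the old role holder is still up and reachable
#   seize    - the old role holder is gone for good
fsmo_takeover_plan() {
    mode="$1"
    case "$mode" in
        transfer|seize) ;;
        *) echo "usage: fsmo_takeover_plan transfer|seize" >&2; return 1 ;;
    esac
    echo "samba-tool fsmo show"                 # note the current role owners
    echo "samba-tool fsmo $mode --role=all"     # move every role to this DC
    echo "samba-tool fsmo show"                 # confirm the new owners
}
```

Calling "fsmo_takeover_plan seize" lists the three steps in order. Seizing
only makes sense once the old role holder is never coming back online.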

I now have ~1000 records in my DomainDnsZones ldb db, which means joining
any new DCs is a very quick and simple process. dbcheck --cross-ncs now
completes nicely in a few minutes.
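
For keeping an eye on the record count, a small helper along these lines
may be handy. It is a sketch, not something taken from this thread: the
name count_entries is hypothetical, and it simply counts the "dn:" lines
in whatever ldbsearch output is piped into it. The .ldb file name in the
comment is the placeholder used earlier in this thread.

```shell
# Sketch only: count the entries in ldbsearch output read from stdin.
# Each entry in the LDIF output starts with "dn: " (or "dn:: " when the
# DN is base64-encoded). Comment lines such as "# returned N records"
# are ignored.
count_entries() {
    # grep -c exits non-zero when the count is 0, so mask that.
    grep -Ec '^dn::? ' || true
}

# Example (needs a DC, run from the directory holding the partition file):
#   ldbsearch -H DC=DOMAINDNSZONES,DC=EXAMPLE,DC=COM.ldb 'isDeleted=TRUE' dn | count_entries
#   samba-tool dbcheck --cross-ncs
```

Running the count on each DC in turn makes it easy to see whether the
numbers agree across the domain.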

As an extra note, I have tested Samba 4.1.12 on Ubuntu 14.04 LTS and it
appears to work fine without any changes from 12.04.

Thanks,
Chris.

On 3 October 2014 08:30, Chris Alavoine <chrisa at acs-info.co.uk> wrote:

> Apologies, I have given a more detailed explanation earlier in this post
> (and in others), but basically the DomainDnsZones ldb file appears to be
> corrupt in some way.
>
> When I try to add a new DNS record (on the FSMO DC) I get the error
> "The local security authority database contains an internal inconsistency".
> I know that there are ~450000 "CN=Deleted Objects" records in the ldb but
> it gets stuck at 88159 whenever I try an ldbsearch. DNS still resolves and
> other parts of Samba4 are replicating ok (users/groups/passwords etc), but
> with regards to DNS I can't get any life from it. I've tried various
> tests in a VM test lab (migrating to BIND, manually removing records using
> ldbdel, attempting to upgrade to 4.1.12 and everything in between; it's
> currently on 4.1.7), but all to no avail.
>
> Now that I have 2 good working DCs with ~10,000 Deleted Objects records,
> and one of them is on 4.1.12, my plan is to rejoin all my outlying DCs via
> these 2 DCs. They are all VMs so cloning and rebuilding isn't really that
> much of a laborious task.
>
> My only concern at present is what to do with the FSMO roles, should I:
>
> a. Transfer them to my good working 4.1.12 DC before I decommission the
> broken FSMO DC?
>
> b. Decommission the FSMO DC and seize them?
>
> c. Something else...
>
> Thanks,
> Chris.
>
> On 3 October 2014 07:48, L.P.H. van Belle <belle at bazuin.nl> wrote:
>
>>  Can you explain the "very sorry state" of your DC? Maybe this is
>> fixable and a rebuild isn't needed.
>>
>> Best regards,
>>
>> Louis
>>
>>  ------------------------------
>> *From:* Chris Alavoine [mailto:chrisa at acs-info.co.uk]
>> *Sent:* Thursday 2 October 2014 19:09
>> *To:* L.P.H. van Belle
>> *CC:* samba at lists.samba.org; kseeger at samba.org
>> *Subject:* Re: [Samba] Multi domain controller environment Ubuntu
>> 12.04, replication and DNS updates broken
>>
>>  Hi all,
>>
>> Update.
>>
>> I have managed to get things to a more stable position. My best DC (in
>> New York) got down to 13,000 records today so I decided to try and join a
>> new DC from London using 4.1.12 (Ubuntu 12.04) and it worked like a charm.
>> I now have a very snappy 4.1.12 DC in London which I am going to use to
>> rebuild the domain.
>>
>> My FSMO roles DC is in a very sorry state so I shall take this down
>> during a maintenance period and rebuild and rejoin it (this IP is used a
>> lot throughout the company so we can't be without it for long).
>>
>> Thanks for all the help on the lists.
>>
>> Chris.
>>
>> On 1 October 2014 11:33, Chris Alavoine <chrisa at acs-info.co.uk> wrote:
>>
>>> Hi Louis,
>>>
>>> Many thanks for replying.
>>>
>>> Unfortunately, I can't seem to upgrade past 4.1.8 on Ubuntu 12.04. I am
>>> currently running 4.1.7 on 4 of my DCs and 4.1.8 on one of them. If I try
>>> to upgrade to 4.1.12, for instance, Samba refuses to start. My only theory
>>> at this stage is that when I originally provisioned my domain
>>> (classic_upgrade) I created a new OU purely for Groups and moved all groups
>>> into it (including default AD ones like Domain Admins, Domain Users, Domain
>>> Computers, DnsAdmins etc). Could this be causing me issues? I know that
>>> migrating to Bind fails if DnsAdmins is not in the Users OU. I am slightly
>>> loath to start moving things around for fear of wreaking more havoc though.
>>>
>>> My biggest problem is getting the DomainDnsZones ldb file to a
>>> manageable size. My worst case is ~450000 records, but there are anomalies
>>> throughout the domain as replication for DNS appears to be busted. On one
>>> of my DCs, when I run:
>>>
>>> ldbsearch -H DC=DOMAINDNSZONES,DC=EXAMPLE,DC=COM.ldb 'isDeleted=TRUE' dn
>>>
>>> I am seeing:
>>>
>>>  ltdb: tdb(DC=DOMAINDNSZONES,DC=EXAMPLE,DC=COM.ldb): tdb_rec_read bad magic 0xd9fee666 at offset=1663653516
>>>
>>> search error - Indexed and full searches both failed!
>>>
>>>
>>> This looks like a bad thing to me and any mention of this error on the
>>> lists seems to suggest corruption...
>>>
>>> My main FSMO roles DC just stops at 88159 records and appears to be
>>> stuck there.
>>>
>>> I have lowered tombstoneLifetime to 15 days on all DCs using ADSI to try
>>> and get at least one decent DC with a low number of records.
>>>
>>> My best-looking DC (based in New York on a point to point fibre link) is
>>> currently down to 123457 records and getting lower, so the tombstone
>>> setting is having an effect there. I am going to wait a few days until this
>>> is down to a reasonable size and then attempt to join a new DC here, shut
>>> down the main FSMO roles DC and the other broken DCs, seize the roles and
>>> rebuild from there. Much like this post:
>>>
>>>
>>> http://lists-archives.com/samba/79874-samba4-replication-issues-sam-ldb-inconsistency.html
>>>
>>> Thanks,
>>> Chris.
>>>
>>>
>>>
>>>
>>> On 1 October 2014 11:16, L.P.H. van Belle <belle at bazuin.nl> wrote:
>>>
>>>> ah..
>>>>
>>>>  DeletedObjects ... and replication errors.
>>>>
>>>> This is a known Samba 4 bug.
>>>> See also: https://bugzilla.samba.org/show_bug.cgi?id=10398
>>>> Look at my post "No objectClass found in replPropertyMetaData" (was
>>>> thread: "replication issues solved by adding GUID name ...") ;-)
>>>> Today an old e-mail entered the mailing list which involves the
>>>> problem you describe.
>>>>
>>>> I don't know if the fix is in the latest Samba release yet.
>>>> Maybe someone at Samba knows.
>>>>
>>>> Karolin, can you answer this, or pass it to someone who knows?
>>>>
>>>>
>>>> Louis
>>>>
>>>>
>>>> >-----Original message-----
>>>> >From: chrisa at acs-info.co.uk
>>>> >[mailto:samba-bounces at lists.samba.org] On behalf of Chris Alavoine
>>>> >Sent: Wednesday 1 October 2014 9:31
>>>> >To: samba at lists.samba.org
>>>> >Subject: [Samba] Multi domain controller environment Ubuntu
>>>> >12.04, replication and DNS updates broken
>>>>  >
>>>> >Hi all,
>>>> >
>>>> >Am posting this again with a more helpful subject line...
>>>> >
>>>> >My 5 DC production domain (4.1.7 Ubuntu 12.04) is in a bit of a state.
>>>> >
>>>> >I attempted an upgrade from 4.1.5 to 4.1.7 which appeared to work, but
>>>> >now we have replication errors and I am unable to add any new DNS
>>>> >entries. I am now certain that we've fallen foul of the DomainDnsZones
>>>> >DeletedObjects problem that I've been reading about in various posts
>>>> >on the lists.
>>>> >
>>>> >My DC=DOMAINDNSZONES,DC=EXAMPLE,DC=INTERNAL,DC=COM.ldb files are now
>>>> >between 3 and 4GB on each of the DCs. Running an ldbsearch
>>>> >(ldbsearch -H DC=DOMAINDNSZONES,DC=EXAMPLE,DC=INTERNAL,DC=COM.ldb
>>>> >'isDeleted=TRUE' dn) on each DC returns a different number of objects,
>>>> >ranging from 387000 down to 88000 on the FSMO DC. Almost all of these
>>>> >are stale isDeleted entries.
>>>> >
>>>> >I have lowered the tombstoneLifetime setting as suggested by other
>>>> >posters on the lists, and this appears to be slowly (very slowly)
>>>> >lowering the number of records within the DomainDnsZones ldb file. My
>>>> >hope is that they will drop sufficiently for me to join a new working
>>>> >4.1.12 DC to the domain.
>>>> >
>>>> >I am currently attempting a Bind migration on a test DC as this is
>>>> >touted as a possible fix (any successes out there with this?).
>>>> >
>>>> >A matter of note for the lists: when I originally provisioned my domain
>>>> >(classic upgrade from Samba3) I created a new OU for Groups and moved
>>>> >all groups into it. This is a mistake if you want to migrate to Bind,
>>>> >as the migration script needs CN=DnsAdmins to be in the Users OU; if it
>>>> >isn't, the script errors. I moved DnsAdmins back to Users to get the
>>>> >script to complete.
>>>> >
>>>> >At present I'm holding the domain together with bits of string and
>>>> >sticky tape - having to reboot one of my DCs every 30 mins just to
>>>> >keep things ticking over.
>>>> >
>>>> >I have tried many variations of joining a new DC to the domain but
>>>> >they have all failed, so my current plan is to create a test version
>>>> >of my FSMO DC using BIND_DLZ (from a current snapshot of the FSMO DC),
>>>> >get things to a working state there, and then swap this in on the
>>>> >production site and re-join new DCs to rebuild things. Obviously not
>>>> >best practice, but I can't think of any other way of getting things
>>>> >stable again.
>>>> >
>>>> >I have tried manually editing the .ldb files but they are so inflated
>>>> >now that any vim edits just time out and error.
>>>> >
>>>> >Thanks,
>>>> >Chris.
>>>> >
>>>> >--
>>>> >ACS (Alavoine Computer Services Ltd)
>>>> >Chris Alavoine
>>>> >mob +44 (0)7724 710 730
>>>> >www.alavoinecs.co.uk
>>>> >http://twitter.com/#!/alavoinecs
>>>> >http://www.linkedin.com/pub/chris-alavoine/39/606/192



-- 
ACS (Alavoine Computer Services Ltd)
Chris Alavoine
mob +44 (0)7724 710 730
www.alavoinecs.co.uk
http://twitter.com/#!/alavoinecs
http://www.linkedin.com/pub/chris-alavoine/39/606/192

