[Samba] Multi domain controller environment Ubuntu 12.04, replication and DNS updates broken

L.P.H. van Belle belle at bazuin.nl
Fri Oct 3 00:48:35 MDT 2014


Can you explain the "very sorry state" of your DC? Maybe it is fixable and a rebuild isn't needed.
 
Best regards, 
 
Louis

From: Chris Alavoine [mailto:chrisa at acs-info.co.uk] 
Sent: Thursday 2 October 2014 19:09
To: L.P.H. van Belle
CC: samba at lists.samba.org; kseeger at samba.org
Subject: Re: [Samba] Multi domain controller environment Ubuntu 12.04, replication and DNS updates broken



Hi all, 

Update.


I have managed to get things to a more stable position. My best DC (in New York) got down to 13,000 records today so I decided to try and join a new DC from London using 4.1.12 (Ubuntu 12.04) and it worked like a charm. I now have a very snappy 4.1.12 DC in London which I am going to use to rebuild the domain.


My FSMO roles DC is in a very sorry state so I shall take this down during a maintenance period and rebuild and rejoin it (this IP is used a lot throughout the company so we can't be without it for long).


Thanks for all the help on the lists.


Chris.


On 1 October 2014 11:33, Chris Alavoine <chrisa at acs-info.co.uk> wrote:
Hi Louis, 

Many thanks for replying. 


Unfortunately, I can't seem to upgrade past 4.1.8 on Ubuntu 12.04. I am currently running 4.1.7 on 4 of my DCs and 4.1.8 on one of them. If I try to upgrade to 4.1.12, for instance, Samba refuses to start. My only theory at this stage is that when I originally provisioned my domain (classic_upgrade) I created a new OU purely for Groups and moved all groups into it (including default AD ones like Domain Admins, Domain Users, Domain Computers, DnsAdmins etc). Could this be causing me issues? I know that migrating to Bind fails if DnsAdmins is not in the Users OU. I am slightly loath to start moving things around for fear of wreaking more havoc though.
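
For anyone wanting to check where DnsAdmins currently sits before attempting the Bind migration, here is a minimal sketch. It only inspects ldbsearch output and changes nothing. The helper name dnsadmins_in_users is made up for illustration, and the sam.ldb path in the usage comment is an assumption that varies by install.

```shell
# Hypothetical helper: reads ldbsearch output on stdin and succeeds only if
# the DnsAdmins group sits directly under CN=Users (match is case-insensitive).
dnsadmins_in_users() {
    grep -qi '^dn: CN=DnsAdmins,CN=Users,'
}

# Usage (the sam.ldb path is an assumption; adjust for your install):
#   ldbsearch -H /usr/local/samba/private/sam.ldb '(sAMAccountName=DnsAdmins)' dn \
#       | dnsadmins_in_users && echo "DnsAdmins is under CN=Users"
```

If the check fails, the group is somewhere else (for example in a custom Groups OU), which is the situation described above.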


My biggest problem is getting the DomainDnsZones ldb file down to a manageable size. My worst case is ~450000 records, and there are anomalies throughout the domain because DNS replication appears to be broken. On one of my DCs, when I run:


ldbsearch -H DC=DOMAINDNSZONES,DC=EXAMPLE,DC=COM.ldb 'isDeleted' dn



I am seeing:


ltdb: tdb(DC=DOMAINDNSZONES,DC=EXAMPLE,DC=COM.ldb): tdb_rec_read bad magic 0xd9fee666 at offset=1663653516


search error - Indexed and full searches both failed!





This looks like a bad thing to me and any mention of this error on the lists seems to suggest corruption...


My main FSMO roles DC just stops at 88159 records and appears to be stuck there.
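
The record counts quoted in this thread can be reproduced by counting the entries that ldbsearch returns. A minimal sketch follows; the helper name count_records is hypothetical, and it assumes ldbsearch prints one "dn:" line per record (as it does in LDIF output).

```shell
# Hypothetical helper: counts the records in ldbsearch output read from stdin
# by counting lines that start with "dn:" (this also matches base64 "dn::").
count_records() {
    # grep -c exits non-zero when the count is 0, so do not treat that as failure
    grep -c '^dn:' || true
}

# Usage, run from the directory holding the partition file on each DC:
#   ldbsearch -H DC=DOMAINDNSZONES,DC=EXAMPLE,DC=COM.ldb 'isDeleted=TRUE' dn | count_records
```

Running the same pipeline on every DC and comparing the numbers gives a rough picture of how far the tombstone cleanup has progressed on each one.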


I have lowered tombstoneLifetime to 15 (days) on all DCs using ADSI to try and get at least one decent DC with a low number of records.
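
The same change can probably be made without ADSI by feeding an LDIF to ldbmodify. This is a sketch under assumptions, not what was actually done here: the helper name tombstone_ldif is made up, the base DN is a placeholder, and the sam.ldb path depends on the install. The attribute lives on the Directory Service object in the Configuration partition.

```shell
# Hypothetical helper: prints an LDIF that sets tombstoneLifetime (in days)
# on the Directory Service object for the given domain base DN.
tombstone_ldif() {
    base_dn="$1"
    days="$2"
    printf 'dn: CN=Directory Service,CN=Windows NT,CN=Services,CN=Configuration,%s\n' "$base_dn"
    printf 'changetype: modify\n'
    printf 'replace: tombstoneLifetime\n'
    printf 'tombstoneLifetime: %s\n' "$days"
}

# Usage (base DN and sam.ldb path are assumptions; take a backup first):
#   tombstone_ldif "DC=example,DC=com" 15 > tombstone.ldif
#   ldbmodify -H /usr/local/samba/private/sam.ldb tombstone.ldif
```

Generating the LDIF separately makes it easy to review the change before applying it.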


My best-looking DC (based in New York on a point-to-point fibre link) is currently down to 123457 records and getting lower, so the tombstone setting is having an effect there. I am going to wait a few days until this is down to a reasonable size and then attempt to join a new DC here, shut down the main FSMO roles DC and the other broken DCs, seize the roles and rebuild from there. Much like this post:
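
The plan above could look roughly like the following on the new DC. This is only a sketch: the order, the realm and the Administrator account are assumptions, and by default it just prints the samba-tool commands (DRY_RUN=1) rather than running them. The helper names run and rebuild_steps are made up for illustration.

```shell
# Sketch only: prints the commands unless DRY_RUN is set to something other than 1.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical outline of the rebuild, to be run on the NEW DC.
rebuild_steps() {
    realm="$1"
    # 1. Join the new machine as an additional DC (account name is an assumption).
    run samba-tool domain join "$realm" DC -U Administrator
    # 2. (Manual step) shut down the old FSMO DC and the other broken DCs.
    # 3. Seize all FSMO roles onto this DC and confirm where they now live.
    run samba-tool fsmo seize --role=all
    run samba-tool fsmo show
    # 4. Check replication status afterwards.
    run samba-tool drs showrepl
}

# Usage: rebuild_steps example.com
```

Keeping it in dry-run mode first lets you read through the exact commands before anything touches the domain.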


http://lists-archives.com/samba/79874-samba4-replication-issues-sam-ldb-inconsistency.html



Thanks,
Chris.








On 1 October 2014 11:16, L.P.H. van Belle <belle at bazuin.nl> wrote:
ah..

 DeletedObjects ... and replication errors.

This is a known Samba 4 bug.
See also: https://bugzilla.samba.org/show_bug.cgi?id=10398
Look at my post "No objectClass found in replPropertyMetaData" (was thread: "replication issues solved by adding GUID name ..."). ;-)
Today an old e-mail arrived on the mailing list that involves the problem you describe.

I don't know if the fix is in the latest Samba release yet.
Maybe someone at Samba knows.

Karolin, can you answer this, or pass it to someone who knows?


Louis


>-----Original message-----
>From: chrisa at acs-info.co.uk
>[mailto:samba-bounces at lists.samba.org] On behalf of Chris Alavoine
>Sent: Wednesday 1 October 2014 9:31
>To: samba at lists.samba.org
>Subject: [Samba] Multi domain controller environment Ubuntu
>12.04, replication and DNS updates broken
>
>Hi all,
>
>Am posting this again with a more helpful subject line...
>
>My 5 DC production domain (4.1.7 Ubuntu 12.04) is in a bit of a state.
>
>I attempted an upgrade from 4.1.5 to 4.1.7 which appeared to work, but now
>we have replication errors and I am unable to add any new DNS entries. I am
>now certain that we've fallen foul of the DomainDnsZones DeletedObjects
>problem that I've been reading about in various posts on the lists.
>
>My DC=DOMAINDNSZONES,DC=EXAMPLE,DC=INTERNAL,DC=COM.ldb files are now
>between 3 and 4GB on each of the DCs. Running an ldbsearch
>( ldbsearch -H DC=DOMAINDNSZONES,DC=EXAMPLE,DC=INTERNAL,DC=COM.ldb 'isDeleted=TRUE' dn )
>on each DC returns a different number of objects, ranging from 387000 down to
>88000 on the FSMO DC. Almost all of these are stale isDeleted entries.
>
>I have lowered the tombstoneLifetime setting as suggested by other posters
>on the lists and this appears to be slowly (very slowly) lowering the
>number of records within the DomainDnsZones ldb file. My hope is that they
>will drop sufficiently so that I can join a new working 4.1.12 DC to the
>domain.
>
>I am currently attempting a Bind migration on a test DC as this is touted as
>a possible fix (any successes out there with this?).
>
>A matter of note for the lists: when I originally provisioned my domain
>(classic upgrade from Samba3) I created a new OU for Groups and moved all
>groups into it. This is a mistake if you want to migrate to Bind, as the
>migration script needs CN=DnsAdmins to be in the Users OU; if it isn't, the
>script errors. I moved DnsAdmins back to Users to get the script to
>complete.
>
>At present I'm holding the domain together with bits of string and sticky
>tape - having to reboot one of my DCs every 30 mins just to keep things
>ticking over.
>
>I have tried many variations of joining a new DC to the domain but that has
>failed, so my current plan is to create a test version of my FSMO DC using
>BIND_DLZ (using a current snapshot of the FSMO DC) and get things to a
>working state there, and then replace this on the production site and
>re-join new DCs to rebuild things. Obviously, not best practice but I
>can't think of any other way of getting things stable again.
>
>I have tried manually editing the .ldb files but they are so inflated now
>that any vim edits just time out and error.
>
>Thanks,
>Chris.
>
>--
>ACS (Alavoine Computer Services Ltd)
>Chris Alavoine
>mob +44 (0)7724 710 730
>www.alavoinecs.co.uk
>http://twitter.com/#!/alavoinecs
>http://www.linkedin.com/pub/chris-alavoine/39/606/192














