[Samba] Problem creating new DC's

lp101 lingpanda101 at gmail.com
Mon Jun 9 10:57:30 MDT 2014


Chris,

     I would not delete those files and attempt to rejoin. It will cause 
problems (first-hand knowledge). You have to attempt a demote and then 
rejoin. As you have seen, demote doesn't work 100% of the time. You can 
always run the KCC on the DC that isn't replicating properly. It should 
automatically include the entries it's missing. As a backup you can 
create the links yourself in Sites and Services. Again, it shouldn't be 
necessary as the KCC should be doing this for you. Can you post exactly 
what you have and what you believe is missing? Hide any personal info. 
Another option is to try to enable replication through "samba-tool drs 
options". See the help for syntax. I've never done this, so I have no 
idea if it will be successful.
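
As a rough sketch of the checks above (run the KCC, look at showrepl, and 
inspect the DSA options), something like the following could be used. The 
DC host names are placeholders and not hosts from this thread, the 
samba-tool path is assumed from the paths quoted further down, and you may 
need to add credentials (for example -U) depending on your setup. It 
defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: host names are hypothetical placeholders.
SAMBA_TOOL="${SAMBA_TOOL:-/usr/local/samba/bin/samba-tool}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; set to 0 to execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Ask the KCC on the given DC to recalculate its replication connections.
kcc_on() {
    run "$SAMBA_TOOL" drs kcc "$1"
}

# Show the replication state as seen by the given DC.
showrepl_on() {
    run "$SAMBA_TOOL" drs showrepl "$1"
}

# Show the current DSA options of the given DC. Changing them is done with
# --dsa-option; check "samba-tool drs options --help" on your version for
# the exact values before trying that.
options_on() {
    run "$SAMBA_TOOL" drs options "$1"
}

# Placeholder DCs: the FSMO role holder and the DC that is missing entries.
for dc in fsmodc.example.com newdc.example.com; do
    kcc_on "$dc"
    showrepl_on "$dc"
    options_on "$dc"
done
```

Run it once as-is to see what would be executed, then with DRY_RUN=0 on a 
DC once the commands look right for your environment.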

On 6/9/2014 12:46 PM, Chris Alavoine wrote:
> Hi James,
>
> Thanks, this is great info. I have followed these steps. It looks like 
> one of my DCs is getting incorrect replication data, as some things I 
> created a few weeks ago are missing from its CN=Sites section.
>
> I'm not sure why this should be, but could I attempt to rejoin this DC 
> to the domain (i.e. rm -rf /usr/local/samba/etc/smb.conf 
> /usr/local/samba/private/* and then run the domain join again), or would 
> that cause more problems than it's worth?
>
> I'm not out of the woods yet, but replication appears to be staying 
> alive for longer now and is no longer crashing samba, which is a big 
> step in the right direction. Also, "samba-tool drs showrepl" now 
> returns much more quickly.
>
> c:)
>
>
>
> On 9 June 2014 16:19, lp101 <lingpanda101 at gmail.com> wrote:
>
>     Chris,
>
>         Sounds like you have a few issues. I would try and clean up
>     your demoted DC first and work to resolve the new site DC. In my
>     experience the demote tool has never succeeded. I've always had to
>     go into ADUC, Sites and Services, DNS and ADSI edit to clean up
>     demoted DC's that did not remove gracefully. I followed Microsoft
>     suggestions for removing a dead or offline server. It was almost
>     immediate for me that the dead server no longer displayed when
>     running "samba-tool drs showrepl". I did the following from this
>     blog http://www.petri.co.il/delete_failed_dcs_from_ad.htm
>
>      1. Open Active Directory Sites and Services and expand the
>         appropriate site. Delete the dead server.
>         http://www.petri.co.il/images/cleanup1.gif
>      2. Open Active Directory Users and Computers. Expand the Domain
>         Controllers container.
>         http://www.petri.co.il/images/cleanup2.gif
>      3. Delete the server object associated with the failed domain
>         controller.
>      4. Windows Server 2003 AD might display a new type of question
>         window, asking you if you want to delete the server object
>         without performing a DCPROMO operation (which, of course, you
>         cannot perform, otherwise you wouldn't be reading this
>         article, would you...) Select "This DC is permanently
>         offline..." and click on the Delete button.
>         http://www.petri.co.il/images/cleanup3.gif
>      5. AD will display another confirmation window. If you're sure
>         that you want to delete the failed object, click Yes.
>         http://www.petri.co.il/images/cleanup4.gif
>      6. Next, remove the server from DNS.
>      7. In the DNS snap-in, expand the zone that is related to the
>         domain from where the server has been removed.
>      8. Remove the CNAME record in the _msdcs.root domain of forest
>         zone in DNS. You should also delete the HOSTNAME and other DNS
>         records.
>         http://www.petri.co.il/images/cleanup5.gif
>      9. If you have reverse lookup zones, also remove the server from
>         these zones.
>     10. Use "ADSIEdit" to remove old computer records from the Active
>         Directory: Default Naming Context
>
>         a. OU=Domain Controllers,DC=domain,DC=local
>
>         b. CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=domain,DC=local
>
>         c. CN=Domain System Volume (SYSVOL share),CN=File Replication
>         Service,CN=System,DC=domain,DC=local
>     11. Use ADSI Edit to remove NTDS settings: Configuration
>           CN=Sites
>             CN=Default-First-Site-Name (your site name may be different)
>               CN=Servers
>                 CN=SAMBADC1 (go through each server that has a connection to the offline DC)
>                   CN=NTDS Settings
>
>     Remove all NTDS settings that reference the offline or dead DC. If
>     you accidentally delete the wrong NTDS setting, run "samba-tool drs
>     kcc" to try to recreate it. Be careful. Make sure you connect to the
>     DC that holds all the FSMO roles; in my case that is the first DC I
>     provisioned. It may take a few minutes for the changes to
>     replicate across the forest. I made sure to log into each DC with
>     all these tools to verify the changes replicated correctly. Use
>     caution when using ADSI Edit. Good luck and remember to take a backup.
>
>
>     On 6/9/2014 9:34 AM, Chris Alavoine wrote:
>>     Hi James/all,
>>
>>     Understood.
>>
>>     The consistency check appears to run ok on both the new DC and
>>     the FSMO DC. That is, up to the point where replication stops
>>     working and samba dies.
>>
>>     I see the occasional WERR_SEM_TIMEOUT errors on the more
>>     far-flung DC's but this was happening before the replication problem.
>>
>>     I can still see the old DC that I removed appearing in the
>>     samba-tool drs showrepl output on some of my other DC's, I guess
>>     they need time to update but with my cronjob restarting samba
>>     every 30 minutes to keep things working that's not going to
>>     happen, I guess.
>>
>>     I wonder if I need to remove the new Sites DC as my problems
>>     began when this appeared in the Forest.
>>
>>     Thanks,
>>     Chris.
>>
>>
>>
>>     On 9 June 2014 14:06, lp101 <lingpanda101 at gmail.com> wrote:
>>
>>         Chris,
>>
>>             I may not be able to offer much more help. Maybe someone
>>         on the list can chime in. After restarting samba, can you run
>>         "samba-tool drs kcc" on the DC that holds all the FSMO
>>         roles as well as on your new one? Also, after the restart does
>>         "samba-tool drs showrepl" show no errors on any DC in your
>>         forest?
>>
>>         On 6/9/2014 8:06 AM, Chris Alavoine wrote:
>>>         Hi James,
>>>
>>>         I may have spoken too soon. Replication to my other DCs
>>>         (including the new one in the new Site) keeps failing. After
>>>         a samba restart, replication works for around 30 minutes and
>>>         then this happens:
>>>
>>>         /usr/local/samba/bin/samba-tool drs showrepl | more
>>>         ERROR(<class 'samba.drs_utils.drsException'>): DRS connection to REMOTEDC.example.com failed - drsException: DRS connection to REMOTEDC.example.com failed: (-1073741643, 'NT_STATUS_IO_TIMEOUT')
>>>           File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/drs.py", line 39, in drsuapi_connect
>>>             (ctx.drsuapi, ctx.drsuapi_handle, ctx.bind_supported_extensions) = drs_utils.drsuapi_connect(ctx.server, ctx.lp, ctx.creds)
>>>           File "/usr/local/samba/lib/python2.7/site-packages/samba/drs_utils.py", line 54, in drsuapi_connect
>>>             raise drsException("DRS connection to %s failed: %s" % (server, e))
>>>
>>>         This has the added side-effect that Samba no longer
>>>         functions (i.e. no longer processes logins) until I do
>>>         another restart.
>>>
>>>         The only DC impervious to this behaviour is the main FSMO DC.
>>>
>>>         As a workaround I have a cronjob on my other DC's that
>>>         restarts samba every 30 minutes but clearly this is no solution.
>>>
>>>         Last Friday evening, after adding my new DC in its correct
>>>         Site, I attempted to demote the old one in this location. The
>>>         samba-tool domain demote command failed for me, so I removed
>>>         the DC manually and removed all traces of it from DNS. Could
>>>         this have caused problems?
>>>
>>>         Any help much appreciated.
>>>
>>>         Thanks,
>>>         Chris.
>>>
>>>
>>>
>>>         On 4 June 2014 14:08, Chris Alavoine <chrisa at acs-info.co.uk> wrote:
>>>
>>>             Yep, new DC shows up under ADSS.
>>>
>>>             c:)
>>>
>>>
>>>             On 4 June 2014 14:04, lp101 <lingpanda101 at gmail.com> wrote:
>>>
>>>                 Hi Chris,
>>>
>>>                     Great news! Please confirm that Sites and Services
>>>                 does in fact show your new DC in its appropriate location.
>>>
>>>
>>>                 On 6/4/2014 8:58 AM, Chris Alavoine wrote:
>>>>                 Hi James,
>>>>
>>>>                 Just thought I'd report my success!
>>>>
>>>>                 I'd forgotten to specify the local DC (same Site)
>>>>                 in my domain join command:
>>>>
>>>>                 /usr/local/samba/bin/samba-tool domain join example.com DC -UAdministrator --realm=example.com --server=blahdc --site=blah
>>>>
>>>>                 This still took over an hour but didn't produce the
>>>>                 above TIMEOUT error.
>>>>
>>>>                 Thanks for your help on this!
>>>>
>>>>                 c:)
>>>>
>>>>
>>>>
>>>>                 On 3 June 2014 16:40, Chris Alavoine <chrisa at acs-info.co.uk> wrote:
>>>>
>>>>                     Hi James,
>>>>
>>>>                     I have upped the RAM to 20GB and given it 8
>>>>                     cores, but unfortunately am getting the same
>>>>                     result. The time taken to process all the
>>>>                     objects is well over an hour which I'm guessing
>>>>                     is where my problem lies.
>>>>
>>>>                     Not sure what else to try except maybe
>>>>                     attempting to reduce the number of DCs (over a
>>>>                     weekend) and trying again.
>>>>
>>>>                     Thanks,
>>>>                     Chris.
>>>>
>>>>
>>>>                     On 3 June 2014 13:56, lp101 <lingpanda101 at gmail.com> wrote:
>>>>
>>>>                             I believe I needed at least 8GB to
>>>>                         complete the join process. I know it was
>>>>                         more than 4GB. Here is a link to the
>>>>                         discussion I had on this list in January.
>>>>
>>>>                         http://samba.2283325.n4.nabble.com/DomainDnsZone-Replication-Shows-200-000-Objects-td4658437i20.html
>>>>
>>>>                             I strongly discourage using the
>>>>                         tombstone attribute suggested in that
>>>>                         discussion to fix this issue. It created more
>>>>                         issues than it was worth. I'm not sure if
>>>>                         this bug was fixed or not. Increase the
>>>>                         memory and attempt to join the new DC to
>>>>                         the existing DC at that site. It should
>>>>                         help with the timeout error. Good luck!
>>>>
>>>>
>>>>
>>>>                         On 6/3/2014 8:44 AM, Chris Alavoine wrote:
>>>>>                         Hi James,
>>>>>
>>>>>                         Thanks for the reply.
>>>>>
>>>>>                         My last attempt had 4GB RAM and 4 cores
>>>>>                         (VM). Do you think I should give it some more?
>>>>>
>>>>>                         Thanks,
>>>>>                         Chris.
>>>>>
>>>>>
>>>>>                         On 3 June 2014 13:42, lp101 <lingpanda101 at gmail.com> wrote:
>>>>>
>>>>>                                 Hi Chris,
>>>>>
>>>>>                                 How much memory does your server
>>>>>                             have and are you attempting to join it
>>>>>                             to the local DC at the site? I've had
>>>>>                             an issue similar to this and
>>>>>                             increasing the server memory and
>>>>>                             attempting to join to a local DC helped.
>>>>>
>>>>>
>>>>>                             On 6/3/2014 8:04 AM, Chris Alavoine wrote:
>>>>>
>>>>>                                 Hi there,
>>>>>
>>>>>                                 I currently have 6 Samba4 (4.1.5) DCs
>>>>>                                 spread over a global network. This is
>>>>>                                 working ok, but they were created
>>>>>                                 before any Sites were made, and as the
>>>>>                                 ability to move DCs to new Sites is
>>>>>                                 not working, I am attempting to create
>>>>>                                 new DCs in each location and then
>>>>>                                 demote the old ones.
>>>>>
>>>>>                                 The problem I am facing is that the
>>>>>                                 domain join process keeps timing out
>>>>>                                 for any new DC. I think this is due to
>>>>>                                 the number of objects that now need to
>>>>>                                 be synced:
>>>>>
>>>>>                                 Partition[DC=DomainDnsZones,DC=essence,DC=internal,DC=com] objects[142711/162691] linked_values[0/0]
>>>>>                                 Partition[DC=DomainDnsZones,DC=essence,DC=internal,DC=com] objects[143113/162691] linked_values[0/0]
>>>>>                                 Partition[DC=DomainDnsZones,DC=essence,DC=internal,DC=com] objects[143515/162691] linked_values[0/0]
>>>>>
>>>>>                                 (this is a snippet from attempting to
>>>>>                                 join; as you can see there are 162691
>>>>>                                 objects, which take a fair amount of
>>>>>                                 time to get through. I have tried this
>>>>>                                 from various different locations).
>>>>>
>>>>>                                 This is the final error I get:
>>>>>
>>>>>                                 Replicating DC=ForestDnsZones,DC=essence,DC=internal,DC=com
>>>>>                                 Partition[DC=ForestDnsZones,DC=essence,DC=internal,DC=com] objects[24/24] linked_values[0/0]
>>>>>                                 Partition[DC=ForestDnsZones,DC=essence,DC=internal,DC=com] objects[48/24] linked_values[0/0]
>>>>>                                 Committing SAM database
>>>>>                                 Sending DsReplicateUpdateRefs for all the replicated partitions
>>>>>                                 Join failed - cleaning up
>>>>>                                 checking sAMAccountName
>>>>>                                 ERROR(runtime): uncaught exception - (-1073741643, 'NT_STATUS_IO_TIMEOUT')
>>>>>                                   File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/__init__.py", line 175, in _run
>>>>>                                     return self.run(*args, **kwargs)
>>>>>                                   File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/domain.py", line 552, in run
>>>>>                                     machinepass=machinepass, use_ntvfs=use_ntvfs, dns_backend=dns_backend)
>>>>>                                   File "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line 1172, in join_DC
>>>>>                                     ctx.do_join()
>>>>>                                   File "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line 1082, in do_join
>>>>>                                     ctx.join_finalise()
>>>>>                                   File "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line 881, in join_finalise
>>>>>                                     ctx.send_DsReplicaUpdateRefs(nc)
>>>>>                                   File "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line 866, in send_DsReplicaUpdateRefs
>>>>>                                     ctx.drsuapi.DsReplicaUpdateRefs(ctx.drsuapi_handle, 1, r)
>>>>>
>>>>>
>>>>>                                 This seems to suggest that the join
>>>>>                                 fails, it tries to clean up, and then
>>>>>                                 gets an NT_STATUS_IO_TIMEOUT error.
>>>>>
>>>>>                                 This leaves me with a non-functioning
>>>>>                                 DC appearing in the Domain Controller
>>>>>                                 list in ADUC and ADSS, which needs to
>>>>>                                 be cleaned out.
>>>>>
>>>>>                                 Any advice on how I can get around
>>>>>                                 this problem?
>>>>>
>>>>>                                 Thanks
>>>>>                                 Chris.
>>>>>
>>>>>
>>>>>                             -- 
>>>>>                             -James
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                         -- 
>>>>>                         ACS (Alavoine Computer Services Ltd)
>>>>>                         Chris Alavoine
>>>>>                         mob +44 (0)7724 710 730
>>>>>                         www.alavoinecs.co.uk
>>>>>                         http://twitter.com/#!/alavoinecs
>>>>>                         http://www.linkedin.com/pub/chris-alavoine/39/606/192
>>>>>
>>>>

-- 
-James


