[Samba] Problem creating new DC's

Chris Alavoine chrisa at acs-info.co.uk
Mon Jun 9 07:34:29 MDT 2014


Hi James/all,

Understood.

The consistency check appears to run ok on both the new DC and the FSMO DC.
That is, up to the point where replication stops working and samba dies.

I see the occasional WERR_SEM_TIMEOUT errors on the more far-flung DC's but
this was happening before the replication problem.

I can still see the old DC that I removed appearing in the samba-tool drs
showrepl output on some of my other DC's. I guess they need time to update,
but with my cronjob restarting samba every 30 minutes to keep things
working, that's not going to happen.
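
A minimal sketch, assuming showrepl output has been collected from each DC as
plain text, of how to list which DCs still mention the removed one. The host
names in the usage note are hypothetical, and collect_showrepl assumes that
`samba-tool drs showrepl <DC>` can reach each DC without extra credentials,
which may not hold on every setup.

```python
import re
import subprocess


def dcs_still_listing(showrepl_by_dc, removed_dc):
    """Return the DCs whose showrepl text still mentions removed_dc.

    The name is matched case-insensitively as a whole host label, so
    "olddc" matches "Site\\OLDDC" and "olddc.example.com" but not "olddc10".
    """
    pattern = re.compile(
        r"(?<![\w-])" + re.escape(removed_dc) + r"(?![\w-])", re.IGNORECASE
    )
    return sorted(dc for dc, text in showrepl_by_dc.items() if pattern.search(text))


def collect_showrepl(dcs, samba_tool="/usr/local/samba/bin/samba-tool"):
    """Run `samba-tool drs showrepl <dc>` per DC (assumed invocation).

    A DC that cannot be queried is recorded with empty output.
    """
    results = {}
    for dc in dcs:
        try:
            proc = subprocess.run(
                [samba_tool, "drs", "showrepl", dc],
                capture_output=True, text=True, timeout=120,
            )
            results[dc] = proc.stdout
        except (OSError, subprocess.TimeoutExpired):
            results[dc] = ""
    return results
```

Usage would be along the lines of
dcs_still_listing(collect_showrepl(["dc1", "dc2"]), "olddc"), run shortly
after a restart while the DCs are still answering.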

I wonder if I need to remove the new Sites DC as my problems began when
this appeared in the Forest.

Thanks,
Chris.



On 9 June 2014 14:06, lp101 <lingpanda101 at gmail.com> wrote:

>  Chris,
>
>     I may not be able to offer much more help. Maybe someone on the list
> can chime in. After restarting samba can you run "/samba-tool drs kcc" on
> your DC that holds all the FSMO roles as well as your new one? Also after
> restart does "/samba-tool drs showrepl" show no errors on any DC in your
> forest?
>
> On 6/9/2014 8:06 AM, Chris Alavoine wrote:
>
> Hi James,
>
>  I may have spoken too soon. Replication to my other DC's (including the
> new one in the new Site) keeps failing. After a samba restart replication
> works for around 30 minutes and then this happens:
>
>  /usr/local/samba/bin/samba-tool drs showrepl | more
> ERROR(<class 'samba.drs_utils.drsException'>): DRS connection to
> REMOTEDC.example.com failed - drsException: DRS connection to
> REMOTEDC.example.com failed: (-1073741643, 'NT_STATUS_IO_TIMEOUT')
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/drs.py",
> line 39, in drsuapi_connect
>     (ctx.drsuapi, ctx.drsuapi_handle, ctx.bind_supported_extensions) =
> drs_utils.drsuapi_connect(ctx.server, ctx.lp, ctx.creds)
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/drs_utils.py",
> line 54, in drsuapi_connect
>     raise drsException("DRS connection to %s failed: %s" % (server, e))
>
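The number in the error tuple is the NTSTATUS value printed as a signed
32-bit integer. A small sketch of the conversion to the more familiar
unsigned hex form follows; the function names are made up for illustration
and are not part of the Samba Python bindings.

```python
def ntstatus_hex(signed_code):
    """Reinterpret a signed 32-bit status code as unsigned and format as hex."""
    return "0x%08X" % (signed_code & 0xFFFFFFFF)


def is_error_severity(signed_code):
    """The top two bits of an NTSTATUS hold the severity; 0b11 means error."""
    return ((signed_code & 0xFFFFFFFF) >> 30) == 0b11


# The value from the traceback above.
code = -1073741643
print(ntstatus_hex(code), is_error_severity(code))
```

For -1073741643 this gives 0xC00000B5, which is the value documented for
STATUS_IO_TIMEOUT, so the text and the number in the tuple agree.
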
>  This has the added side-effect that Samba no longer functions (i.e. no
> longer processes logins) until I do another restart.
>
>  The only DC impervious to this behaviour is the main FSMO DC.
>
>  As a workaround I have a cronjob on my other DC's that restarts samba
> every 30 minutes but clearly this is no solution.
>
>  Last Friday evening after adding my new DC in its correct Site I
> attempted to demote the old one in this location. The samba-tool domain
> demote command failed for me so I removed the DC manually and removed all
> traces of it from DNS. Could this have caused problems?
>
>  Any help much appreciated.
>
>  Thanks,
> Chris.
>
>
>
> On 4 June 2014 14:08, Chris Alavoine <chrisa at acs-info.co.uk> wrote:
>
>> Yep, new DC shows up under ADSS.
>>
>>  c:)
>>
>>
>> On 4 June 2014 14:04, lp101 <lingpanda101 at gmail.com> wrote:
>>
>>>  Hi Chris,
>>>
>>>     Great news! Confirm Site and Services does in fact show your New DC
>>> in its appropriate location.
>>>
>>>
>>> On 6/4/2014 8:58 AM, Chris Alavoine wrote:
>>>
>>> Hi James,
>>>
>>>  Just thought I'd report my success!
>>>
>>>  I'd forgotten to specify the local DC (same Site) in my domain
>>> join command:
>>>
>>>  /usr/local/samba/bin/samba-tool domain join example.com DC
>>> -UAdministrator --realm=example.com --server=blahdc --site=blah
>>>
>>>  This still took over an hour but didn't produce the above TIMEOUT
>>> error.
>>>
>>>  Thanks for your help on this!
>>>
>>>  c:)
>>>
>>>
>>>
>>> On 3 June 2014 16:40, Chris Alavoine <chrisa at acs-info.co.uk> wrote:
>>>
>>>> Hi James,
>>>>
>>>>  I have upped the RAM to 20GB and given it 8 cores, but unfortunately
>>>> am getting the same result. The time taken to process all the objects is
>>>> well over an hour which I'm guessing is where my problem lies.
>>>>
>>>>  Not sure what else to try except maybe attempting to reduce the
>>>> number of DC's (over a weekend) and try again.
>>>>
>>>>  Thanks,
>>>> Chris.
>>>>
>>>>
>>>> On 3 June 2014 13:56, lp101 <lingpanda101 at gmail.com> wrote:
>>>>
>>>>>      I believe I needed at least 8GB to complete the join process. I
>>>>> know it was more than 4GB. Here is a link to the discussion I had on this
>>>>> list in Jan.
>>>>>
>>>>>
>>>>> http://samba.2283325.n4.nabble.com/DomainDnsZone-Replication-Shows-200-000-Objects-td4658437i20.html
>>>>>
>>>>>     I strongly discourage using the tombstone attribute to fix this
>>>>> issue within this discussion. It created more issues than it was worth. I'm
>>>>> not sure if this bug was fixed or not. Increase the memory and attempt to
>>>>> join the new DC to the existing DC at that site. It should help with the
>>>>> timeout error. Good luck!
>>>>>
>>>>>
>>>>>
>>>>> On 6/3/2014 8:44 AM, Chris Alavoine wrote:
>>>>>
>>>>> Hi James,
>>>>>
>>>>>  Thanks for the reply.
>>>>>
>>>>>  My last attempt had 4GB RAM and 4 cores (VM). Do you think I should
>>>>> give it some more?
>>>>>
>>>>>  Thanks,
>>>>> Chris.
>>>>>
>>>>>
>>>>> On 3 June 2014 13:42, lp101 <lingpanda101 at gmail.com> wrote:
>>>>>
>>>>>>     Hi Chris,
>>>>>>
>>>>>>     How much memory does your server have and are you attempting to
>>>>>> join it to the local DC at the site? I've had an issue similar to this and
>>>>>> increasing the server memory and attempting to join to a local DC helped.
>>>>>>
>>>>>>
>>>>>> On 6/3/2014 8:04 AM, Chris Alavoine wrote:
>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> I currently have 6 Samba4 (4.1.5) DC's spread over a global network.
>>>>>>> This is working ok but they were created before any Sites were made
>>>>>>> and as the ability to move DC's to new Sites is not working, I am
>>>>>>> attempting to create new DC's in each location and then demote the
>>>>>>> old ones.
>>>>>>>
>>>>>>> The problem I am facing is the domain join process keeps timing out for
>>>>>>> any new DC. I think this is due to the number of objects that now need
>>>>>>> to be synced:
>>>>>>>
>>>>>>> Partition[DC=DomainDnsZones,DC=essence,DC=internal,DC=com]
>>>>>>> objects[142711/162691] linked_values[0/0]
>>>>>>> Partition[DC=DomainDnsZones,DC=essence,DC=internal,DC=com]
>>>>>>> objects[143113/162691] linked_values[0/0]
>>>>>>> Partition[DC=DomainDnsZones,DC=essence,DC=internal,DC=com]
>>>>>>> objects[143515/162691] linked_values[0/0]
>>>>>>>
>>>>>>> (this is a snippet from attempting to join, as you can see there are
>>>>>>> 162691 objects which takes a fair amount of time to get through - I
>>>>>>> have tried this from various different locations).
>>>>>>>
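To put numbers on the progress lines quoted above, here is a rough sketch
that parses the objects[done/total] counters from join output and estimates
how far along the replication is. The helper names are made up for
illustration. The batch size is simply inferred from the difference between
the last two counters, which assumes it stays constant.

```python
import math
import re

# Matches the "objects[142711/162691]" counter in a join progress line.
PROGRESS_RE = re.compile(r"objects\[(\d+)/(\d+)\]")


def parse_progress(log_text):
    """Return (done, total) pairs in the order they appear in the log."""
    return [(int(done), int(total)) for done, total in PROGRESS_RE.findall(log_text)]


def percent_done(done, total):
    """Percentage of objects replicated so far (0.0 if total is zero)."""
    return 100.0 * done / total if total else 0.0


def batches_remaining(samples):
    """Estimate batches left, taking the last step as the batch size."""
    if len(samples) < 2:
        return None
    (prev_done, _), (done, total) = samples[-2], samples[-1]
    step = done - prev_done
    if step <= 0:
        return None
    return max(0, math.ceil((total - done) / step))
```

On the three lines quoted above this works out to roughly 88% done with
batches of 402 objects. The counter is not a strict upper bound, as the
later objects[48/24] line shows, so the percentage can exceed 100.
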
>>>>>>> This is the final error I get:
>>>>>>>
>>>>>>> Replicating DC=ForestDnsZones,DC=essence,DC=internal,DC=com
>>>>>>> Partition[DC=ForestDnsZones,DC=essence,DC=internal,DC=com]
>>>>>>> objects[24/24]
>>>>>>> linked_values[0/0]
>>>>>>> Partition[DC=ForestDnsZones,DC=essence,DC=internal,DC=com]
>>>>>>> objects[48/24]
>>>>>>> linked_values[0/0]
>>>>>>> Committing SAM database
>>>>>>> Sending DsReplicateUpdateRefs for all the replicated partitions
>>>>>>> Join failed - cleaning up
>>>>>>> checking sAMAccountName
>>>>>>> ERROR(runtime): uncaught exception - (-1073741643,
>>>>>>> 'NT_STATUS_IO_TIMEOUT')
>>>>>>>    File
>>>>>>>
>>>>>>> "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/__init__.py",
>>>>>>> line 175, in _run
>>>>>>>      return self.run(*args, **kwargs)
>>>>>>>    File
>>>>>>> "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/domain.py",
>>>>>>> line
>>>>>>> 552, in run
>>>>>>>      machinepass=machinepass, use_ntvfs=use_ntvfs,
>>>>>>> dns_backend=dns_backend)
>>>>>>>    File
>>>>>>> "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line
>>>>>>> 1172, in join_DC
>>>>>>>      ctx.do_join()
>>>>>>>    File
>>>>>>> "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line
>>>>>>> 1082, in do_join
>>>>>>>      ctx.join_finalise()
>>>>>>>    File
>>>>>>> "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line
>>>>>>> 881, in join_finalise
>>>>>>>      ctx.send_DsReplicaUpdateRefs(nc)
>>>>>>>    File
>>>>>>> "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line
>>>>>>> 866, in send_DsReplicaUpdateRefs
>>>>>>>      ctx.drsuapi.DsReplicaUpdateRefs(ctx.drsuapi_handle, 1, r)
>>>>>>>
>>>>>>>
>>>>>>> Which seems to suggest that the join fails, it tries to clean up and
>>>>>>> gets an NT_STATUS_IO_TIMEOUT error.
>>>>>>>
>>>>>>> This leaves me with a non-functioning DC appearing in the Domain
>>>>>>> Controller
>>>>>>> list on ADUC and ADSS which need to be cleaned out.
>>>>>>>
>>>>>>> Any advice on how I can get around this problem?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Chris.
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>>  -James
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>  --
>>>>> ACS (Alavoine Computer Services Ltd)
>>>>> Chris Alavoine
>>>>> mob +44 (0)7724 710 730
>>>>> www.alavoinecs.co.uk
>>>>> http://twitter.com/#!/alavoinecs
>>>>> http://www.linkedin.com/pub/chris-alavoine/39/606/192
>>>>>
>>>>>


-- 
ACS (Alavoine Computer Services Ltd)
Chris Alavoine
mob +44 (0)7724 710 730
www.alavoinecs.co.uk
http://twitter.com/#!/alavoinecs
http://www.linkedin.com/pub/chris-alavoine/39/606/192

