[Samba] Replication Failure Issue

David Minard david at scem.westernsydney.edu.au
Mon Mar 26 23:00:06 UTC 2018



On 26/03/18 23:13, lingpanda101 wrote:
> On 3/25/2018 8:54 PM, David Minard wrote:
>>
>>
>> On 24/03/18 01:35, lingpanda101 wrote:
>>> On 3/22/2018 8:06 PM, David Minard wrote:
>>>> G'day All,
>>>>
>>>>     Will reply to all messages so far in this one to keep it all 
>>>> together.
>>>>
>>>> On 21/03/18 22:52, lingpanda101 wrote:
>>>>> On 3/21/2018 7:32 AM, David Minard via samba wrote:
>>>>>> Thanks Carlos,
>>>>>>
>>>>>> The thing is, that I did not upgrade the version of Samba - that 
>>>>>> is the next step, so the ports used would not have changed. I only 
>>>>>> updated the OS.
>>>>>>
>>>>>>
>>>>>>> On 21/03/2018, at 10:04 PM, Carlos Alberto Panozzo Cunha 
>>>>>>> <carlos.hollow at gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>> I had the same problem after a Samba update.
>>>>>>> I allowed the new ports in the firewall.
>>>>>>>
>>>>>>> https://wiki.samba.org/index.php/Samba_AD_DC_Port_Usage
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 21, 2018, 00:15 David Minard via samba 
>>>>>>> <samba at lists.samba.org> wrote:
>>>>>>> G'day All,
>>>>>>>
>>>>>>>          I have 4 DCs on CentOS 7.1. Everything was working 
>>>>>>> really well for
>>>>>>> years, including replication.
>>>>>>>
>>>>>>>          Then I decided that the OS needed updating. Did the yum 
>>>>>>> update on one
>>>>>>> of the DCs, rebooted. That server is now running CentOS 7.4. Samba
>>>>>>> seemed to start okay.
>>>>>>>
>>>>>>>          However, samba-tool drs showrepl gives this error on all 
>>>>>>> 3 of the other
>>>>>>> DCs, and shows success on the updated DC.
>>>>>>>
>>>>>>> DC=DomainDnsZones,DC=samba4,DC=scem,DC=westernsydney,DC=edu,DC=au
>>>>>>>          Default-First-Site-Name\SAMBA4-10 via RPC
>>>>>>>                  DSA object GUID: 
>>>>>>> 7fa7fc88-8d99-4217-b329-7e82324ec084
>>>>>>>
>>>>>>>                  Last attempt @ Wed Mar 21 12:58:13 2018 AEDT 
>>>>>>> failed, result 58
>>>>>>> (WERR_BAD_NET_RESP)
>>>>>>>
>>>>>>>                  10623 consecutive failure(s).
>>>>>>>                  Last success @ Thu Mar  8 14:34:14 2018 AEDT
>>>>>>>
>>>>>>>
>>>>>>>          Any thoughts on why this DC is now not replicating 
>>>>>>> properly? Any
>>>>>>> thoughts on how to remedy this?
>>>>>>>
>>>>>>>
>>>>
>>>>>>
>>>>> You most likely will need to turn up the samba log level to get 
>>>>> additional information but you can start with running 'yum history 
>>>>> list all' and post results. This might help identify the changes 
>>>>> that were made to the OS. Are you using bind or the internal DNS?
>>>>>
>>>>>
>>>>
>>>> I will turn up the logs and test it out.
>>>>
>>>> I use Bind-9.9.4-51 (before update 9.9.4-18)
>>>>
>>>> yum history shows 348 packages that got updated... Bind being one. 
>>>> Will sift through them.
>>>>
>>>> My firewall is very loose. All ports are open for the subnets on 
>>>> which the samba servers need to talk, e.g.:
>>>>
>>>> -A INPUT -s 172.20.0.0/16 -p tcp -m state --state NEW -m tcp -j ACCEPT
>>>> -A INPUT -s 172.20.0.0/16 -p udp -m state --state NEW -m udp -j ACCEPT
>>>>
>>>> When I first set this up with 4.0.0-a2 (or whatever it was right at 
>>>> the beginning), I was not able to work out what ports exactly were 
>>>> needed, hence the loose rules. Now that I see they are documented clearly 
>>>> on the Samba site, I will tighten them up, but not until the issue 
>>>> is resolved.
>>>>
>>>> My samba is compiled from source. I am currently running 4.3.2. It's 
>>>> been running flawlessly so no urgency to update, until the huge 
>>>> security hole was announced the other week. Now I've got to get it 
>>>> done, but want the ailing server going right first - or should I 
>>>> just do the updates and then worry about the ailing server?
>>>>
>>>> Smb.conf:
>>>>
>>>> # Global parameters
>>>> [global]
>>>>     workgroup = SCEM_AD
>>>>     realm = samba4.scem.westernsydney.edu.au
>>>>     netbios name = SAMBA4-10
>>>>     server role = active directory domain controller
>>>>     server services = s3fs, rpc, nbt, wrepl, ldap, cldap, kdc, 
>>>> drepl, winbindd, ntp_signd, kcc, dnsupdate
>>>>
>>>> #        log level = 1 auth:2
>>>> # logs split per machine
>>>>         log file = /var/log/samba/log.%m
>>>>         # 0 means no size limit, so logs are not rotated by size
>>>>         max log size = 0
>>>>
>>>> [netlogon]
>>>>     path = 
>>>> /usr/local/samba/var/locks/sysvol/samba4.scem.westernsydney.edu.au/scripts 
>>>>
>>>>     read only = No
>>>>
>>>> [sysvol]
>>>>     path = /usr/local/samba/var/locks/sysvol
>>>>     read only = No
>>>>
>>>>
>>>> It is the out of the box config from the original provision.
>>>>
>>>>
>>> I myself would hold off updating until you correct the DC's with the 
>>> issues. Anything in the Samba logs or yum history stand out? You can 
>>> try and force replication 'samba-tool drs replicate --full-sync' from 
>>> FirstDC to SecondDC.
>>>
>>
>> The first thing I tried was the forced replication on the NC that was 
>> unhappy:
>>
>> # samba-tool drs replicate Broken-DC Working-DC 
>> DC=DomainDnsZones,DC=samba4,DC=scem,DC=westernsydney,DC=edu,DC=au 
>> --full-sync
>> Replicate from Working-DC to Broken-DC was successful.
>>
>> Then doing the showrepl on all DCs, everything seemed fine.
>>
>>
>> I held off sending this message for a couple of hours, and things are 
>> now showing up as broken again. I now have two DCs with the same 
>> issue, because I accidentally got the direction of the sync wrong. I 
>> went source destination, rather than destination source. I should read 
>> the help a bit better!
>>
>> Anyway, this shows that manual replication seems successful, and that 
>> it might not be a firewall thing, as the second DC that now has the 
>> issue has not been updated in any way, shape, or form.
>>
>> Now the strangest thing is that the two broken-DCs report that 
>> everything is fine between them when I run showrepl. The working-DCs, 
>> however, still show the two broken-DCs as failing.
>>
>>
>>
>>
>>
> Before you try anything further I would suggest you make a good backup 
> of your current DC not exhibiting any replication issues.
> 
> https://wiki.samba.org/index.php/Back_up_and_Restoring_a_Samba_AD_DC

	Right oh. Will get onto that.
> 
> Have you tried redoing the forced replication from a known good DC?
> 

	Yes. The replication says it is successful. When I showrepl from the 
good DC, the issue shows up again.
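
	A minimal sketch, assuming Python 3.8+ on the admin host, of one way to avoid the direction mix-up described in the quoted message above. samba-tool drs replicate takes the destination DC first and the source DC second, so a small wrapper with keyword-only arguments makes the order explicit. The function name build_replicate_cmd is hypothetical; Broken-DC and Working-DC are the placeholder names already used in this thread.

```python
import shlex


def build_replicate_cmd(*, destination_dc, source_dc, naming_context, full_sync=False):
    """Build the argv list for `samba-tool drs replicate`.

    The arguments are keyword-only so the caller has to name which DC is
    the destination (the one pulling changes) and which is the source.
    """
    # Positional order expected by samba-tool: destination, source, naming context.
    cmd = ["samba-tool", "drs", "replicate", destination_dc, source_dc, naming_context]
    if full_sync:
        cmd.append("--full-sync")
    return cmd


if __name__ == "__main__":
    argv = build_replicate_cmd(
        destination_dc="Broken-DC",
        source_dc="Working-DC",
        naming_context="DC=DomainDnsZones,DC=samba4,DC=scem,DC=westernsydney,DC=edu,DC=au",
        full_sync=True,
    )
    # Print the command for review instead of running it.
    print(shlex.join(argv))
```

	The sketch only prints the command; passing the list to subprocess.run would execute it, which is best done after the backup suggested above.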

	If I do a showrepl on one of the bad DCs, it shows all DCs to be okay.
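
	Since each DC gives a different picture, it may help to compare the showrepl output from every DC mechanically. The following is a rough Python sketch that pulls the failing links out of saved showrepl text. The function name failing_links is hypothetical, and the parsing rules (naming contexts start with DC= or CN=, partner lines end in "via RPC") are assumptions based only on the output quoted earlier in this thread, with the mail client's line wrapping undone.

```python
import re

# Matches lines such as "10623 consecutive failure(s)."
FAILURES_RE = re.compile(r"(\d+) consecutive failure")

# Sample taken from the output quoted earlier in the thread (unwrapped).
SAMPLE = r"""
DC=DomainDnsZones,DC=samba4,DC=scem,DC=westernsydney,DC=edu,DC=au
        Default-First-Site-Name\SAMBA4-10 via RPC
                DSA object GUID: 7fa7fc88-8d99-4217-b329-7e82324ec084
                Last attempt @ Wed Mar 21 12:58:13 2018 AEDT failed, result 58 (WERR_BAD_NET_RESP)
                10623 consecutive failure(s).
                Last success @ Thu Mar  8 14:34:14 2018 AEDT
"""


def failing_links(showrepl_text):
    """Return (naming_context, partner, failure_count) for each failing link."""
    failing = []
    naming_context = None
    partner = None
    for raw_line in showrepl_text.splitlines():
        line = raw_line.strip()
        if line.startswith(("DC=", "CN=")):
            # Start of a new naming context block.
            naming_context = line
        elif line.endswith(" via RPC"):
            # Replication partner within the current naming context.
            partner = line[: -len(" via RPC")]
        else:
            match = FAILURES_RE.match(line)
            if match and int(match.group(1)) > 0:
                failing.append((naming_context, partner, int(match.group(1))))
    return failing


if __name__ == "__main__":
    for nc, dc, count in failing_links(SAMPLE):
        print(f"{dc} -> {nc}: {count} consecutive failures")
```

	Running this against the output saved from each DC would show at a glance which DCs report failures and against which partner.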

> You can try to further troubleshoot the issues and attempt to resolve, 
> but the easiest thing IMO would be to join new DC's to the domain. 
> Remove the other two DC's from the domain and never join them again.

	I will look into that. We have one site with no DC, so this would be a 
good opportunity to introduce one. If that one is okay, then I can drop 
the broken ones and set up new ones as you suggest.
