[Samba] Samba Join as DC failed

Donaldson Jeff Jeff.Donaldson at ncs.k12.de.us
Fri Oct 18 10:48:26 MDT 2013


Andrew,

The number of records indicated in the last email was based on these lines that were returned during the failed samba join. This is the last line of that sequence.

Partition[DC=DomainDnsZones,DC=ncs,DC=k12,DC=de,DC=us] objects[94443/94443] linked_values[0/0]
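
A minimal sketch, assuming the join output always uses the "Partition[...] objects[n/m] linked_values[n/m]" shape shown above: it pulls the partition name and object counts out of such a line, which makes it easier to see which partition a count belongs to. The regular expression and the helper name parse_progress are illustrative, not part of Samba.

```python
import re

# Pattern for the replication progress lines printed during the join,
# e.g. "Partition[DC=...] objects[94443/94443] linked_values[0/0]".
# The exact format is assumed from the single line quoted in this thread.
PROGRESS_RE = re.compile(
    r"Partition\[(?P<partition>[^\]]+)\]\s+"
    r"objects\[(?P<done>\d+)/(?P<total>\d+)\]\s+"
    r"linked_values\[(?P<lv_done>\d+)/(?P<lv_total>\d+)\]"
)


def parse_progress(line):
    """Return (partition, objects_done, objects_total), or None if no match."""
    m = PROGRESS_RE.search(line)
    if m is None:
        return None
    return m.group("partition"), int(m.group("done")), int(m.group("total"))


line = ("Partition[DC=DomainDnsZones,DC=ncs,DC=k12,DC=de,DC=us] "
        "objects[94443/94443] linked_values[0/0]")
print(parse_progress(line))
```

Running this over every progress line of a saved join log would show the counts reported for each partition side by side.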

I think we're probably closer to 2,200 objects, so I apologize for any confusion. I also ran the ldbsearch you requested. Here's the output...

root@ncssamba1:~# ldbsearch --show-deleted -H /usr/local/samba/private/sam.ldb -s base -b 'CN=test_user,CN=Deleted Objects,DC=ncs,DC=k12,DC=de,DC=us'
search error - No such Base DN: CN=test_user,CN=Deleted Objects,DC=ncs,DC=k12,DC=de,DC=us

This account was producing errors in the log on this server. Since this was an old account no longer used, I used ADUC on a 2008 R2 server to delete the user account. I thought at the time this would eliminate the errors in the log. I believe it instead created an orphaned record and I'm not sure how to go about getting it removed cleanly.

As Dave mentioned in a previous email, we accidentally installed the 4.2 alpha from git. Here's the output from samba -V, if it helps.

root@ncssamba1:~# samba -V
Version 4.2.0pre1-GIT-b505111

We are now trying to get two servers running the 4.1.0 stable release up and running so that we can eventually phase out the 4.2 servers. Here's the output from samba -V on the server that is unsuccessfully trying to join the domain.

root@ncsauth2:~# samba -V
Version 4.1.0

Please let me know if there is anything else you need to help us troubleshoot the problem. We are truly grateful for your support!

Regards,
Jeff

Jeff Donaldson
Technology Director
Newark Charter School
jeff.donaldson at ncs.k12.de.us
(302) 369-2001 ext: 425

________________________________
From: David Hopkins <dahopkins429 at gmail.com>
Sent: Friday, October 18, 2013 10:58 AM
To: Andrew Bartlett
Cc: Donaldson Jeff; samba at lists.samba.org; O'Neill James
Subject: Re: [Samba] Samba Join as DC failed

Jeff,

My response for the history of the domain:

Our prior authentication system was based on a custom Samba3+OpenLDAP solution (originally developed with people from the K12LTSP list). This authentication system had been very stable for 10+ years. We installed the latest version of Samba using git (perhaps unfortunately, because we pulled 4.2) and upgraded to Samba4 to provide better support for Windows 7 and Server 2008/2012 systems. We then installed a second authentication server using the same process (it also pulled the 4.2 version) and joined that server to the domain as a second DC.

Authentication was working very well until recently, when the first server began to randomly stop responding to DNS requests. We decided to install Samba 4.1 (in an effort to move back to the stable version). It was on trying to join the Samba 4.1 server to the domain as an AD DC that we got the above issue.

We have two zones in DNS (10.179.0.0 and 10.186.0.0, subnet mask 255.255.224.0). The server with DNS issues is in the 10.179.0.0 zone. The other server is working properly. Replication seems to be working properly.

As for the size of the domain, did I misread the screen? The number reported is the number that was returned during the join operation.


Sincerely,
Dave


On Thu, Oct 17, 2013 at 10:57 PM, Andrew Bartlett <abartlet at samba.org> wrote:
On Thu, 2013-10-17 at 12:50 +0000, Donaldson Jeff wrote:
> Attempted to join domain via
>
> ./bin/samba-tool domain join ncs.k12.de.us DC -Uadministrator --realm=ncs.k12.de.us
>
> But this failed with
>
> Committing SAM database
> Failed to apply linked attribute change 'attribute 'isRecycled': invalid modify flags on 'CN=test_user,CN=Deleted Objects,DC=ncs,DC=k12,DC=de,DC=us': 0x0'
> dn: <GUID=4d560497-5f00-4d97-96a0-47ae1799ba92>;<SID=S-1-5-21-276688905-1455118844-2751846679-67110292>;CN=test_user,CN=Deleted Objects,DC=ncs,DC=k12,DC=de,DC=us
>
> Join failed - cleaning up
> checking sAMAccountName
> ERROR(ldb): uncaught exception - attribute 'isRecycled': invalid modify flags on 'CN=test_user,CN=Deleted Objects,DC=ncs,DC=k12,DC=de,DC=us': 0x0
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/__init__.py", line 175, in _run
>     return self.run(*args, **kwargs)
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/domain.py", line 552, in run
>     machinepass=machinepass, use_ntvfs=use_ntvfs, dns_backend=dns_backend)
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line 1169, in join_DC
>     ctx.do_join()
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line 1074, in do_join
>     ctx.join_replicate()
>   File "/usr/local/samba/lib/python2.7/site-packages/samba/join.py", line 848, in join_replicate
>     ctx.local_samdb.transaction_commit()
>
> A suggestion found here, https://irclog.samba.org/2013/09/20130908-Sun.log, is to use
>
> ldbedit -H /usr/local/samba/private/sam.ldb --show-deleted
> '(isDeleted=*)'

This is not good advice for the general case.  Deleting the objects
manually breaks replication (because the purpose of the deleted object
is to replicate the fact that it is deleted!), and should be a last
resort.

> to manually delete all the accounts with this attribute. When doing
> this I should stop samba on all DCs and then edit the local sam.ldb on
> each. Then restart samba on the DC and re-try joining the domain after
> deleting all files /usr/local/samba/private on the DC I am attempting
> to join to the domain as a DC?
>
> I also saw on the Samba list that Nikos Mita had a similar issue. It was
> suggested to try using samba-tool dbcheck --fix. Should I try this first? I'm
> just concerned whether this would complete or not. I have 94,443
> records and this server only has 8GB of memory.
>
> I want to make certain I get the sequence correct.
>
> Also, before doing any of the above, I will make a copy of the private
> directories on the DC just in case ...
>
> Any help is appreciated. Thanks!

G'Day,

It seems to be the week for very, very large Samba installations!

I've looked at the code, and I know the line that fails, but I don't
know why this happens.  Can you show me the failing object with
ldbsearch?

ldbsearch --show-deleted -H /usr/local/samba/private/sam.ldb -s base -b
'CN=test_user,CN=Deleted Objects,DC=ncs,DC=k12,DC=de,DC=us'

The thing is, an object that has isRecycled set on it should not be able
to get to the line of code that fails, so I'm quite puzzled.  I can fix
the 'error' simply (just need to create a new blank modification, rather
than re-using a search result), but I first want to know why it is
wrong.

Can you also let me know the full history of this domain?  A user that
is deleted should have a name with "DEL" and a GUID in it.
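
A minimal sketch of that naming check, assuming the usual Active Directory convention that a deleted object's RDN becomes "<old name>\0ADEL:<objectGUID>" (where \0A is the escaped newline character in the string form of the DN). The helper names and the second sample DN are constructed for illustration; only the first DN comes from the error output.

```python
import re

# Assumed tombstone RDN shape: "<attr>=<old name>\0ADEL:<objectGUID>".
TOMBSTONE_RDN_RE = re.compile(
    r"^[^=,]+=.*\\0ADEL:"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def first_rdn(dn):
    """Return the first RDN of a DN, splitting on the first unescaped comma."""
    return re.split(r"(?<!\\),", dn, maxsplit=1)[0]


def looks_like_tombstone(dn):
    """True if the first RDN carries the DEL:<GUID> suffix."""
    return TOMBSTONE_RDN_RE.match(first_rdn(dn)) is not None


# DN exactly as reported in the failed join: no DEL:<GUID> suffix.
reported = "CN=test_user,CN=Deleted Objects,DC=ncs,DC=k12,DC=de,DC=us"
# Hypothetical DN showing the shape a normally deleted object would have.
expected_shape = (r"CN=test_user\0ADEL:4d560497-5f00-4d97-96a0-47ae1799ba92,"
                  "CN=Deleted Objects,DC=ncs,DC=k12,DC=de,DC=us")

print(looks_like_tombstone(reported))
print(looks_like_tombstone(expected_shape))
```

The DN from the error fails this check, which is consistent with the object not having been renamed in the usual way when it was deleted.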

The second part, once I have that, is working out why our tests didn't
cover this code path, and working out how to make them do that.

But while you won't need to run dbcheck now, you will at some point in
the future.  What we clearly do need is for a few of our very large
installations to club together and work on/isolate the remaining issues
at the scale you have.

Thank you so much for taking Samba to the extreme, and I will do what I
can to best assist you.

Andrew Bartlett

--
Andrew Bartlett
http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org
Samba Developer, Catalyst IT                   http://catalyst.net.nz




