[Samba] Samba 4 AD - Samba Fails to Start, hdb_samba4_create_kdc (setup KDC database) failed

Rowland Penny rpenny at samba.org
Sun Jan 3 08:37:37 UTC 2016

On 03/01/16 06:00, JS wrote:
> L.P.H. van Belle writes:
>> Ok,
> Hi Louis,
> Thank you again for taking the time to help me out, I do appreciate it, and
> I hope you had a safe and Happy New Year's eve.  I'm going to work my way
> through the questions/comments in your response from top to bottom:
>> First thing I see:
>> NTP
>> drwxr-x---   2 root root         4096 Dec 28 21:12 ntp_signd
>> should be root:ntp
> No idea why the ownership is incorrect for that directory but I have
> executed the following to fix it:
> sudo chown -R root:ntp /var/lib/samba/ntp_signd
> and now the security settings on that dir look like:
> sudo ls -la /var/lib/samba/ntp_signd/
> total 8
> drwxr-x--- 2 root ntp  4096 Dec 28 21:12 .
> drwxr-xr-x 8 root root 4096 Dec 13 21:07 ..
> srwxrwxrwx 1 root ntp     0 Dec 28 21:12 socket
>> drwxrwx---+  3 root BUILTIN\administrators    4096 Apr 28  2015 sysvol
>> yours shows 3000000 while mine gives: BUILTIN\administrators
>> but I have winbind/nsswitch etc. configured on my DC, don't ask why, but I
>> need it, and it works well for me.
> Regarding the SYSVOL permissions, I checked the permissions of
> /var/lib/samba/ on another PDC I have deployed on a different network and
> ntp_signd is owned by root:3000000 as well.
>> Can you tell more about the hardware failure?
>> Disk problems, power outage, etc.; what exactly happened?
>> Did you see a filesystem check the first time starting up after the failure?
> The initial hardware failure was a RAID array failure, I replaced the failed
> devices and rebuilt the array and then rebuilt their domain from scratch
> provisioning under a new domain.
>> I assume it's the only server, so no other DCs.
> Yes, that is correct, this machine is the only domain controller on this
> network.
>> Stop all samba processes and backup at least these folders.
>> /etc/samba
>> /var/lib/samba
>> /var/cache/samba
> Samba fails at boot, I've already made a couple of safety backups but for
> good measure I stopped smbd, nmbd, and samba services and backed up the
> directories you listed.

Just how are you starting Samba? If you are running Samba as an AD DC,
you should only start the samba daemon, yet you say that you 'stopped
smbd, nmbd, and samba services'. 'nmbd' should not be running on an AD
DC; it interferes with the 'nbt' service built into the samba daemon.
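
A quick way to check for this is to look at the process list. The following is only a minimal sketch: the function name check_ad_dc_daemons is made up for illustration, and it does nothing more than look for a process called 'nmbd' in the list it is given.

```shell
# Hypothetical helper (not part of Samba): warn if nmbd is running on an AD DC.
# $1 is a newline-separated list of process names, e.g. from: ps -eo comm=
check_ad_dc_daemons() {
    procs="$1"
    # -x matches the whole line, so only a process named exactly 'nmbd' counts
    if printf '%s\n' "$procs" | grep -qx 'nmbd'; then
        echo "WARNING: nmbd is running; on an AD DC only 'samba' should be started"
        return 1
    fi
    echo "OK: no nmbd process found"
    return 0
}

# Example use on a live system:
#   check_ad_dc_daemons "$(ps -eo comm=)"
```

Seeing smbd in the list is not a problem in itself, as the samba daemon starts the file server component itself, which is why the sketch only looks for nmbd.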

>> When you run:  samba-tool fsmo show
>> You probably get an error...
> I do receive an error, note I did not start any of the aforementioned
> services prior to executing the samba-tool command below:
> sudo samba-tool fsmo show
> ldb_wrap open of secrets.ldb
> ERROR(assert): uncaught exception
>    File "/usr/lib/python2.7/dist-packages/samba/netcmd/__init__.py", line 175, in _run
>      return self.run(*args, **kwargs)
>    File "/usr/lib/python2.7/dist-packages/samba/netcmd/fsmo.py", line 196, in run
>      assert len(res) == 1

This is a known problem that I have fixed in master; mind you, your version of
fsmo.py will only show five of the seven roles. Your problem seems to be
that at least one of your FSMO roles doesn't have a role owner, so
when the Python code asserts that exactly one result came back
(assert len(res) == 1), it throws an error.
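
You can look at this by hand with ldbsearch, without going through samba-tool. What follows is a rough sketch only: count_role_owners is a made-up helper name, the sam.ldb path is an assumption (it is the usual location on Debian/Ubuntu packages), and the DN is the one from the log lines quoted in this thread. It just counts fSMORoleOwner values in whatever LDIF it is fed.

```shell
# Hypothetical helper: count fSMORoleOwner values in ldbsearch-style LDIF on stdin.
# A healthy role object should give exactly 1.
count_role_owners() {
    # grep -c prints the count; '|| true' keeps a count of 0 from being
    # treated as a failure by the calling shell
    grep -ci '^fSMORoleOwner:' || true
}

# Example use (path and DN are assumptions, adjust for your system):
#   ldbsearch -H /var/lib/samba/private/sam.ldb \
#       -b 'DC=one,DC=cliffbells,DC=com' -s base fSMORoleOwner | count_role_owners
```

If this prints 0 for a role object, that role has no owner recorded, which would match the failed assertion above.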

>> , so try the following.
>> samba-tool fsmo seize
> I receive a second error when executing the seize command:
> sudo samba-tool fsmo seize
> ldb_wrap open of secrets.ldb
> ERROR: Invalid FSMO role.
>> (I don't think it will work, but give it a try; any output is most welcome)
>> These do worry me.
>> Failed to find object DC=one,DC=cliffbells,DC=com for attribute
>> fsmoRoleOwner - Cannot find DN
>> DC=one,DC=cliffbells,DC=com to get attribute fsmoRoleOwner for reference
>> dn: (null)
>> ./source4/dsdb/common/util.c:1877(samdb_is_pdc)
>>    Failed to find if we are the PDC for this ldb: Searching for
>> fSMORoleOwner in DC=one,DC=cliffbells,DC=com
>> failed: Cannot find DN DC=one,DC=cliffbells,DC=com to get attribute
>> fsmoRoleOwner for reference
>> dn: (null)
>> which looks like your samba DB is corrupted, probably due to the hardware
>> failure.
> If your hunch that the database is corrupt holds true it couldn't be from
> hardware failure as this domain was provisioned after that incident.  I do
> believe I may have traced where any possible corruption might have
> originated though...  I (apparently foolishly) started backing up
> /var/lib/samba with CrashPlan after the hardware failure incident... I'm
> guessing that was a bad idea.

As far as I am aware, you cannot back up a running Samba AD DC with
anything that doesn't use tdbbackup; otherwise you have to stop samba first.
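
If you go the 'stop samba first' route, an offline copy could look something like the sketch below. This is not the samba_backup script from the wiki; backup_dirs is a made-up helper, and the service name in the usage comment is an assumption based on the stock Ubuntu package.

```shell
# Hypothetical helper: backup_dirs DEST DIR...
# Archives each DIR into DEST as a gzip'd tarball. Only run it while samba is stopped.
backup_dirs() {
    dest="$1"; shift
    mkdir -p "$dest" || return 1
    for dir in "$@"; do
        # Name the archive after the path, e.g. /var/lib/samba -> var_lib_samba.tar.gz
        name=$(printf '%s' "$dir" | sed 's|^/||; s|/|_|g')
        tar -czf "$dest/$name.tar.gz" -C "$(dirname "$dir")" "$(basename "$dir")" || return 1
    done
}

# Example use (service name and destination are assumptions):
#   service samba-ad-dc stop
#   backup_dirs /root/samba-backup /etc/samba /var/lib/samba /var/cache/samba
#   service samba-ad-dc start
```

The resulting archives are plain files, so they are safe to hand to something like CrashPlan for offsite storage.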

>> Do you have a backup, made with samba_backup?
>> (shown here:
>> https://wiki.samba.org/index.php/Backup_and_restore_an_Samba_AD_DC )
>> Because I think your db is corrupted and beyond recovery.
> No, I do not have that backup mechanism implemented, and from reading that
> wiki page's notes about backing up live databases I have come to the
> conclusion that CrashPlan backed up /var/lib/samba/ while the databases were
> live and irreparably damaged them.  I don't know what the relationship
> between /var/lib/samba/ and /var/cache/samba/ is exactly, but I assume that
> any backup I had created via CrashPlan (if it had worked instead of wreaking
> havoc) probably wouldn't have been valid lacking the /var/cache/samba/
> directory contents... I will be implementing the Samba backup script from
> your wiki link immediately on the other Samba ADCs I have deployed and will
> utilize it here when I've rebuilt the domain, using CrashPlan for offsite
> storage of archives it creates.
> Which leads us to your closing statement:
>> If you have backed up:
>> /etc/samba
>> /var/lib/samba
>> /var/cache/samba
>> you can remove the contents of
>> /var/lib/samba
>> /var/cache/samba
>> and reprovision, based on the posts here and the things I see.
>> If you have any backup that also contains the samba databases, that's the
>> first thing you can try.
>> Greetz,
>> Louis
> Other than the python error I received after running samba-tool fsmo show, I
> believe I've built a pretty solid case for poor backup strategy being the
> cause of this failure, and that reprovisioning the domain is my only course
> of action at this time.  If you believe I'm getting ahead of myself, or if
> you think that Python error could lead to another failure after I've
> reprovisioned, please let me know.  I intend to execute the new domain
> provisioning tomorrow (Sunday Jan 03 2016) in the late afternoon/early
> evening (EST), and would hate to go through the process of rebuilding their
> infrastructure only to have a Python issue trash the domain again.
> Thanks again Louis et al for helping me troubleshoot this issue, I'm still
> green when it comes to Samba.

One of your problems is that you are using the stock Ubuntu samba, which
is getting a bit long in the tooth now. Can I suggest you use either the
latest freely available samba from Sernet or, better still, compile it
yourself and use the latest version, 4.3.3? This will get you a much
improved fsmo.py and will also cover you for several CVEs.

> Kind Regards,
> JS
