[Samba] DC Upgrade from 4.1.7 to 4.6.7

HB hb.transfert at gmail.com
Sat Aug 26 11:47:17 UTC 2017


> -----Original Message-----
> From: Andrew Bartlett [mailto:abartlet at samba.org]
> Sent: Saturday, 26 August 2017 14:25
> To: HB; samba at lists.samba.org
> Subject: Re: [Samba] DC Upgrade from 4.1.7 to 4.6.7
> 
> On Sat, 2017-08-26 at 13:46 +0400, HB via samba wrote:
> > > -----Original Message-----
> > > From: Andrew Bartlett [mailto:abartlet at samba.org]
> > > Sent: Saturday, 26 August 2017 12:40
> > > To: HB; samba at lists.samba.org
> > > Subject: Re: [Samba] DC Upgrade from 4.1.7 to 4.6.7
> > >
> > > On Sat, 2017-08-26 at 12:32 +0400, HB via samba wrote:
> > > >
> > > > Here is the output of samba-tool dbcheck :
> > > > # samba-tool dbcheck
> > > > Checking 1791 objects
> > > > ltdb: tdb(/usr/local/samba/private/sam.ldb.d/DC%3DDOMAINDNSZONES,DC%3DCIRAD-REUNION,DC%3DCIRAD,DC%3DFR.ldb): tdb_rec_read bad magic 0xd9fee666 at offset=1115322096
> > > >
> > > > ltdb: tdb(/usr/local/samba/private/sam.ldb.d/DC%3DDOMAINDNSZONES,DC%3DCIRAD-REUNION,DC%3DCIRAD,DC%3DFR.ldb): tdb_rec_read bad magic 0xd9fee666 at offset=1115322096
> > > >
> > > > Checked 1791 objects (0 errors)
> > > > #
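
The ltdb lines above name the damaged file with the partition DN percent-encoded in the file name. As a small aid, here is a minimal Python sketch that pulls the file, the decoded DN, the magic value and the offset out of such a line. The regular expression is only modelled on the message shape seen in this output (an assumption, not taken from the tdb sources), and the function name parse_ltdb_error is hypothetical.

```python
import re
from urllib.parse import unquote

# Pattern modelled on the ltdb lines in the dbcheck output above.
# Assumption: other tdb versions may word the message differently.
LTDB_ERROR = re.compile(
    r"tdb\((?P<path>[^)]+)\): tdb_rec_read bad magic "
    r"(?P<magic>0x[0-9a-fA-F]+) at offset=(?P<offset>\d+)"
)


def parse_ltdb_error(line):
    """Return details of a 'bad magic' line as a dict, or None if no match."""
    match = LTDB_ERROR.search(line)
    if match is None:
        return None
    path = match.group("path")
    filename = path.rsplit("/", 1)[-1]
    # The partition files under sam.ldb.d/ are named "<percent-encoded DN>.ldb".
    if filename.endswith(".ldb"):
        filename = filename[: -len(".ldb")]
    return {
        "path": path,
        "partition_dn": unquote(filename),
        "magic": int(match.group("magic"), 16),
        "offset": int(match.group("offset")),
    }


if __name__ == "__main__":
    sample = (
        "ltdb: tdb(/usr/local/samba/private/sam.ldb.d/"
        "DC%3DDOMAINDNSZONES,DC%3DCIRAD-REUNION,DC%3DCIRAD,DC%3DFR.ldb): "
        "tdb_rec_read bad magic 0xd9fee666 at offset=1115322096"
    )
    print(parse_ltdb_error(sample))
```

Run against the line above, the decoded DN points at the DomainDnsZones partition file, so only that one file under sam.ldb.d/ is reported as damaged here.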
> > > >
> > > > But, if I run 'samba-tool dbcheck --cross-ncs' (as suggested in Updating_Samba) :
> > > >
> > > > # samba-tool dbcheck --cross-ncs
> > > > ltdb: tdb(/usr/local/samba/private/sam.ldb.d/DC%3DDOMAINDNSZONES,DC%3DCIRAD-REUNION,DC%3DCIRAD,DC%3DFR.ldb): tdb_rec_read bad magic 0xd9fee666 at offset=1115322096
> > > >
> > > > ERROR(ldb): uncaught exception - Indexed and full searches both failed!
> > > >
> > > >   File "/usr/local/samba/lib64/python2.6/site-packages/samba/netcmd/__init__.py", line 175, in _run
> > > >     return self.run(*args, **kwargs)
> > > >   File "/usr/local/samba/lib64/python2.6/site-packages/samba/netcmd/dbcheck.py", line 136, in run
> > > >     controls=controls, attrs=attrs)
> > > >   File "/usr/local/samba/lib64/python2.6/site-packages/samba/dbchecker.py", line 123, in check_database
> > > >     res = self.samdb.search(base=DN, scope=scope, attrs=['dn'], controls=controls)
> > > > #
> > >
> > > Your DB is corrupt.  Do you have another replica, or a good backup?
> > >
> > > Andrew Bartlett
> >
> > Hi Andrew,
> >
> > I don't have a replica; this DC is the only one.
> > I have backups, but only as daily VM backups. I tested the latest one and
> > got the same errors.
> > I suspect the DB may have been corrupt for a long time, although the AD
> > is running OK.
> >
> > What can I do to either repair the DB or export/import the AD?
> 
> This is where it gets messy.  You can dump the database with tdbdump and
> ldbdump; you can even do an emergency dump with ldbdump.
> 
> However, that doesn't tell you what you are missing.  You then need to
> compare a reconstructed database (rebuild each ldb file in the sam.ldb.d/
> directory via ldbadd) with an older backup, using ldapcmp, to work out which
> records are missing.  Naturally, stop Samba before you copy the files, and do
> all this on another VM.
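
As a rough first pass at the comparison step described above (not a replacement for ldapcmp, which compares attributes as well), here is a minimal Python sketch that lists the DNs present in an LDIF dump of an older backup but absent from an LDIF dump of the rebuilt database. It assumes both dumps are plain LDIF text; the function names are hypothetical, and lower-casing the DNs before comparing is a simplification.

```python
import base64


def ldif_dns(ldif_text):
    """Collect the DNs found in LDIF text, lower-cased for comparison."""
    # Unfold continuation lines: in LDIF, a line starting with a single
    # space continues the previous line.
    unfolded = []
    for line in ldif_text.splitlines():
        if line.startswith(" ") and unfolded:
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)

    dns = set()
    for line in unfolded:
        if line.startswith("dn:: "):
            # "dn:: " marks a base64-encoded DN.
            dns.add(base64.b64decode(line[5:]).decode("utf-8").lower())
        elif line.startswith("dn: "):
            dns.add(line[4:].strip().lower())
    return dns


def missing_records(backup_ldif, rebuilt_ldif):
    """Return DNs that are in the backup dump but not in the rebuilt one."""
    return sorted(ldif_dns(backup_ldif) - ldif_dns(rebuilt_ldif))


if __name__ == "__main__":
    # Tiny made-up dumps, for illustration only.
    backup = "dn: DC=a,DC=example\nobjectClass: top\n\ndn: DC=b,DC=example\nobjectClass: top\n"
    rebuilt = "dn: DC=a,DC=example\nobjectClass: top\n"
    for dn in missing_records(backup, rebuilt):
        print(dn)
```

This only shows which records disappeared entirely; records that survived with damaged attributes would still need an attribute-level comparison.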
> 
> I helped one organisation (a school) do this once before (which is why
> ldbdump exists), but in the end they just hobbled along until the end of the
> school year and started over.
> 
> How large/critical is this domain?  Also, turn up the debug level on the
> join (e.g. -d5) and see if you get a clue about which record is corrupt
> there.
> 
> Thankfully it is 'just' your DNS partition, and that can be re-created a little
> more easily, as most things in there get fixed when the clients re-register
> DNS, and when samba_dnsupdate is run on the server.
> 
> Also, carefully check your VM infrastructure.  Last time I blamed DRBD, which
> was being used for a 'hot/cold' spare system, for not preserving barriers (the
> thing that makes fsync() and the sync command work) and so allowing a
> corrupt DB rather than a safe transaction recovery.  I never found out if my
> suspicions were correct.
> 
> Finally, this is why we suggest backups with the samba_backup script and a
> second DC.  The first will find TDB level corruption, and the second will
> generally not replicate a corrupt object, meaning you notice corruption in real
> time and have a good backup from before it all went bad.
> 
> Sadly VM snapshots do none of these things, and will have partial writes in
> them, as they don't stop Samba writing to the DB.
> 
> I wish you the best with this, and if this is a large or important domain I
> suggest you engage some professional help from our commercial
> support page.
> 
> Sorry,

The domain has about 350 PCs, 300 users, and some NAS devices acting as file servers.

The complete output of the -d5 join command on the new DC (i.e. "# samba-tool domain join cirad-reunion.cirad.fr DC -U"CIRAD-REUNION\administrator" --verbose -d5 --dns-backend=SAMBA_INTERNAL") is here:
 

At the beginning, I can see several errors like this : 
"Failed to get kerberos credentials: kinit for administrator at CIRAD-REUNION failed (Cannot contact any KDC for requested realm)
Cannot reach a KDC we require to contact (null) : kinit for administrator at CIRAD-REUNION failed (Cannot contact any KDC for requested realm)
SPNEGO(gssapi_krb5) creating NEG_TOKEN_INIT for ldap/mafate.cirad-reunion.cirad.fr failed (next[ntlmssp]): NT_STATUS_NO_LOGON_SERVERS"

although 'kinit administrator' works when I run it from the command line.

At the end, something seems to be wrong with secrets.ldb, which appears to generate the WERR_DS_DRA_INTERNAL_ERROR.
Here are the last lines of the output:

<...>
                    nc_linked_attributes_count: 0x00000000 (0)
                    linked_attributes_count  : 0x00000000 (0)
                    linked_attributes        : NULL
                    drs_error                : WERR_OK
            result                   : WERR_DS_DRA_INTERNAL_ERROR
ldb_wrap open of secrets.ldb
Could not find machine account in secrets database: Failed to fetch machine account password for CIRAD-REUNION from both secrets.ldb (Could not find entry to match filter: '(&(flatname=CIRAD-REUNION)(objectclass=primaryDomain))' base: 'cn=Primary Domains': No such object: dsdb_search at ../source4/dsdb/common/util.c:4576) and from /usr/local/samba/private/secrets.tdb: NT_STATUS_CANT_ACCESS_DOMAIN_INFO
ERROR(runtime): uncaught exception - (8442, 'WERR_DS_DRA_INTERNAL_ERROR')
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/netcmd/__init__.py", line 176, in _run
    return self.run(*args, **kwargs)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/netcmd/domain.py", line 661, in run
    machinepass=machinepass, use_ntvfs=use_ntvfs, dns_backend=dns_backend)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/join.py", line 1269, in join_DC
    ctx.do_join()
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/join.py", line 1177, in do_join
    ctx.join_replicate()
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/join.py", line 918, in join_replicate
    replica_flags=ctx.replica_flags)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/drs_utils.py", line 254, in replicate
    (level, ctr) = self.drs.DsGetNCChanges(self.drs_handle, req_level, req)
Adding CN=BENARE,OU=Domain Controllers,DC=cirad-reunion,DC=cirad,DC=fr
Adding CN=BENARE,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=cirad-reunion,DC=cirad,DC=fr
Adding CN=NTDS Settings,CN=BENARE,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=cirad-reunion,DC=cirad,DC=fr
Adding SPNs to CN=BENARE,OU=Domain Controllers,DC=cirad-reunion,DC=cirad,DC=fr
Setting account password for BENARE$
Enabling account
Calling bare provision
Provision OK for domain DN DC=cirad-reunion,DC=cirad,DC=fr
Starting replication
Replicating critical objects from the base DN of the domain
Done with always replicated NC (base, config, schema)
Replicating DC=DomainDnsZones,DC=cirad-reunion,DC=cirad,DC=fr
Join failed - cleaning up
Deleted CN=BENARE,OU=Domain Controllers,DC=cirad-reunion,DC=cirad,DC=fr
Deleted CN=NTDS Settings,CN=BENARE,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=cirad-reunion,DC=cirad,DC=fr
Deleted CN=BENARE,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=cirad-reunion,DC=cirad,DC=fr
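
To narrow down where the join stops, the verbose output can be scanned for the last naming context announced before "Join failed". Here is a minimal Python sketch, assuming the log keeps the "Replicating <DN>" and "Join failed" wording shown above; the function name failing_naming_context is hypothetical.

```python
import re

# Matches lines such as "Replicating DC=DomainDnsZones,DC=...", but not
# "Replicating critical objects from the base DN of the domain".
REPLICATING = re.compile(r"^Replicating (?P<nc>(?:DC|CN)=.+)$")


def failing_naming_context(log_lines):
    """Return the last 'Replicating <DN>' target seen before 'Join failed'.

    Returns None if the log contains no 'Join failed' line.
    """
    last_nc = None
    for raw in log_lines:
        line = raw.strip()
        if line.startswith("Join failed"):
            return last_nc
        match = REPLICATING.match(line)
        if match:
            last_nc = match.group("nc")
    return None


if __name__ == "__main__":
    # Lines copied from the tail of the join output above.
    tail = [
        "Starting replication",
        "Replicating critical objects from the base DN of the domain",
        "Done with always replicated NC (base, config, schema)",
        "Replicating DC=DomainDnsZones,DC=cirad-reunion,DC=cirad,DC=fr",
        "Join failed - cleaning up",
    ]
    print(failing_naming_context(tail))
```

On the lines above, this points at the DomainDnsZones partition, the same partition named in the dbcheck errors.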


Henri 

> Andrew Bartlett
> 
> --
> Andrew Bartlett                       http://samba.org/~abartlet/
> Authentication Developer, Samba Team  http://samba.org
> Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba




