samba3 to samba4 migration a success story (detailed version) and thanks a lot to abartlet
Thorsten Trautwein-Veit
thorsten.trautwein-veit at schulergroup.com
Mon May 7 06:29:26 MDT 2012
I want to inform the list about a successful migration of our Samba 3
domain.
Maybe someone will find some pointers while doing something similar.
Please keep in mind that all information depends on my machines and
may not meet your needs!
I started on 07.04.12 (German Easter weekend). I had to migrate our
PDC (Samba 3.4.3) and BDC (Samba 3.5.3) with LDAP backend, our main
file member server (Samba 3.6.1, OpenLDAP backend), two calculation
machines (number crunchers) with the same Samba versions and OpenLDAP
backends, and one DXM (Data eXchange Manager).
On our DCs there are 201 user accounts and 117 groups; we have 268
computer accounts as well. The operating system for all servers is
Debian squeeze. Before changing our working domain I tested the whole
upgrade process in VMware (but I was hit by reality later).
Our PDC is a Xen virtual machine and our BDC is real hardware; more on
this later.
On Saturday I shut down all remaining clients and made the last backup
before I started. It was very handy to have a current LDIF of my
OpenLDAP directory. I followed the steps in
https://wiki.samba.org/index.php/Samba4/HOWTO
I started with our PDC.
Step 1 - Download Samba4
I used
"git clone http://gitweb.samba.org/samba.git samba-master; cd samba-master"
Hint: do not forget to export your proxy, something like this:
"export http_proxy=http://<username>:<password>@proxy.your-firm.something<:3128>"
This takes a while depending on your Internet bandwidth.
Step 2 - Compile Samba4
I got compile errors with "./configure.developer
--prefix=/usr/local/samba-4-20120403" and was advised via IRC to add
the parameter "--abi-check-disable". The main issue was that I am using
a different version of gdb on my system and the ABI checks seem to be
tied to a specific gdb version. I got the compile error only in my
original x64 environment; under 32 bit everything compiled fine.
Step 3 - Install Samba4
A plain "make install" did what it should. In my case I linked my Samba
installation to /usr/local/samba for my own convenience by doing
"ln -s /usr/local/samba-4-20120408 /usr/local/samba". It is just for
laziness and easy updating of my installation.
I skipped Step 4 - Provision Samba4 through Step 7 because I had to do
an upgrade and not a new install.
So after Step 3 I continued with
Step 8 - Configure DNS
Because Debian's bind packages are too old for my needs I downloaded
bind9_9.8.1.dfsg.P1.orig.tar.gz and compiled it with:
"./configure --with-dlz-dlopen=yes --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --sysconfdir=/etc/bind --localstatedir=/var
--enable-threads --enable-largefile --with-libtool --enable-shared
--enable-static"
Note that --with-dlz-dlopen=yes is essential for Samba's
samba/lib/bind9/dlz_bind9.so to do dynamic nameserver updates.
I edited the following files to my needs:
/etc/bind/named.conf.options:
options {
directory "/var/cache/bind";
// If there is a firewall between you and nameservers you want
// to talk to, you may need to fix the firewall to allow multiple
// ports to talk. See http://www.kb.cert.org/vuls/id/800113
// If your ISP provided one or more IP addresses for stable
// nameservers, you probably want to use them as forwarders.
// Uncomment the following block, and insert the addresses replacing
// the all-0's placeholder.
forwarders {
153.3.XX.XX;
};
auth-nxdomain no; # conform to RFC1035
listen-on-v6 { any; };
allow-recursion { any; };
allow-query { any; };
allow-query-cache { any; };
tkey-gssapi-keytab "/usr/local/samba/private/dns.keytab";
//tkey-gssapi-credential "DNS/wzbgpdc1.sctg.schuler.de";
//tkey-domain "SCTG.SCHULER.DE";
};
The forwarders statement points to our main enterprise DNS server,
which delegates all zones. This is needed to resolve all other IPs.
Everything else is as described in the HOWTO.
/etc/bind/named.conf.local:
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";
include "/usr/local/samba/private/named.conf";
/usr/local/samba/private/named.conf:
# This DNS configuration is for BIND 9.8.0 or later with dlz_dlopen support.
#
# This file should be included in your main BIND configuration file
#
# For example with
# include "/usr/local/samba-4-20120408/private/named.conf";
#
# This configures dynamically loadable zones (DLZ) from AD schema
#
dlz "AD DNS Zone" {
database "dlopen /usr/local/samba/lib/bind9/dlz_bind9.so";
};
and finally
/etc/resolv.conf:
domain sctg.schuler.de
search sctg.schuler.de
nameserver 127.0.0.1
Step 9 - Testing kerberos
For Kerberos I linked /etc/krb5.conf to
/usr/local/samba/private/krb5.conf:
"ln -s /usr/local/samba/private/krb5.conf /etc/krb5.conf"
krb5.conf:
[libdefaults]
default_realm = SCTG.SCHULER.DE
dns_lookup_realm = false
dns_lookup_kdc = true
Step 10 - Configure kerberos DNS dynamic updates (optional)
This was already done in Step 8.
Step 11 - Configure NTP (optional)
I think you really need NTP for a Samba4 installation with more than
one member. So in my opinion and setup, having NTP work correctly is a
must on every member server of my Samba 4 domain because Kerberos is
clock dependent.
/etc/ntp.conf:
server timeserver.your-company.whatever
driftfile /var/lib/ntp/ntp.drift
server 127.127.1.1 version 3
fudge 127.127.1.1 stratum 12
After this I continued with
https://wiki.samba.org/index.php/Samba4/samba3upgrade/HOWTO
to migrate my existing users, groups and machine accounts. I followed
the "Upgrading in Place" guide because I had tested it in my VMware
network.
I had to clean up my LDAP directory a little, because
/usr/local/samba/bin/samba-tool domain samba3upgrade complained about
various things such as duplicate uid(s); I had more than one root
account in it and things like that.
To test the upgrade I copied all Samba3 "*.tdb" files to "/tmp/tdb" and
tested the upgrade with
"./samba-tool domain samba3upgrade --dbdir=/tmp/tdb --use-xattrs=yes
--realm=sctg.schuler.de /usr/local/samba_3.4.3/lib/smb.conf"
and fixed my LDAP entries one by one.
Once I had imported all users and machines successfully I deleted my
Samba4 installation, installed it again and ran the upgrade one last
time.
All of this took roughly 4 hours of work, but that depends on your
Internet connection, the compute power of your machine and so on.
Our BDC was installed following
https://wiki.samba.org/index.php/Samba4/HOWTO without doing a provision,
because all information should be replicated to my second domain
controller.
I edited my smb.conf file to match the domain declaration on my PDC,
which was created by the samba3upgrade process.
PDC smb.conf:
# Global parameters
[global]
server role = domain controller
workgroup = SCTG
realm = sctg.schuler.de
netbios name = WZBGPDC1
passdb backend = samba4
server string = sctg ad dc1
log level = 1
domain logons = yes
wins support = yes
private dir = /usr/local/samba/private
ncalrpc dir = /usr/local/samba/var/run/ncalrpc
winbindd socket directory = /usr/local/samba/var/run/winbindd
winbindd privileged socket directory = /usr/local/samba/var/lib/winbindd_privileged
ntp signd socket directory = /usr/local/samba/var/run/ntp_signd
dns update command = /usr/local/samba/sbin/samba_dnsupdate
spn update command = /usr/local/samba/sbin/samba_spnupdate
samba kcc command = /usr/local/samba/sbin/samba_kcc
lock dir = /usr/local/samba/var/lock
state directory = /usr/local/samba/var/locks
cache directory = /usr/local/samba/var/cache
pid directory = /usr/local/samba/var/run
wins server =
[netlogon]
path = /usr/local/samba/var/locks/sysvol/sctg.schuler.de/scripts
read only = No
[sysvol]
path = /usr/local/samba/var/locks/sysvol
read only = No
On our BDC:
# Global parameters
[global]
server role = domain controller
workgroup = SCTG
realm = sctg.schuler.de
netbios name = WZBGPDC2
passdb backend = samba4
log level = 2
[netlogon]
path = /usr/local/samba-4-20120408/var/locks/sysvol/sctg.schuler.de/scripts
read only = No
[sysvol]
path = /usr/local/samba-4-20120408/var/locks/sysvol
read only = No
BDC /etc/krb5.conf :
[libdefaults]
default_realm = SCTG.SCHULER.DE
dns_lookup_realm = true
dns_lookup_kdc = true
[realms]
SCTG.SCHULER.DE = {
kdc = wzbgpdc1.sctg.schuler.de:88
admin_server = wzbgpdc1.sctg.schuler.de:749
default_domain = sctg.schuler.de
}
[domain_realm]
.sctg.schuler.de = SCTG.SCHULER.DE
sctg.schuler.de = SCTG.SCHULER.DE
BDC /etc/resolv.conf:
domain sctg.schuler.de
search sctg.schuler.de
nameserver 153.3.xxx.xxx
The nameserver entry actually points to the bind on the PDC.
I joined the BDC to the running PDC with: "samba-tool domain join
sctg.schuler.de DC -Uadministrator%<password> --realm=sctg.schuler.de -d2"
After starting Samba 4 on the BDC with
"./samba -i -M single -d2" I saw that both DCs started replicating.
When using "samba-tool drs showrepl", have a close look at the lines
starting with
"0 consecutive failure(s)". It takes some time to replicate all data
from one DC to the other.
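The showrepl check can also be scripted instead of read by eye. What follows is only a minimal sketch: the function name count_failing_links is made up for illustration, and it assumes that every replication link reports one line of the form "<N> consecutive failure(s)." as quoted above.

```sh
#!/bin/sh
# Sketch only: count_failing_links is a hypothetical helper name.
# It reads "samba-tool drs showrepl" text on stdin and prints how many
# "<N> consecutive failure(s)" lines carry a counter greater than zero.
count_failing_links() {
    awk '/consecutive failure[(]s[)]/ && ($1 + 0) > 0 { n++ }
         END { print n + 0 }'
}
```

Called as "samba-tool drs showrepl | count_failing_links", it should print 0 once every link replicates cleanly.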
To have a redundant setup (Samba 4 AND nameserver) I tried what was
suggested on the samba-technical mailing list and ran
samba_upgradedns.
This was the first thing that did not work. I saw in the log that
"dreplsrv_partition[DC=DomainDnsZones,DC=sctg,DC=schuler,DC=de] loaded"
started to replicate my DNS zones, but I got no DNS records. I must say
that I did not spend much time on solving this step. I left it broken
because I thought I needed my time to migrate more of the domain. I was
past the point of no return.
[Later I followed a thread on the technical mailing list where Daniele
and Andreas were fighting the same problem and did not solve it.
Is this right? I think so.]
Then it was time to migrate the logon scripts. I copied them from my
old netlogon share to the new netlogon share. On the first attempt I
forgot to set the right owner of the logon script and the ACLs that let
the edv group edit the file in any case. Keep in mind that it is your
job to replicate the contents of the sysvol share. I use unison
http://www.cis.upenn.edu/~bcpierce/unison/ but maybe csync is easier and
will do the same job http://www.csync.org/.
Then I created my init.d samba4 start script.
/etc/init.d/samba:
#!/bin/sh
### BEGIN INIT INFO
# Provides: samba
# Required-Start: $network $local_fs $remote_fs
# Required-Stop: $network $local_fs $remote_fs
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: start Samba daemons
### END INIT INFO
#
# Start/stops the Samba daemon (samba).
# Adapted from the Samba 3 packages.
#
SAMBAPID=/var/run/samba/samba.pid
# clear conflicting settings from the environment
unset TMPDIR
# See if the daemon and the config file are there
test -x /usr/local/samba/sbin/samba -a -r /usr/local/samba/etc/smb.conf || exit 0
. /lib/lsb/init-functions
case "$1" in
start)
log_daemon_msg "Starting Samba 4 daemon" "samba"
if ! start-stop-daemon --start --quiet --oknodo --exec \
/usr/local/samba/sbin/samba -- -D; then
log_end_msg 1
exit 1
fi
log_end_msg 0
;;
stop)
log_daemon_msg "Stopping Samba 4 daemon" "samba"
start-stop-daemon --stop --quiet --name samba $SAMBAPID
# Wait a little and remove stale PID file
sleep 1
if [ -f $SAMBAPID ] && ! ps h `cat $SAMBAPID` > /dev/null
then
# Stale PID file (samba was successfully stopped),
# remove it (should be removed by samba itself IMHO.)
rm -f $SAMBAPID
fi
log_end_msg 0
;;
restart|force-reload)
$0 stop
sleep 1
$0 start
;;
*)
echo "Usage: /etc/init.d/samba {start|stop|restart|force-reload}"
exit 1
;;
esac
exit 0
It is mostly borrowed from
http://www.bryanpopham.com/tutorials/Samba4PDCWin7WinXP.html#make with
many thanks.
After I had checked that I was able to log in to a workstation with my
(old) password and that my logon script was executed, I started to
migrate our main file server.
I updated the file server to Samba 3.6.3 at this time and got a bunch of
problems.
While my attention was focused on Samba 4 I had missed that the idmap
backend syntax and usage had changed. So I tried without success to
integrate the main file server into the domain. I stopped working on it
after a few hours, when winbind constantly failed to return my new
ADS Samba4 users and groups.
As it was really late I went home for some sleep and wanted to try
again the next day.
On the next day (08.04.12) my idea (I had no clue why I could not get
Samba3 on the file server running) was to use Samba 4 on the file
server as well. This approach was promising because after 2 hours I had
joined our file server to the ADS domain. But while I tried to migrate
my S3 smb.conf I noticed that Samba 4 does not support "ms dfs", on
which many things depend here.
So I went back to the mailing lists searching for my issue. I found
this bug: https://bugzilla.samba.org/show_bug.cgi?id=8371 which is
shown as fixed in Samba 3.6.3, but I had very similar problems. wbinfo
-p worked, wbinfo -t worked, wbinfo -i <username> worked but getent,
chown or chgrp were failing. Because I did not want to lose any more
time I went backwards in the git commit logs searching for the last
version of Samba 3 before the change of the idmap interface and started
over with Samba 3.5.13. The disadvantage is that I was not able to use
the SMB2 protocol, which really speeds up my Windows 7 clients.
With this version I was able to get all user and group information
through winbindd and use it in commands like chown.
BUT I had missed that all uids and gids had changed. So every file on
our file server showed the old uid number instead of the username it
belongs to. And even worse, all ACLs had the same problem. I found no
easy way to associate the old LDAP uids and gids with the new ADS users,
so I started to fix owner and group with this script:
#!/bin/sh
#set -x
#targetdir=/pcdaten/admin
#targetdir=/pcdaten/scan
#targetdir=/pcdaten
#targetdir=/dxm
targetdir=/dat01/simufact
#targetdir=/dat01/simufact
grouplist=`cat /root/gruppenliste.txt`
userlist=`cat /root/userliste.txt`
for item in $grouplist
do
gid=`echo $item | cut -f 2 -d :`
group=`echo $item | cut -f 1 -d :`
find $targetdir -gid $gid -print0 | xargs -0 chown --from=:$gid :"$group"
done
for item in $userlist
do
uid=`echo $item | cut -f 2 -d :`
user=`echo $item | cut -f 1 -d :`
find $targetdir -uid $uid -print0 | xargs -0 chown --from=$uid $user
done
The group list (gruppenliste.txt) was created with "getent group | cut -f
1,3 -d : >gruppenliste.txt" on a virtual clone of my old PDC. The
user list was created on the VMware machine with "getent passwd | cut -f
1,3 -d :".
gruppenliste.txt:
domuser:23002
konstr:24003
sctgcmswiki:24006
prj-m:24004
vertrieb:24005
...
userliste.txt:
domuser:23002
konstr:24003
sctgcmswiki:24006
prj-m:24004
vertrieb:24005
arbeitsvorbereitung:24007
....
Note: you can use -exec in the script, but xargs was much
faster in my case.
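Before letting such a loop run over terabytes of data it can help to preview what it would do. The following is only a sketch under assumptions: print_chown_plan is a made-up name, the mapping file uses the "name:oldid" layout shown above, and neither the names nor the target directory contain whitespace. It prints the commands instead of running them, skips lines without a numeric id (such as "..."), and adds the GNU xargs option -r so that chown is not started when find matches nothing.

```sh
#!/bin/sh
# Sketch only: print_chown_plan is a hypothetical helper name.
# Usage: print_chown_plan MAPFILE TARGETDIR
# MAPFILE holds "name:oldid" lines; nothing is executed, the planned
# find/chown pipelines are only printed.
print_chown_plan() {
    mapfile=$1
    targetdir=$2
    while IFS=: read -r name oldid; do
        # Skip blank or malformed lines (no purely numeric id).
        case $oldid in
            ''|*[!0-9]*) continue ;;
        esac
        printf 'find %s -uid %s -print0 | xargs -0 -r chown --from=%s %s\n' \
            "$targetdir" "$oldid" "$oldid" "$name"
    done < "$mapfile"
}
```

Something like "print_chown_plan /root/userliste.txt /dat01/simufact" shows one line per user; after review the output could be piped to sh.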
To get my ACLs back I used "getfacl -R -s -p /pcdaten/tebis >tebis.acl"
to read all ACLs under the subtree /pcdaten/tebis and store them in a
file, in this case tebis.acl. To replace the old uid/gid I used this
script:
sctgfs01:~/berechtigungenKorrigieren# setacls.sh tebis.acl
#!/bin/sh
#set -x
grouplist=`cat /root/berechtigungenKorrigieren/gruppenliste.txt`
userlist=`cat /root/berechtigungenKorrigieren/userliste.txt`
cp $1 $1neu.txt
for item in $grouplist
do
gid=`echo $item | cut -f 2 -d :`
group=`echo $item | cut -f 1 -d :`
sed s/"group:$gid"/"group:$group"/g $1neu.txt >$1run.txt
cp $1run.txt $1neu.txt
rm $1run.txt
done
for item in $userlist
do
uid=`echo $item | cut -f 2 -d :`
user=`echo $item | cut -f 1 -d :`
sed s/"user:$uid"/"user:$user"/g $1neu.txt >$1run.txt
cp $1run.txt $1neu.txt
rm $1run.txt
done
The script creates a new file, in this case "tebis.aclneu.txt". After
checking that everything is OK (vimdiff tebis.acl tebis.aclneu.txt), a
"setfacl --restore=tebis.aclneu.txt" sets the corrected ACLs in the
file system.
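One pitfall of replacing "user:$uid" with a plain substitution is that a short id can match the beginning of a longer one (2400 inside 24003). A possible variation, given here only as a sketch: build_acl_sed is a made-up name, the mapping files use the "name:oldid" layout shown earlier, and the names are assumed to contain no "/" or "&". It writes all substitutions into one sed script and includes the trailing ":" of the ACL entry in the pattern, so only complete ids match and the ACL dump is read once.

```sh
#!/bin/sh
# Sketch only: build_acl_sed is a hypothetical helper name.
# Usage: build_acl_sed KIND MAPFILE     (KIND is "user" or "group")
# Prints one sed substitution per "name:oldid" line of MAPFILE.
build_acl_sed() {
    kind=$1
    mapfile=$2
    while IFS=: read -r name oldid; do
        # Skip blank or malformed lines (no purely numeric id).
        case $oldid in
            ''|*[!0-9]*) continue ;;
        esac
        # The trailing ":" keeps id 2400 from matching inside 24003.
        printf 's/%s:%s:/%s:%s:/g\n' "$kind" "$oldid" "$kind" "$name"
    done < "$mapfile"
}
```

A run could then look like "build_acl_sed user userliste.txt >fix.sed; build_acl_sed group gruppenliste.txt >>fix.sed; sed -f fix.sed tebis.acl >tebis.aclneu.txt", followed by the same vimdiff check as above.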
I did this for 6 TiB and it took many, many hours.
It was again time to hurry home for a short nap :)
09.04.12
My colleague joined in and from then on we were working together to
get everything ready for 10.04.12, when the early shift starts at 04:00
o'clock. I had several file systems where I had to fix my uid/gid
information and was facing the fact that winbind stopped working all of
a sudden. The processes were running but no domain info was returned.
Stopping and starting winbindd got it working again, but I feared
that this would happen (and it does happen from time to time) while my
users work.
So I wrote a little monitoring script for winbind.
/usr/local/bin/checkwinbind.sh:
sctgfs01:~/berechtigungenKorrigieren# cat /usr/local/bin/checkwinbind.sh
#!/bin/sh
#set -x
lokal=`cat /etc/group | wc -l`
netz=`getent group | wc -l`
#echo $lokal
#echo $netz
if [ ! $netz -gt $lokal ]
then
echo "!! winbind is down !!"
date
/etc/init.d/winbind stop
sleep 3
winbindcount=`ps -ef | grep /usr/local/samba/sbin/winbind | wc -l`
while [ $winbindcount -gt 1 ]
do
ps -ef | grep /usr/local/samba/sbin/winbind | tr -s ' ' |
cut -f 2 -d ' ' | xargs kill -9
sleep 1
winbindcount=`ps -ef | grep /usr/local/samba/sbin/winbind | wc -l`
done
sleep 1
/etc/init.d/winbind start
fi
It is based on the idea that there are fewer groups in /etc/group
than in the ADS. So I compare the line counts of "getent group" and "cat
/etc/group" and assume that if "getent group" does not return more lines
than "cat /etc/group" something is wrong with winbind, and I kill it and
restart it.
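The "ps -ef | grep" pipeline in the script also counts the grep process itself, which is why its loop compares the count against 1 and why kill is handed the PID of grep as well. A possible alternative, only a sketch: pids_of_binary is a made-up name, and it assumes the daemon shows up in the process list with the full path /usr/local/samba/sbin/winbindd as its command (the script above greps for the prefix ".../winbind"). It matches the command field exactly instead of grepping the whole line.

```sh
#!/bin/sh
# Sketch only: pids_of_binary is a hypothetical helper name.
# It reads "ps -eo pid=,args=" style lines on stdin and prints the PIDs
# whose command (second field) is exactly the given binary path.
pids_of_binary() {
    awk -v bin="$1" '$2 == bin { print $1 }'
}
```

For example, "ps -eo pid=,args= | pids_of_binary /usr/local/samba/sbin/winbindd" lists only the daemon processes, so an empty result means nothing is left to kill.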
While setfacl was running I took some time to have a deeper look into
the DNS replication between my DC1 and DC2 and noticed that my
RIDManager role was, hmmm, absent. To check this I used "samba-tool fsmo
show", which returned only an error. But for me Samba 4 was working
so far. So I focused on the highest-priority topics.
On 10.04.2012 all of my users were able to log in and work. Member file
servers were included via NFSv4 (with ACLs) into the main file server
and the DFS entries were corrected. But I went deeper into my
installation, searching for "where is my RIDManager" and how I can have
redundant DNS services.
To make it short, I got really a lot of errors (more than 180) while
doing a "make quicktest" on my DC1 (Xen virtual machine), while the
same sources completed a "make quicktest" on my DC2 with "ALL OK".
On 12.04.12 I asked on IRC in #samba-technical for help with this
issue, and Mr. abartlet (thanks a lot again) helped me. First, it is
quite uncommon that the quicktest fails; if it does, something weird is
going on. It took me endless painful recompiles on both DCs to find out
the following:
I was using a 2.6.35 kernel on the Xen host and a 2.6.26-2 kernel in
my DC1 (domU). Getting a more current kernel into my domU DC1 solved a
few of the issues "make quicktest" was throwing. The real problem was
the Broadcom adapter I was using. I used TCP checksum offloading on the
network card, which does not work for me. I switched it off by using
"ethtool" in
/etc/rc.local :
/sbin/ethtool -K eth0 tx off
/sbin/ethtool -K eth0 rx off
/sbin/ethtool -K eth0 gso off
/sbin/ethtool -K eth1 tx off
/sbin/ethtool -K eth1 rx off
/sbin/ethtool -K eth1 gso off
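With more interfaces the same lines can be generated rather than typed. This is just a sketch: print_offload_cmds is a made-up name, and the feature list (tx, rx, gso) is simply the one used in the rc.local lines above. The function only prints the commands, so the result can be reviewed before it is pasted into /etc/rc.local.

```sh
#!/bin/sh
# Sketch only: print_offload_cmds is a hypothetical helper name.
# It prints, without executing anything, one "ethtool -K" line per
# interface and per offload feature (tx, rx, gso).
print_offload_cmds() {
    for nic in "$@"; do
        for feature in tx rx gso; do
            printf '/sbin/ethtool -K %s %s off\n' "$nic" "$feature"
        done
    done
}
```

"print_offload_cmds eth0 eth1" reproduces the six lines shown above.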
From then on I was able to run "make quicktest" on this machine without
any hassle. Mr. abartlet patched "samba-tool dbcheck --fix" for me to
get my RIDManager role back. And that is what I use up to date.
I am working on an idmap backend ldap with Samba version 3.6.4 to
integrate my other file servers as well and get rid of the NFS mounts,
which I don't like. I have two of my servers using an OpenLDAP server as
backend. The important lines in smb.conf are:
[global]
dos charset = ISO8859-1
unix charset = ISO8859-1
workgroup = SCTG
netbios name = wzbgpsf1
security = ads
realm = SCTG.SCHULER.DE
winbind enum users = yes
winbind enum groups = yes
winbind use default domain = no
ldap idmap suffix = ou=idmap
ldap ssl = no
idmap backend = ldap
idmap config * : range = 700001 - 800000
idmap config * : backend = tdb
idmap config sctg : backend = ldap
idmap config sctg : range = 40000-700000
idmap config sctg : ldap_url = ldap://sctgfs01.schuler.de
idmap config sctg : ldap_base_dn = ou=idmap,dc=sctg,dc=schuler,dc=de
idmap config sctg : ldap_user_dn = cn=admin,dc=sctg,dc=schuler,dc=de
template homedir = /homeu/%U
template shell = /bin/bash
wins server = 153.3.131.119
max protocol = smb2
But this needs more testing. It was not possible to get the LDAP
populated from a /usr/local/samba/var/locks/winbindd_cache.tdb file, so
I had to do it by hand.
I was using a script like this on my OpenLDAP server:
sctgfs01:~/ldap/changeid# cat change_idmap.sh
#!/bin/sh
set -x
SAMBABIN=/usr/local/samba/bin
pwdlist=`getent passwd`
userlist=`$SAMBABIN/wbinfo -u`
for user in $userlist
do
sid=`$SAMBABIN/wbinfo -n $user | cut -f 1 -d ' '`
uid=`getent passwd | grep $user: | cut -f3 -d :`
dn=`ldapsearch -LLL "(sambaSID=$sid)" -x dn >$uid.ldif`
head --lines=2 $uid.ldif >$uid-2.ldif
echo "changetype: modify" >>$uid-2.ldif
echo "replace: uidNumber" >>$uid-2.ldif
echo "uidNumber: $uid" >>$uid-2.ldif
ldapmodify -w LuckyStrice -D cn=admin,dc=sctg,dc=schuler,dc=de \
-x -f $uid-2.ldif
done
To check if user and groups match my fileserver i use this script :
wzbgpsf1:~# cat checkidmap.sh
#!/bin/sh
set -x
mkdir -p /dat01/simufact/tht/test/user/
userlist=`wbinfo -u`
for user in $userlist
do
touch /dat01/simufact/tht/test/user/$user
chown $user /dat01/simufact/tht/test/user/$user
done
mkdir -p /dat01/simufact/tht/test/gruppen/
grouplist=`wbinfo -g`
for group in $grouplist
do
touch /dat01/simufact/tht/test/gruppen/$group
chgrp $group /dat01/simufact/tht/test/gruppen/$group
done
I mount the file systems I have created and chown/chgrp the files via
NFS on my file server and check whether the filename matches the
owner/group of the file.
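The final comparison of filename and owner can be automated as well. The following is only a sketch: report_mismatches is a made-up name, and it expects tab-separated "owner<TAB>filename" pairs, which GNU find can produce with -printf '%u\t%f\n' (or '%g\t%f\n' for groups).

```sh
#!/bin/sh
# Sketch only: report_mismatches is a hypothetical helper name.
# It reads tab-separated "owner<TAB>filename" pairs on stdin and prints
# every file whose owner differs from its name (compared as strings).
report_mismatches() {
    awk -F '\t' '($1 "") != ($2 "") { print $2 " is owned by " $1 }'
}
```

For the user test directory this could be called as "find /dat01/simufact/tht/test/user -type f -printf '%u\t%f\n' | report_mismatches"; no output means every file is owned by the user it is named after.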
When I am sure that the idmap in OpenLDAP works I will upgrade all
member file servers to this configuration and Samba version.
I will update my Samba 4 domain controllers as soon as DNS replication
is included or "Group Policy Preferences" are available.
I hope this post will help someone who is migrating to Samba 4. If
anyone has further questions about this post, contact me on IRC; my
nick is ttv.
I want to thank the Samba team for their good work and for having an
open ear for a normal administrator.
Thanks to abartlet for his help.
Cheers all,
have a good time.
--
Mit freundlichen Grüßen · Best regards
*Dipl.-Ing. (FH) Thorsten Trautwein-Veit*
/Leitung EDV · IT-Manager/