[Samba] Very slow copy for small files from Win 10 to Samba server

Campbell McLeay cmcl at dneg.com
Fri Jul 30 11:51:06 UTC 2021


Hi,

I've found a problem whereby copying many small files from a Windows
10 client to a Samba server. The test case is 1753 files with a size
of 17k on average, total: 29MB. The time taken is around 5 minutes,
whereas from a Linux SMB (or NFS) client the speed is roughly 14
seconds, which is the same time a Windows 7 client can do the copy in.
This issue appears only when writing to the Samba share: copying the
same set of files from the server to the Win 10 client run at the
expected speed of around 14 seconds, so reads are fine. The Samba
server and all the clients use AD as an auth provider, and are joined
to the domain. The server config:

[global]
  realm = AD.COMPANY.COM
  server string = Samba Server Version %v
  workgroup = COMPANY
  debug class = Yes
  debug pid = Yes
  debug uid = Yes
  log file = /var/log/samba/log.%I
  max log size = 0
  load printers = No
  kerberos method = secrets and keytab
  map untrusted to domain = Yes
  ntlm auth = Yes
  security = ADS
  winbind enum groups = Yes
  winbind enum users = Yes
  winbind nss info = rfc2307
  winbind refresh tickets = Yes
  winbind separator = +
  idmap config * :range = 0-200000
  idmap config * : backend = tdb
  include = /etc/samba/smb.conf.client-10.21.60.80
  force create mode = 0777
  force directory mode = 0777


[user_data]
  comment = user_data
  path = /user_data
  read only = No

There is no sign of TX/RX packet loss or other issues on any of the
network paths on the hosts tested.

During the slow copies, I noticed that the thread of the smbd process
on the server that was handling the copy was at or close to 100% CPU
usage. So I ran a perf on that smbd process and found a very high
amount of kerberos library calls. Even when Windows Explorer reported
the file copy as complete, the thread would still be at 100% CPU usage
for some time after, and packets would still be flowing between the
client and server. I enabled level 10 debug on Samba on the server for
the client, and found what I think could be the problem. The Win 10
client appears to try to authenticate as the machine name rather than
the user logged into the machine. Initially, it reports in the log:

  Found account name from PAC: WKS1241$ []
[2021/07/15 10:46:24.967488, 10, pid=17789, effective(0, 0), real(0,
0)] ../auth/kerberos/kerberos_pac.c:411(kerberos_decode_pac)
  Successfully validated Kerberos PAC

But then reports:

[2021/07/15 10:46:24.968308,  3, pid=17789, effective(0, 0), real(0,
0), class=auth]
../source3/auth/user_krb5.c:51(get_user_from_kerberos_info)
  Kerberos ticket principal name is [WKS1241$@AD.COMPANY.COM]
[2021/07/15 10:46:24.968320, 10, pid=17789, effective(0, 0), real(0,
0), class=auth]
../source3/auth/user_krb5.c:83(get_user_from_kerberos_info)
  Domain is [COMPANY] (using PAC)
[2021/07/15 10:46:24.968332,  5, pid=17789, effective(0, 0), real(0,
0)] ../source3/lib/username.c:181(Get_Pwnam_alloc)
  Finding user COMPANY+WKS1241$
[2021/07/15 10:46:24.968339,  5, pid=17789, effective(0, 0), real(0,
0)] ../source3/lib/username.c:120(Get_Pwnam_internals)
  Trying _Get_Pwnam(), username as lowercase is company+wks1241$
[2021/07/15 10:46:24.968741,  5, pid=17789, effective(0, 0), real(0,
0)] ../source3/lib/username.c:128(Get_Pwnam_internals)
  Trying _Get_Pwnam(), username as given is COMPANY+WKS1241$
[2021/07/15 10:46:24.969013,  5, pid=17789, effective(0, 0), real(0,
0)] ../source3/lib/username.c:153(Get_Pwnam_internals)
  Checking combinations of 0 uppercase letters in company+wks1241$
[2021/07/15 10:46:24.969036,  5, pid=17789, effective(0, 0), real(0,
0)] ../source3/lib/username.c:159(Get_Pwnam_internals)
  Get_Pwnam_internals didn't find user [COMPANY+WKS1241$]!
[2021/07/15 10:46:24.969047,  5, pid=17789, effective(0, 0), real(0,
0)] ../source3/lib/username.c:181(Get_Pwnam_alloc)
  Finding user WKS1241$
[2021/07/15 10:46:24.969055,  5, pid=17789, effective(0, 0), real(0,
0)] ../source3/lib/username.c:120(Get_Pwnam_internals)
  Trying _Get_Pwnam(), username as lowercase is wks1241$
[2021/07/15 10:46:24.969238,  5, pid=17789, effective(0, 0), real(0,
0)] ../source3/lib/username.c:128(Get_Pwnam_internals)
  Trying _Get_Pwnam(), username as given is WKS1241$
[2021/07/15 10:46:24.969360,  5, pid=17789, effective(0, 0), real(0,
0)] ../source3/lib/username.c:153(Get_Pwnam_internals)
  Checking combinations of 0 uppercase letters in wks1241$
[2021/07/15 10:46:24.969442,  5, pid=17789, effective(0, 0), real(0,
0)] ../source3/lib/username.c:159(Get_Pwnam_internals)
  Get_Pwnam_internals didn't find user [WKS1241$]!
[2021/07/15 10:46:24.969471,  3, pid=17789, effective(0, 0), real(0,
0), class=auth]
../source3/auth/user_krb5.c:164(get_user_from_kerberos_info)
  get_user_from_kerberos_info: Username COMPANY+WKS1241$ is invalid on
this system
[2021/07/15 10:46:24.969486,  3, pid=17789, effective(0, 0), real(0,
0)] ../source3/auth/auth_generic.c:145(auth3_generate_session_info_pac)
  auth3_generate_session_info_pac: Failed to map kerberos principal to
system user (NT_STATUS_LOGON_FAILURE)
[2021/07/15 10:46:24.969550,  3, pid=17789, effective(0, 0), real(0,
0)] ../source3/smbd/smb2_server.c:3097(smbd_smb2_request_error_ex)
  smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1]
status[NT_STATUS_ACCESS_DENIED] || at
../source3/smbd/smb2_sesssetup.c:134
[2021/07/15 10:46:24.969569, 10, pid=17789, effective(0, 0), real(0,
0)] ../source3/smbd/smb2_server.c:2988(smbd_smb2_request_done_ex)
  smbd_smb2_request_done_ex: idx[1] status[NT_STATUS_ACCESS_DENIED]
body[8] dyn[yes:1] at ../source3/smbd/smb2_server.c:3145

Whereas when the client is Windows 7, the log reports:

Found account name from PAC: joeb [Joe Bloggs]

[2021/07/15 11:25:18.414382,  3, pid=18758, effective(0, 0), real(0,
0), class=auth]
../source3/auth/user_krb5.c:51(get_user_from_kerberos_info)
  Kerberos ticket principal name is [joeb at AD.COMPANY.COM]
[2021/07/15 11:25:18.414398, 10, pid=18758, effective(0, 0), real(0,
0), class=auth]
../source3/auth/user_krb5.c:83(get_user_from_kerberos_info)
  Domain is [COMPANY] (using PAC)
[2021/07/15 11:25:18.414421,  5, pid=18758, effective(0, 0), real(0,
0)] ../source3/lib/username.c:181(Get_Pwnam_alloc)
  Finding user COMPANY+joeb
[2021/07/15 11:25:18.414429,  5, pid=18758, effective(0, 0), real(0,
0)] ../source3/lib/username.c:120(Get_Pwnam_internals)
  Trying _Get_Pwnam(), username as lowercase is company+joeb
[2021/07/15 11:25:18.414754,  5, pid=18758, effective(0, 0), real(0,
0)] ../source3/lib/username.c:128(Get_Pwnam_internals)
  Trying _Get_Pwnam(), username as given is COMPANY+joeb
[2021/07/15 11:25:18.415176,  5, pid=18758, effective(0, 0), real(0,
0)] ../source3/lib/username.c:141(Get_Pwnam_internals)
  Trying _Get_Pwnam(), username as uppercase is COMPANY+JOEB
[2021/07/15 11:25:18.415487,  5, pid=18758, effective(0, 0), real(0,
0)] ../source3/lib/username.c:153(Get_Pwnam_internals)
  Checking combinations of 0 uppercase letters in company+joeb
[2021/07/15 11:25:18.415533,  5, pid=18758, effective(0, 0), real(0,
0)] ../source3/lib/username.c:159(Get_Pwnam_internals)
  Get_Pwnam_internals didn't find user [COMPANY+joeb]!
[2021/07/15 11:25:18.415556,  5, pid=18758, effective(0, 0), real(0,
0)] ../source3/lib/username.c:181(Get_Pwnam_alloc)
  Finding user joeb
[2021/07/15 11:25:18.415572,  5, pid=18758, effective(0, 0), real(0,
0)] ../source3/lib/username.c:120(Get_Pwnam_internals)
  Trying _Get_Pwnam(), username as lowercase is joeb
[2021/07/15 11:25:18.415597,  5, pid=18758, effective(0, 0), real(0,
0)] ../source3/lib/username.c:159(Get_Pwnam_internals)
  Get_Pwnam_internals did find user [joeb]!

Throughout the debug logs for the Win 10 client, there are thousands
of the following messages

    Get_Pwnam_internals didn't find user
    NT_STATUS_ACCESS_DENIED
    NT_STATUS_MORE_PROCESSING_REQUIRED
    NT_STATUS_LOGON_FAILURE

which are largely absent (apart from maybe 1 or 2) from the logs of
the Win 7 client. There are also thousands of 'Starting GENSEC
submechanism gse_krb5' messages (again, absent in the Win 7 client
logs), which I guess accounts for the high cpu usage reported by perf
for kerberos library calls.

As a test, I disconnected the share and reconnected to it, selecting
'connect as a different user', to see if I could force the copy to use
my credentials somehow, but the result is the same slow speeds. When I
connect to a Samba server that uses a Samba config that has 'security
= USER' rather than 'security = ADS' it doesn't show the same issue.
It isn't asking for a username and password, and there is no local
user tdb, so I assume it is using AD somehow (though the servers that
have 'security = USER' are not joined to the domain), but this isn't
showing in the log, and PAC isn't mentioned.

The copy does succeed eventually, and the permissions look ok, so
whatever auth failures it is having seem to be overcome, perhaps it is
falling through to some other mechanism?

I've done some web searching to see if I can find why this is
occurring but haven't succeeded. It might well be some behaviour
Microsoft has changed, but I thought I'd start here first since the
issue doesn't arise on Windows->Windows copy.

Thanks for any pointers,

Cam



More information about the samba mailing list