[Samba] Memory Leak in rpcd_spoolss?

Wilko Meyer wilko.meyer at iserv.eu
Tue Sep 20 19:35:21 UTC 2022


Hi,

I may've observed a memory leak condition affecting the rpcd_spoolss processes, leading them to cause
OOM-Events on one of the systems I maintain.

The used Samba version is 4.16.1 (4.16.1+dfsg-8~bpo11+1) and the Kernel version is 4.19.0-17-amd64.

~ # awk '/oom_reaper/' /var/log/syslog |  grep -o -P '(?<=\().*(?=\))' | sort | uniq -c
     28 rpcd_spoolss
     
So far there have been 28 OOM-Events in total where the rpcd_spoolss process got terminated by the oom_reaper. 

As processes being terminated by the oom_reaper doesn't necessarily have to be the processes causing the 
OOM situation, I cross-checked this observation by evaluation the top 3 processes consuming most resident 
memory using atopsar.

I choose a timeframe between the last two OOM-Events (which were about 2h apart) as we only store atop 
metrics in a 10m interval, which would've been too narrow (mostly less than 10-20m between each 
OOM-Event for the other Events) to observe the  steadily increase in memory allocation throughout 
the day:

~ # atopsar -b 1500 -e 1700 -G -r /var/log/atop/atop_20220920

iserv  4.19.0-17-amd64  #1 SMP Debian 4.19.194-3 (2021-07-18)  x86_64  2022/09/20

-------------------------- analysis date: 2022/09/20 --------------------------

15:00:02    pid command  mem% |   pid command  mem% |   pid command  mem%_top3_
15:10:02   9479 avast      4% | 19187 rpcd_spo   3% | 31418 squid      3%
15:20:02   9479 avast      4% | 19187 rpcd_spo   4% | 19184 rpcd_spo   4%
15:30:02  19187 rpcd_spo   5% | 19184 rpcd_spo   4% | 19185 rpcd_spo   4%
15:40:02  18671 rpcd_spo   5% | 19185 rpcd_spo   5% | 19187 rpcd_spo   5%
15:50:02  19185 rpcd_spo   7% | 18671 rpcd_spo   6% | 19187 rpcd_spo   6%
16:00:02  19185 rpcd_spo   8% | 19184 rpcd_spo   6% | 19186 rpcd_spo   6%
16:10:02  19185 rpcd_spo   8% | 19184 rpcd_spo   6% | 19186 rpcd_spo   6%
16:20:02  19185 rpcd_spo   8% | 19184 rpcd_spo   7% | 19186 rpcd_spo   6%
16:30:02  19185 rpcd_spo   9% | 19184 rpcd_spo   8% | 19186 rpcd_spo   7%
16:40:02  19185 rpcd_spo  10% | 19184 rpcd_spo   9% | 19186 rpcd_spo   8%
16:50:02  19185 rpcd_spo  11% | 19184 rpcd_spo  10% | 19186 rpcd_spo   9%

It can be observed that the rpcd_spoolss processes are steadily allocating
memory without seemingly freeing it again.

Unfortunately the debug level of rpcd_spoolss is set to the default level 1:

~ # ps aux | grep spoolss
root     15133 29.4  1.4 531056 245720 ?       S    20:06   3:23 /usr/libexec/samba/rpcd_spoolss --configfile=/etc/samba/smb.conf --worker-group=1 --worker-index=2 --debuglevel=1
root     18671  7.2  9.4 1838952 1545112 ?     S    14:35  24:49 /usr/libexec/samba/rpcd_spoolss --configfile=/etc/samba/smb.conf --worker-group=1 --worker-index=0 --debuglevel=1
root     19184  4.8 10.0 1947312 1652344 ?     S    14:35  16:39 /usr/libexec/samba/rpcd_spoolss --configfile=/etc/samba/smb.conf --worker-group=1 --worker-index=1 --debuglevel=1
root     19186  5.1  9.6 1878384 1587856 ?     S    14:35  17:38 /usr/libexec/samba/rpcd_spoolss --configfile=/etc/samba/smb.conf --worker-group=1 --worker-index=3 --debuglevel=1
root     19187  4.9  8.4 1684464 1390200 ?     S    14:35  16:57 /usr/libexec/samba/rpcd_spoolss --configfile=/etc/samba/smb.conf --worker-group=1 --worker-index=4 --debuglevel=1

which means I do only have access to log messages in log.rpcd_spoolss with limited verbosity.

A quick parse of the rpcd_spoolss debug log seems to imply, that the delete_and_reload_printers function
as well as printer_list_get_last_refresh are logging most frequently to the debug log (-n+6304308 as a offset
in tail to only consider todays log entries as l6304308 is the first log entry of this day):

~ # tail -n+6304308 /var/log/samba/log.rpcd_spoolss | grep -o -P '(?<=\.\./).*' | sort | uniq -c | sort -n   
      1 ../librpc/ndr/ndr.c:1150
      1 ../librpc/ndr/ndr.c:630(_ndr_pull_error)
      2 ../source3/lib/smbrun.c:102(smbrun_internal)
      3 ../source3/rpc_client/cli_winreg_spoolss.c:762(winreg_create_printer)
      7 ../source3/rpc_client/cli_winreg_spoolss.c:227(winreg_printer_openkey)
     34 ../source3/rpc_server/spoolss/srv_spoolss_util.c:57(winreg_printer_binding_handle)
     68 ../source3/printing/print_cups.c:1100(cups_job_submit)
    530 ../source3/rpc_server/rpc_worker.c:1105(rpc_worker_main)
   7568 ../librpc/rpc/dcesrv_handles.c:130(dcesrv_handle_lookup)
  25800 ../source3/printing/printer_list.c:255(printer_list_get_last_refresh)
  25800 ../source3/smbd/server_reload.c:73(delete_and_reload_printers)

The exact log messages are:

~ # tail -n+6304308 /var/log/samba/log.rpcd_spoolss | grep -A1 "printer_list_get_last_refresh" | grep -v 'printer_list_get_last_refresh\|--' | sort | uniq -c  
  25800   Failed to fetch record!
~ # tail -n+6304308 /var/log/samba/log.rpcd_spoolss | grep -A1 "delete_and_reload_printers" | grep -v 'delete_and_reload_printers\|--' | sort | uniq -c
  25800   pcap cache not loaded
  
However I am unsure wether the log messages point us into the right direction, as a malfunctioning query as well as a debug
information that a cache hasn't be loaded is rather unlikely a cause of a Memory Leak. Especially when checking this against
the log.rpcd_spoolss of a server without the memory leak behaviour:

~ # grep -o -P '(?<=\.\./).*' /var/log/samba/log.rpcd_spoolss | sort | uniq -c | sort -n     
      1 ../librpc/ndr/ndr.c:1150
      1 ../librpc/ndr/ndr.c:630(_ndr_pull_error)
      1 ../librpc/rpc/dcesrv_auth.c:371(dcesrv_auth_complete)
   1260 ../source3/rpc_server/rpc_worker.c:1105(rpc_worker_main)
   4158 ../source3/printing/printer_list.c:255(printer_list_get_last_refresh)
   4158 ../source3/smbd/server_reload.c:73(delete_and_reload_printers)
   6930 ../librpc/rpc/dcesrv_handles.c:130(dcesrv_handle_lookup)

I was able to observe in samba-ad-dc.services journal, that rpc_host_distribute_clients seems to allocate new clients to 
the rpcd_spoolss as well as rpcd_winreg processes quite frequently within an interval of 5 minutes:

~ # journalctl -u samba-ad-dc.service --since "5 minutes ago" --g="rpc_host_distribute_clients" -o json | jq '.MESSAGE' | sort | uniq -c | sort -n
      1 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_classic to 2524 with 0 clients"
      1 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 26 clients"
      1 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19184 with 26 clients"
      1 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19184 with 27 clients"
      1 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19186 with 25 clients"
      1 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19187 with 25 clients"
      1 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19187 with 26 clients"
      2 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 25 clients"
      2 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 18671 with 26 clients"
      2 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 18671 with 27 clients"
      2 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19184 with 25 clients"
      2 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19186 with 23 clients"
      2 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19186 with 26 clients"
      3 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 18671 with 25 clients"
      3 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19187 with 24 clients"
      5 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_winreg to 1234 with 0 clients"
      6 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19186 with 24 clients"
     10 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 24 clients"
     13 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19186 with 22 clients"
     22 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19184 with 24 clients"
     39 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 18671 with 24 clients"
     45 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19184 with 22 clients"
     53 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19187 with 23 clients"
     77 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 18671 with 22 clients"
     77 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_winreg to 20226 with 0 clients"
     98 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 23 clients"
    109 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 19184 with 23 clients"
    113 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 18671 with 23 clients"
    150 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 14 clients"
    155 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 13 clients"
    162 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 18671 with 21 clients"
    170 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 22 clients"
    179 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 15 clients"
    256 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 16 clients"
    276 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 21 clients"
    323 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 19 clients"
    346 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 20 clients"
    360 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_winreg to 13068 with 0 clients"
    366 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 18 clients"
    392 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_spoolss to 15133 with 17 clients"
   1371 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_winreg to 11438 with 0 clients"
  10316 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_winreg to 18710 with 0 clients"
  
Whereas on the, without a known OOM-Event, comparsion server with a similiar number of users 
and printers there are hardly as many calls of rpc_host_distribute_clients:

~ # journalctl -u samba-ad-dc.service --since "1 day ago" --g="rpc_host_distribute_clients" -o json | jq '.MESSAGE' | sort | uniq -c
      1 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_classic to 593335 with 0 clients"
      2 "rpc_host_distribute_clients: Sending new client /usr/libexec/samba/rpcd_winreg to 593350 with 0 clients"

Could this be caused by a flood of DCERPC bind packets as rpc_host_endpoint_accept_got_bind would call
rpc_host_distribute_clients upon receiving those? The seemingly odd "with 0 clients" message could be caused
by malformed DCERPC packets then? I definitely should observe the network traffic regarding that over time.

I still got reading into the source of rpcd_spoolss as well as rpcd_winreg on my agenda, as well as analyzing what exactly triggers
rpc_host_distribute_clients to get called without actually sending clients to the rpcd_winreg/rpcd_spoolss (the `with 0 clients`
log messages seem odd to me), so I'd hopefully be able to provide more context on what may've caused the memory leak condition.

For now, I'd appreciate pointers into the right direction in terms of debugging the root cause of this, if this behaviour/the 
memory leak cause of the rpcd_spoolss service has been observed before.

Regards,

Wilko Meyer
IServ GmbH



More information about the samba mailing list