[Samba] Possible memory leak in

Andrew Bartlett abartlet at samba.org
Sun Sep 12 20:06:00 UTC 2021


On Sun, 2021-09-12 at 12:48 -0700, Matt Oursbourn via samba wrote:
> What I have noticed from looking at the Processes tab in System
> Monitor on the server:
>
> The server is on a 10GB network.  When I start a chia plot (~106Gb
> file) transfer from a client computer that is also connected to the
> 10Gb network the transfer starts off at ~1000Mb/s until the samba
> process is larger than 32.5GiB. The largest I have seen is 39GiB.
> The transfer rate then drops down to the hdd write speed of
> ~170Mb/s.  That samba process never gets any smaller than 32.5GiB.
> Even after the transfer is complete.

So one thing I would point out is that in unix memory management, the
typical behaviour of malloc() is to be built on a low-level OS
primitive called 'brk'.  This is a high-watermark allocated by the
kernel for the heap, the place where memory returned from malloc()
lives.

Once that memory is touched, it generally stays owned by the process
until process death.  free() typically only returns it to the
allocator's pool, not to the OS.
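
To illustrate the high-watermark behaviour, here is a minimal sketch in
Python that calls the C library's malloc(), free() and sbrk() through
ctypes.  It assumes Linux with glibc; the names heap_break and
watermark_demo are made up for this example and none of this is Samba
code.  It allocates many small blocks, frees all but the most recently
allocated one, and reports the program break at each stage.  On glibc the
surviving block typically sits near the top of the heap, so the break
should not move back down.  Note that glibc can hand memory back in some
cases (very large requests are served by mmap(), and a large free region
at the very top of the heap can be trimmed), so this shows the typical
case rather than a guarantee.

```python
import ctypes

# Load the C library already linked into this process (Linux/glibc assumed).
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]
libc.sbrk.restype = ctypes.c_void_p
libc.sbrk.argtypes = [ctypes.c_ssize_t]


def heap_break():
    """Return the current program break (top of the brk heap) as an integer."""
    return libc.sbrk(0) or 0


def watermark_demo(count=20000, size=1024):
    """Allocate `count` small blocks, free all but the last, report the break.

    Hypothetical helper for illustration only.
    """
    # Fixed-size pointer array, created before the first measurement.
    blocks = (ctypes.c_void_p * count)()
    before = heap_break()
    for i in range(count):
        blocks[i] = libc.malloc(size)
    after_alloc = heap_break()
    # Free everything except the most recently allocated block.
    for i in range(count - 1):
        libc.free(blocks[i])
    after_free = heap_break()
    # Clean up the last block once the measurements are taken.
    libc.free(blocks[count - 1])
    return {"before": before, "after_alloc": after_alloc, "after_free": after_free}
```

Calling watermark_demo() and comparing the 'after_alloc' and 'after_free'
values shows whether the break moved after the blocks were freed.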

So while I'm not saying everything is normal (I would have thought
Samba should not be holding GBs of file data in memory waiting to write
to the disk), that is why the process size won't ever shrink, even if
we do a proper cleanup.

However, what will shrink is the report from 'smbcontrol pool-usage',
because our internal tracking will show that the memory is no longer
allocated.

What does this all mean?

It means that the diff of that output between when the write is
intensive and later, when the process is large but the write is
concluded, would show what memory we were holding at write time, which
might be excessive.
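
A minimal sketch of that comparison in Python, assuming the two reports
have already been saved as text (for example by redirecting the
smbcontrol output to a file while the transfer is running, and again
after it finishes).  The function name diff_reports and the labels
"during-write" and "after-write" are hypothetical, and the code assumes
nothing about the report format beyond it being line-oriented text.

```python
import difflib


def diff_reports(during, after):
    """Return unified-diff lines between two saved text reports.

    `during` is the report captured while the write was intensive,
    `after` the one captured once the write had concluded.
    """
    return list(difflib.unified_diff(
        during.splitlines(),
        after.splitlines(),
        fromfile="during-write",
        tofile="after-write",
        lineterm="",
    ))
```

Lines prefixed with '-' appear only in the report taken during the
write, so they are the ones that point at memory held at write time.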

I don't work on the file server, but perhaps we should throttle
somehow?  We used to be pretty synchronous with our write() path, but I
understand this now uses async workers.

Given the lower 'leak' result with the slower NIC, the mismatch between
the network speed and the disk speed seems the likely trigger.
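
To make the speed mismatch concrete, here is a toy model in Python.  It
is not how smbd works; it only assumes that data arrives at a fixed
rate, is buffered in memory, and is drained at a slower fixed rate, in
one-second ticks.  The function name peak_backlog and the 'cap'
parameter (a hypothetical limit on buffered data, i.e. one simple form
of throttling) are inventions for this sketch.

```python
def peak_backlog(total, rate_in, rate_out, cap=None):
    """Return the peak amount buffered while moving `total` units.

    rate_in  -- units received per tick (the network side)
    rate_out -- units written per tick (the disk side)
    cap      -- optional limit on buffered units; the receiver stalls
                instead of exceeding it (hypothetical throttle)
    """
    if rate_in <= 0 or rate_out <= 0:
        raise ValueError("rates must be positive")
    if cap is not None and cap <= 0:
        raise ValueError("cap must be positive")
    received = buffered = peak = 0
    while received < total or buffered > 0:
        if received < total:
            # Without a cap we accept a full tick of data; with a cap we
            # accept only what still fits in the buffer.
            room = rate_in if cap is None else min(rate_in, cap - buffered)
            take = min(room, total - received)
            received += take
            buffered += take
        peak = max(peak, buffered)
        # The disk drains at most rate_out units per tick.
        buffered -= min(buffered, rate_out)
    return peak
```

With no cap the backlog in this model grows by roughly rate_in minus
rate_out per tick for as long as data keeps arriving, and with a cap it
never exceeds the cap, at the cost of the sender being slowed to the
disk rate once the buffer is full.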

One more thing: We can probably eliminate the newer io_uring code as
that isn't in 4.11 where you first saw this.

Andrew Bartlett

-- 
Andrew Bartlett (he/him)       https://samba.org/~abartlet/
Samba Team Member (since 2001) https://samba.org
Samba Team Lead, Catalyst IT   https://catalyst.net.nz/services/samba

Samba Development and Support, Catalyst IT - Expert Open Source
Solutions



