socket-wrapper fd-passing

Anoop C S anoopcs at
Mon Jun 29 13:55:38 UTC 2020

On Wed, 2020-06-24 at 23:14 +0200, Stefan Metzmacher via samba-
technical wrote:
> Hi Anoop,
> I rebased your fd-passing patches on top of socket_wrapper master.
> See

Thanks and sorry for my late reply.

See my comments below:
(WIP branch:;a=shortlog;h=refs/heads/fd-passing-final

> Please also notice my commit on top where I added some more hints
> on a better design:
>   - We need to maintain a small file using mmap and protected
>     by pthread robust mutexes. E.g. one file per local ip address.

My current set of patches operate out of a single file(as db) shared
using mmap() protected by pthread robust mutexes. Do you foresee any
complications with just one file as common db?

>   - The path specified in SOCKET_WRAPPER_FD_PASSING_DB will
>     be used as the file name, if this is not specified we'll
>     use malloc'ed and fd-passing is not enabled.

As of now I have a hard coded db name SOCKET_INFO_DB defined as
/tmp/socket-info.db to hold socket_info structures, free list pointers
and mutex. But the fall back mechanism is not yet present.

>   - The file contains a header (with magic, unique id, size and
> free-list pointer)
>     followed by an array of socket_info structures.

More or less similar format has been used so far. But remember,
version/magic, size etc are still marked as TODO.

>   - The socket_info_fd structures will only maintain the index
>     into the mmap'ed array.

socket_info_fd is now an array of integer pointers containing indexes
to mmaped area of socket_info structures.

>   - fd-passing is limited to fixed number (~ 127), this should be
>     more than enough for typical caller (Samba would just use 1).


>   - In order to do fd-passing of tcp/udp sockets, we'll create
>     a pipe (or similar) where we write an array of with indexes
>     into the mmap'ed array into the write end of the pipe.
>     We would also pass the device/inode and a unique identifier
>     for the file.
>     The read end of the pipe is then passed as the last fd to
>     the destination process. The destination process can rebuild
>     the socket_info_fd structures by reading the array indexes.
>     out of the read end of the pipe.
>   - A tricky part will be the reference counting in the database
>     entries. The sender needs to write the data into the pipe
>     and increment the refcounts in the db file before calling
>     sendmsg(). The sender may hold a mutex for each socket
>     during sendmsg().

I hope this is based on your suggestion to have one file per ip
address. My WIP branch contains changes based on a similar design from;a=blob;f=TODO.fd-passing;h=95f5107c0601720d9a6df81159976363adf7b52b;hb=refs/heads/fd-passing-final

final implementation of real fd-passing is still at an infant stage :-/

>   - In order to allow multiple threads (or processes) to share a
> single
>     socket we need to add mutex protection in quite a few places.

yes..yes, we have decent amount of pthread mutexes now :-)

>     In the most common cases there won't be any contention on the
> mutexes,
>     but it will garantee correctness for the corner cases which
> happens
>     for fd-passing.


In short, I could successfully run `make test` with changes to work out
of a common shared file. Still there is one
test(test_thread_echo_tcp_sendmsg_recvmsg, which I added) failing
almost consistently on my local system and not on GitLab CI. I have
been trying to root cause this failure for quite sometime now(as and
when time allows me) without any luck in fixing it. I would love to see
another pair of eyes reviewing the changes.

> An additional idea would be using temporary anonymous files (maybe
> memfd_create() or an similation for it) for shared structures for
> passed
> sockets. In sendmsg() the sender would move the
> socket_info[_container]
> structures from malloc'ed memory to an temporary memory file.
> This will replace the pipe fd of the original design.
> If multiple fd's are passed, the memory file contains an array of
> socket_info[_container] structures.
> int *socket_fds_idx would be changed to an array of structures
> or we have an additional array to store possible fd for the temporary
> files and have the destructing code lock at it and select between
> free() and munmap()/close().
> This design would not require a named file, like the one specified

Hm.. this is new to me. I will think about it.

> Do you have comments on these additions?
> Thanks!
> metze

More information about the samba-technical mailing list