socket-wrapper fd-passing
Anoop C S
anoopcs at samba.org
Mon Jun 29 13:55:38 UTC 2020
On Wed, 2020-06-24 at 23:14 +0200, Stefan Metzmacher via samba-
technical wrote:
> Hi Anoop,
>
> I rebased your fd-passing patches on top of socket_wrapper master.
>
> See https://gitlab.com/metze/socket_wrapper/-/commits/fd-passing/
Thanks and sorry for my late reply.
See my comments below:
(WIP branch:
https://git.samba.org/?p=anoopcs/socket_wrapper.git;a=shortlog;h=refs/heads/fd-passing-final
)
> Please also notice my commit on top where I added some more hints
> on a better design:
>
> - We need to maintain a small file using mmap and protected
> by pthread robust mutexes. E.g. one file per local ip address.
My current set of patches operate out of a single file(as db) shared
using mmap() protected by pthread robust mutexes. Do you foresee any
complications with just one file as common db?
> - The path specified in SOCKET_WRAPPER_FD_PASSING_DB will
> be used as the file name, if this is not specified we'll
> use malloc'ed and fd-passing is not enabled.
As of now I have a hard coded db name SOCKET_INFO_DB defined as
/tmp/socket-info.db to hold socket_info structures, free list pointers
and mutex. But the fall back mechanism is not yet present.
> - The file contains a header (with magic, unique id, size and
> free-list pointer)
> followed by an array of socket_info structures.
More or less similar format has been used so far. But remember,
version/magic, size etc are still marked as TODO.
> - The socket_info_fd structures will only maintain the index
> into the mmap'ed array.
socket_info_fd is now an array of integer pointers containing indexes
to mmaped area of socket_info structures.
> - fd-passing is limited to fixed number (~ 127), this should be
> more than enough for typical caller (Samba would just use 1).
Ok.
> - In order to do fd-passing of tcp/udp sockets, we'll create
> a pipe (or similar) where we write an array of with indexes
> into the mmap'ed array into the write end of the pipe.
> We would also pass the device/inode and a unique identifier
> for the file.
> The read end of the pipe is then passed as the last fd to
> the destination process. The destination process can rebuild
> the socket_info_fd structures by reading the array indexes.
> out of the read end of the pipe.
> - A tricky part will be the reference counting in the database
> entries. The sender needs to write the data into the pipe
> and increment the refcounts in the db file before calling
> sendmsg(). The sender may hold a mutex for each socket
> during sendmsg().
I hope this is based on your suggestion to have one file per ip
address. My WIP branch contains changes based on a similar design from
https://git.samba.org/?p=anoopcs/socket_wrapper.git;a=blob;f=TODO.fd-passing;h=95f5107c0601720d9a6df81159976363adf7b52b;hb=refs/heads/fd-passing-final
final implementation of real fd-passing is still at an infant stage :-/
> - In order to allow multiple threads (or processes) to share a
> single
> socket we need to add mutex protection in quite a few places.
yes..yes, we have decent amount of pthread mutexes now :-)
> In the most common cases there won't be any contention on the
> mutexes,
> but it will garantee correctness for the corner cases which
> happens
> for fd-passing.
Sure.
In short, I could successfully run `make test` with changes to work out
of a common shared file. Still there is one
test(test_thread_echo_tcp_sendmsg_recvmsg, which I added) failing
almost consistently on my local system and not on GitLab CI. I have
been trying to root cause this failure for quite sometime now(as and
when time allows me) without any luck in fixing it. I would love to see
another pair of eyes reviewing the changes.
> An additional idea would be using temporary anonymous files (maybe
> memfd_create() or an similation for it) for shared structures for
> passed
> sockets. In sendmsg() the sender would move the
> socket_info[_container]
> structures from malloc'ed memory to an temporary memory file.
> This will replace the pipe fd of the original design.
> If multiple fd's are passed, the memory file contains an array of
> socket_info[_container] structures.
> int *socket_fds_idx would be changed to an array of structures
> or we have an additional array to store possible fd for the temporary
> files and have the destructing code lock at it and select between
> free() and munmap()/close().
> This design would not require a named file, like the one specified
> by SOCKET_WRAPPER_FD_PASSING_DB.
Hm.. this is new to me. I will think about it.
> Do you have comments on these additions?
>
> Thanks!
> metze
>
More information about the samba-technical
mailing list