Proposal: libsmbclient API

Tue Mar 31 17:05:05 GMT 2009

On Sat, Mar 28, 2009 at 7:03 PM, Derrell Lipman <
derrell.lipman at unwireduniverse.com> wrote:

>
> I've been researching this and I think the solution we've been discussing
> is inadequate. The problem is that in many cases, errno gets set down in the
> bowels of samba core code, particularly by the C library as called by the
> functions in lib/system.c and lib/select.c. Just having
> libsmbclient-specific code set the errno value in a context variable isn't
> going to solve the problem. Any time errno is passed up from the "primitive"
> (sys_) functions, there's the possibility that a thread switch can overwrite
> its value. What's needed is a more general solution that protects the
> critical sections of code in the primitives where errno can be set/modified,
> and saves the errno value in thread-specific memory before releasing the
> lock on the critical section. This is similar to what the OpenSSL locking
> code does. Whatever implementation we use can be added to both the primitive
> functions and to the libsmbclient code (i.e. wherever errno is possibly set
> and expected to live unaltered). Particularly in the primitive functions,
> the implementation should operate as it does currently if no critical
> section handlers have been provided (as will likely be the case in smbd
> which probably doesn't require them).
>

The problem is even worse than what I described. As an example, assume there
are two threads. We could protect errno in these threads by having each
primitive function (i.e. system call that can set errno) protected by
critical section code (semaphore, mutex, etc.). So thread A issues a
MUTEX_BEGIN() and issues a write() call.  Thread B, at about the same time,
wants to issue its own write() call, so it issues a MUTEX_BEGIN()
which blocks until thread A issues its MUTEX_END() call upon completion of
the write(). That's all good and fine, since write() doesn't typically block
for long.
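
As a rough illustration of the scheme described above, here is a minimal sketch of a lock-protected primitive that copies errno into thread-specific memory before releasing the lock. MUTEX_BEGIN()/MUTEX_END() are named in this message but not defined, so the definitions below (one global pthread mutex) are an assumption, as are the names locked_write() and locked_errno(); none of this is existing samba code.

```c
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* Assumed definitions: a single global mutex guarding all primitives. */
static pthread_mutex_t prim_lock = PTHREAD_MUTEX_INITIALIZER;
#define MUTEX_BEGIN() pthread_mutex_lock(&prim_lock)
#define MUTEX_END()   pthread_mutex_unlock(&prim_lock)

/* Per-thread copy of errno (C11 thread-local storage). */
static _Thread_local int saved_errno;

/* Hypothetical wrapper around write(); not samba's sys_write(). */
ssize_t locked_write(int fd, const void *buf, size_t len)
{
    ssize_t ret;

    MUTEX_BEGIN();
    ret = write(fd, buf, len);
    /* Copy errno while still inside the critical section. */
    saved_errno = (ret < 0) ? errno : 0;
    MUTEX_END();

    return ret;
}

/* Callers read the saved value instead of the global errno. */
int locked_errno(void)
{
    return saved_errno;
}
```

A caller would check the return value of locked_write() and, on failure, consult locked_errno() rather than errno. Note that the one global lock serializes every wrapped call, which is exactly what becomes a problem for calls that can block.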

Now enter something like select(). If thread A were to obtain a lock and
enter select(), it could block for a long time. Thread B could in theory do
a select() that would return immediately, but is blocked from issuing its
select() call by thread A's long wait.

To solve this, we have a few choices:

1. Assume a proper POSIX Threads implementation that guarantees
thread-specific errno. We then just need to ensure that the application (and
of course libsmbclient) doesn't call any of the prohibited, i.e.
known-non-thread-safe, functions.

2. Rewrite the low-level code and its callers to specifically allow for a
common call to system calls such as select() based on the needs and requests
of multiple threads. e.g. Thread A wants to issue a select() on file
descriptors 4 and 6; Thread B wants to issue a select() on file descriptors
5 and 7; so the low-level code combines these into a select() for file
descriptors 4, 5, 6, and 7, and dispatches to the proper thread based on the
return value. With this implementation, the errno value could (with
difficulty) be propagated to the relevant threads.

3. Somehow base the locking code on file descriptor, so Thread A could block
on a mutex for fd=4 and fd=6; Thread B could block on a mutex for fd=5 and
fd=7; and they could each issue their own select() calls. The original
problem of errno remains, though.
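
For choice 1, a quick way to see what a POSIX Threads implementation with thread-specific errno gives us is a check along these lines. This is only a sketch: provoke_error() and errno_is_per_thread() are made-up names, ERANGE is used purely as a sentinel, and the check assumes that pthread_join() itself leaves the caller's errno alone, which holds on common implementations but is not something I'd claim for every platform.

```c
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* Worker: fail a read() on an invalid descriptor and record its errno. */
static void *provoke_error(void *out)
{
    char c;

    if (read(-1, &c, 1) < 0)
        *(int *)out = errno;    /* EBADF, as seen by this thread */
    return NULL;
}

/*
 * Returns 1 if the worker saw EBADF while the caller's errno kept its
 * sentinel value, 0 if not, and -1 if the thread could not be created.
 */
int errno_is_per_thread(void)
{
    pthread_t t;
    int worker_errno = 0;
    int mine;

    if (pthread_create(&t, NULL, provoke_error, &worker_errno) != 0)
        return -1;

    errno = ERANGE;             /* sentinel in the calling thread */
    pthread_join(t, NULL);
    mine = errno;

    return worker_errno == EBADF && mine == ERANGE;
}
```

If that returns 1 on the platforms we care about, no locking around the primitives is needed just to protect errno, and the remaining work is avoiding the known-non-thread-safe functions.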

Threads are evil. There is almost no time that dealing with the concurrency
issues is worth the pain in terms of efficiency. I'm willing to work towards
the goal of a thread-safe libsmbclient, but this is going to be a royal
PITA. I'd prefer to just remind people that threads are evil and they'll
spend way more time debugging their evil programs if they insist on going
with threads. Sigh.

Suggestions?

Derrell