[Samba] Samba 3 member, winbind caching and DC availability

Pekka L.J. Jalkanen pekka.jalkanen at vihreat.fi
Fri May 10 12:14:15 MDT 2013

Hello all,

I've a box running Samba 3.5.6 (Debian Squeeze) that retrieves its user
accounts from AD, using Winbind. The box is receiving incoming mail.
Idmap backend is AD, with rfc2307 schema mode.

Currently it's only accessing one AD DC, and the MTA on the Samba box is
stopped whenever the DC is temporarily offline to prevent rejection of
any incoming mail with "user unknown" status.

However, I'd like to add another DC to the mix, but I'm concerned that
mail could get rejected if the active DC suddenly goes offline and
winbind doesn't switch to another DC promptly enough.

Consider the following scenario:

1. There is an AD account foo. The account hasn't been used for some
time, and it's thus not in winbind's cache. It's possibly not even in
Winbind's idmap cache.
2. There are two AD DCs, A and B.
3. Samba member server C runs Winbind and is currently using the DC A.
4. Hardware fails and the DC A suddenly drops offline.
5. Just a few seconds later an e-mail arrives for foo. The MTA tries
to check for the user.
6. As Winbind is not yet aware of the unavailability of the DC A, it
tries to contact it.

A. Now, in the ideal world this would continue as follows:

7. Winbind can't contact the DC A anymore, so it promptly contacts the DC B.
8. The DC B confirms the existence of foo.
9. The MTA delivers mail for foo.

B. However, I'm afraid that in the real world, the following could result:

7. Winbind frantically tries to contact the DC A, but times out and can't
confirm the existence of foo. It tells the MTA that there's no account.
8. The MTA replies to the sender with a "550 5.1.1 <foo at my.site>... User
unknown" error.
9. After the timeout Winbind finally manages to switch to the DC B, but
the sender has already got the delivery failure message and now thinks
that the address foo at my.site is no longer valid.
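
To make the difference between outcomes A and B concrete, here is a toy
model in Python. It is only a sketch of the scenario, not of Winbind's
actual failover logic: the DC class, the connect_timeout and budget
parameters and the reply strings are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DC:
    """Hypothetical stand-in for a domain controller."""
    name: str
    online: bool
    accounts: frozenset


def lookup(user, dcs, connect_timeout, budget):
    """Try each DC in order; every offline DC costs connect_timeout seconds.

    budget is how long the caller (the MTA) is assumed to wait.
    Returns "found", "unknown" or "unavailable".
    """
    spent = 0.0
    for dc in dcs:
        if not dc.online:
            spent += connect_timeout
            if spent >= budget:
                return "unavailable"  # gave up before reaching a live DC
            continue
        return "found" if user in dc.accounts else "unknown"
    return "unavailable"  # no DC answered at all


def smtp_reply(result):
    # As in scenario B, a failed lookup is assumed to look exactly
    # like a missing account to the MTA.
    if result == "found":
        return "250 OK"
    return "550 5.1.1 User unknown"


dc_a = DC("A", online=False, accounts=frozenset({"foo"}))
dc_b = DC("B", online=True, accounts=frozenset({"foo"}))

# Outcome A: failover is quick compared with the MTA's patience.
print(smtp_reply(lookup("foo", [dc_a, dc_b], connect_timeout=1, budget=5)))
# Outcome B: the dead DC eats the whole budget before DC B is tried.
print(smtp_reply(lookup("foo", [dc_a, dc_b], connect_timeout=30, budget=5)))
```

In this model the only thing separating A from B is whether the time
spent on the dead DC stays below what the MTA is willing to wait.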

I tried to look at the documentation, but didn't find any
recommendations regarding winbind cache settings in situations where
availability is critical. Is it recommended to just disable all Winbind
caching entirely? Or to do just the opposite and cache as much as
possible? What are the practical effects of the winbind cache time and
idmap cache time smb.conf options in this situation? Also, are the
caches for all accounts "replenished" every time the cache of any
account expires, or on a per-account basis?
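
To spell out what "per-account" would mean here, the following Python
sketch models a cache in which every entry has its own expiry. It is a
toy illustrating one of the two possibilities asked about, not a
description of how Winbind's cache actually works; the class and its
methods are hypothetical.

```python
class PerEntryCache:
    """Toy cache where each entry expires ttl seconds after it was stored."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._entries = {}  # name -> (value, time the entry was stored)

    def put(self, name, value, now):
        self._entries[name] = (value, now)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None:
            return None  # never cached
        value, stored_at = entry
        if now - stored_at >= self.ttl:
            return None  # expired: a DC would have to be asked again
        return value


cache = PerEntryCache(ttl=300)
cache.put("foo", 1001, now=0)
cache.put("bar", 1002, now=200)
# At t=310 only foo has expired; bar is still served from the cache.
print(cache.get("foo", now=310), cache.get("bar", now=310))
```

With per-account expiry, an account that was looked up recently would
stay resolvable for a while after the DC disappears, while an account
that was never looked up would not be covered at all.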

And do the idmap cache times even work in a predictable way with this
old Samba, where bug 8658 is still unfixed? Or should I just try to upgrade
as soon as possible?

I built a test box similar to the actual box receiving mail (Winbind
cache time was the default (300 seconds) and idmap cache time was set to
86,400 seconds (one day)) and flooded it with messages while at the same
time switching connections to the DCs back and forth. And sure enough, I
did get some delivery errors due to Winbind unavailability, if the
account receiving the mail hadn't been queried after the last winbind
restart and before the DC went offline. So the likelihood of the
scenario 'B' feels all too great.
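
The same kind of check can be repeated without sending mail. The Python
sketch below assumes that accounts are resolved through NSS (that is,
via getpwnam), which is an assumption about the setup and not something
stated above; the helper name unresolved is made up.

```python
import pwd


def unresolved(names, lookup=pwd.getpwnam):
    """Return the names for which the lookup raises KeyError.

    lookup defaults to the system's getpwnam; it can be replaced
    with any callable for testing.
    """
    missing = []
    for name in names:
        try:
            lookup(name)
        except KeyError:
            missing.append(name)
    return missing


# Example with a fake lookup table instead of the real NSS:
known = {"foo": 1001}
print(unresolved(["foo", "bar"], lookup=known.__getitem__))
```

Running unresolved() over the list of mail accounts while a DC is being
taken offline would show which accounts stop resolving, and for how
long.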

Any recommendations for avoiding it?

Pekka L.J. Jalkanen
