Greyhole VFS module(s)

Guillaume Boudreau samba at
Thu Nov 25 19:29:15 MST 2010

On Thu, Nov 25, 2010 at 17:25, Holger Hetterich <hhetter at> wrote:
> On Thursday, 25.11.2010 at 17:13 -0500, Guillaume Boudreau wrote:
>> On Thu, Nov 25, 2010 at 16:56, Jeremy Allison <jra at> wrote:
>> > On Thu, Nov 25, 2010 at 03:18:32PM -0500, Guillaume Boudreau wrote:
>> >>
>> >> Logging to a flat file allows Samba to work even if the actual
>> >> Greyhole daemon is offline, for whatever reason.
>> >> I didn't want to 'lose' file operations because the daemon was not
>> >> running, or worse, prevent users from modifying files when that
>> >> happened.
>> >> The daemon, when it's running, can then parse those logs at whatever
>> >> speed it can, and act on them accordingly.
>> >
>> > Why go through syslogd then ? Why not have Samba
>> > write into a SQLite database, or maybe a transactional tdb
>> > that the daemon can access simultaneously if it's running ?
>> >
>> > Right now you're using syslogd as an intermediate daemon
>> > already to linearize the logs. Might be possible to build
>> > this into the vfs module.
>> Greyhole, when it reads the syslog, does some processing and then inserts
>> the rows in MySQL (or SQLite; configured in greyhole.conf). So yes,
>> cutting out the middle-man and inserting directly into the database from
>> the VFS module would be very nice.
>> Sadly, I'm not proficient enough in C to code this in the VFS module.
>> That's why my module writes to syslog (the simplest C code possible!),
>> to be inserted in the DB by the Greyhole daemon.
> The sqlite3 C interface is relatively simple, and perfect for the module,
> as it could do anything (create the database... etc) on its own. You
> would use sqlite3 >= 3.7.0 which allows WAL (Write Ahead Logging), then
> you can have multiple readers while a writer (the module) is accessing
> the database.

Greyhole, as it is, reads the syslog and inserts 'tasks' into a MySQL or
SQLite database (per greyhole.conf).
Since the Greyhole daemon isn't multi-threaded, this parsing happens
between the processing of one task and the next.
So it's something like:
- parse syslog > insert new tasks in tasks table
- get first task from tasks table > execute it > delete it from tasks table
- parse syslog > insert new tasks in tasks table
- get first task from tasks table > execute it > delete it from tasks table
and so on.
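
A minimal Python sketch of the loop above, for illustration only; it is
not the daemon's actual code. The table layout and the names parse_log,
run_next_task and daemon_loop are hypothetical, and a list of log-line
batches stands in for reading the syslog.

```python
import sqlite3

def parse_log(db, lines):
    """Insert one task per new log line (stand-in for syslog parsing)."""
    db.executemany("INSERT INTO tasks (action) VALUES (?)",
                   [(line,) for line in lines])
    db.commit()

def run_next_task(db, execute):
    """Run the oldest task, then delete it. Returns False if none is left."""
    row = db.execute(
        "SELECT id, action FROM tasks ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return False
    task_id, action = row
    execute(action)
    db.execute("DELETE FROM tasks WHERE id = ?", (task_id,))
    db.commit()
    return True

def daemon_loop(db, log_batches, execute):
    """Alternate between parsing new log lines and executing one task."""
    pending = list(log_batches)
    while True:
        if pending:
            parse_log(db, pending.pop(0))
        if not run_next_task(db, execute) and not pending:
            break  # nothing left to parse and nothing left to execute

def make_db():
    """Create an in-memory database with an empty (hypothetical) tasks table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tasks ("
               "id INTEGER PRIMARY KEY AUTOINCREMENT, action TEXT)")
    return db
```

Because only one task is executed per pass, new log lines are picked up
between tasks, at the cost of re-checking the log on every iteration.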

There are several possible optimization paths here.

Option 1. Have the VFS module insert directly into the MySQL or SQLite
tasks table.
This would be the most complicated one to code, but would simplify the
daemon code the most (no more log parsing).
This would also move some of the business logic from the Greyhole
daemon into the VFS module... Not sure how 'clean' that is...

Question: Can the VFS module easily access the greyhole.conf file, to
parse it and extract the DB connection settings from it ?
I don't think this parsing should happen each time the VFS module is
called, so maybe there's a way to parse it once, and reuse the result
whenever file operations need to be logged ? I know one can provide
parameters to the VFS module in smb.conf, but I doubt forcing the
user to repeat the DB connection settings there, for every share,
would make sense...
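
To illustrate the "parse once, reuse the result" idea, here is a hedged
Python sketch (the real module would be C). It assumes a simple
"key = value" file with "#" comments and DB settings whose keys start
with "db_"; that format, the key prefix and the name load_db_settings
are assumptions, not taken from greyhole.conf itself.

```python
import functools
from types import MappingProxyType

@functools.lru_cache(maxsize=None)
def load_db_settings(path):
    """Parse the config file on the first call; later calls hit the cache."""
    settings = {}
    with open(path) as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()  # drop comments and blanks
            if "=" not in line:
                continue
            key, value = (part.strip() for part in line.split("=", 1))
            if key.startswith("db_"):  # assumed prefix for DB settings
                settings[key] = value
    # Read-only view, so callers cannot alter the cached result.
    return MappingProxyType(settings)
```

Every call after the first returns the cached mapping without touching
the file, so logging a file operation would not trigger a re-parse. The
trade-off is that changes to the file are only seen after a restart (or
after calling load_db_settings.cache_clear()).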

Option 2. Replace syslogd with a 'raw' sqlite or tdb database, but continue
to have the Greyhole daemon consume those logs as it currently consumes
syslog, i.e. read from the raw db, and insert into its own tasks table.

Question: If writing to sqlite3: I don't think sqlite3 supports
multiple threads writing to the same database simultaneously (does
it?) As such, if a client (or multiple clients) is writing files
simultaneously, one of the log operations would fail, unless the
sqlite3 C interface takes care of that (does it?)
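
For what it's worth, SQLite allows only one writer at a time; a second
writer gets a "database is locked" (SQLITE_BUSY) error unless a busy
timeout is set, in which case it waits for the lock instead. In the C
interface that is sqlite3_busy_timeout(); Python's sqlite3 module exposes
it as the timeout argument of connect(). The sketch below uses Python
threads as a stand-in for concurrent clients, and the raw_log table and
function names are hypothetical.

```python
import os
import sqlite3
import tempfile
import threading

def log_operations(db_path, client, count):
    """Append `count` log rows, waiting on the write lock when it is busy."""
    conn = sqlite3.connect(db_path, timeout=10)  # wait up to 10 s for a lock
    try:
        for i in range(count):
            with conn:  # commits each logged operation as one transaction
                conn.execute(
                    "INSERT INTO raw_log (client, op) VALUES (?, ?)",
                    (client, f"write file{i}"))
    finally:
        conn.close()

def demo(clients=4, ops_per_client=50):
    """Run several concurrent writers and return the number of rows logged."""
    db_path = os.path.join(tempfile.mkdtemp(), "raw_log.db")
    setup = sqlite3.connect(db_path)
    setup.execute("PRAGMA journal_mode=WAL")  # readers don't block the writer
    setup.execute(
        "CREATE TABLE raw_log (id INTEGER PRIMARY KEY, client TEXT, op TEXT)")
    setup.commit()
    setup.close()
    threads = [
        threading.Thread(target=log_operations,
                         args=(db_path, f"client{n}", ops_per_client))
        for n in range(clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    check = sqlite3.connect(db_path)
    total = check.execute("SELECT COUNT(*) FROM raw_log").fetchone()[0]
    check.close()
    return total
```

With the timeout in place the writers are serialized rather than failing,
so demo() should report every row as long as no writer waits longer than
the timeout. WAL mode does not allow simultaneous writers either; it only
lets readers (the daemon) proceed while a writer holds the lock.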

Maybe all this would be better discussed on IRC ?
Are you guys on there often during weekdays ? I can be there between
9:30am and 3pm (GMT-5) every weekday for the next couple of weeks.

More information about the samba-technical mailing list