[PATCHES] freelist defragmentation and (dependent) ctdb repacking
amitay at gmail.com
Fri Jun 13 01:10:21 MDT 2014
On Thu, Jun 12, 2014 at 1:43 AM, Michael Adam <obnox at samba.org> wrote:
> Hi List,
> in the past months, I have been thinking about the tdb freelist
> and vacuuming performance from time to time. I have now managed
> to polish my work on that a bit, so that I can present it.
> One problem is the potential fragmentation of tdb's free list which
> stems from the fact that we can't merge a record that is to be
> inserted into the freelist with its right neighbor (if that is a
> freelist record) but only with its left neighbor. This is due
> to the singly-linked nature of our freelist. We can only merge
> with the left neighbor.
> Now, when we traverse the freelist, we can look left of each
> visited record and possibly merge with that if it is also a
> free record, hence removing the fragmentation.
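To make the left-merge idea above concrete, here is a minimal toy sketch
in C. It is not the tdb code: the struct, the array standing in for the
file, and the names left_neighbour(), merge_adjacent() and
freelist_demo() are all made up for illustration. The left neighbour is
found by a linear scan here, which only stands in for whatever lookup
the real record layout allows.

```c
#include <assert.h>
#include <stdio.h>

/* Toy record: NOT the tdb on-disk format, just enough to show the idea. */
struct toy_rec {
    unsigned off;  /* start offset in the toy "file" */
    unsigned len;  /* total length; set to 0 once merged away */
    int is_free;   /* nonzero while the record is on the free list */
    int next;      /* index of the next free-list entry, -1 ends the list */
};

/* Return the index of the record that ends exactly where recs[i] starts,
 * or -1 if there is none. A linear scan is an assumption of this toy. */
static int left_neighbour(const struct toy_rec *recs, int n, int i)
{
    for (int j = 0; j < n; j++) {
        if (j != i && recs[j].len != 0 &&
            recs[j].off + recs[j].len == recs[i].off) {
            return j;
        }
    }
    return -1;
}

/* Walk the singly-linked free list once. Whenever the visited record has
 * a free left neighbour, grow that neighbour and unlink the visited
 * record. Returns the number of entries left on the list. */
static int merge_adjacent(struct toy_rec *recs, int n, int *head)
{
    int count = 0;
    int prev = -1; /* last entry we kept; needed to unlink cur */
    int cur = *head;

    while (cur != -1) {
        int next = recs[cur].next;
        int left = left_neighbour(recs, n, cur);

        assert(recs[cur].is_free);
        if (left != -1 && recs[left].is_free) {
            /* The left record absorbs cur and stays where it is on the list. */
            recs[left].len += recs[cur].len;
            if (prev == -1) {
                *head = next;
            } else {
                recs[prev].next = next;
            }
            recs[cur].len = 0;
            recs[cur].is_free = 0;
            recs[cur].next = -1;
        } else {
            prev = cur;
            count++;
        }
        cur = next;
    }
    return count;
}

/* Physical layout (U = used, F = free): U F F U F F, with the free list
 * linked in scattered order. Defragment it and print what remains. */
static int freelist_demo(void)
{
    struct toy_rec recs[] = {
        {   0, 10, 0, -1 },
        {  10, 20, 1,  4 },
        {  30, 30, 1,  5 },
        {  60, 10, 0, -1 },
        {  70, 40, 1, -1 },
        { 110, 50, 1,  1 },
    };
    int head = 2; /* list order: 2 -> 5 -> 1 -> 4 */
    int remaining = merge_adjacent(recs, 6, &head);

    for (int i = head; i != -1; i = recs[i].next) {
        printf("free: off=%u len=%u\n", recs[i].off, recs[i].len);
    }
    return remaining;
}
```

Calling freelist_demo() from a main() leaves two free records in this
toy layout (offset 10 with length 50, and offset 70 with length 90).
Growing the left record needs no list surgery on it, which is why only
the visited record has to be unlinked.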
> I have implemented this in the first patchset:
> - I have created a function "tdb_freelist_merge_adjacent()"
> that does exactly this. So after this function has run, there
> should be no adjacent freelist records any more.
> - I have also created a tdbtool subcommand "merge_freelist" to
> call that function.
> - Because we might not want to bump the version to 1.3.1 for
> this (not sure..), or as a proposal, I have also changed
> tdb_freelist_size() to call tdb_freelist_merge_adjacent()
> so that it automatically defragments.
> - Secondly, there is a place where we traverse the freelist
> anyways, namely in tdb_allocate_from_freelist(). I have
> changed our loop there to merge with left freelist records,
> thereby automatically reducing the freelist fragmentation
> as the database is used. This will usually not traverse until
> the end though since the best fit algorithm works with decreasing
> - I probably owe a test to measure the effect?!
> The second patchset is for ctdb and builds upon the first one.
> - It changes the vacuum code to always call tdb_freelist_size()
> again before checking whether a repack run is needed, so that
> it automatically defragments the freelist.
> This might reduce the frequency of (blocking) repacks.
> - As a variant, on top there is a patch to explicitly call
> tdb_freelist_merge_adjacent() instead of tdb_freelist_size().
> If we choose to have tdb_freelist_size() defragment, we don't
> need this change, and that option is also more backward
> compatible.
> - Additionally, I made the repack code use the vanilla
> tdb_repack() function and removed the ctdb_repack_tdb one.
> - Finally, some code cleanup is included.
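The reasoning behind the vacuum change can be illustrated with a small
toy model in C. Everything here is an assumption for illustration: the
phys_rec struct, the helper names, and in particular repack_needed()
with its "size >= limit" rule, which only stands in for whatever check
the vacuum code actually performs. The sketch shows that counting free
records after merging adjacent ones can bring the reported size below a
threshold that the unmerged count would exceed.

```c
#include <stddef.h>

/* Toy record listed in physical (offset) order: NOT the tdb format. */
struct phys_rec {
    unsigned len;
    int is_free;
};

/* Free-list size without merging: every free record counts. */
static size_t freelist_size_raw(const struct phys_rec *recs, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (recs[i].is_free) {
            count++;
        }
    }
    return count;
}

/* Free-list size after merging adjacent free records: each maximal run
 * of neighbouring free records counts once. */
static size_t freelist_size_merged(const struct phys_rec *recs, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (recs[i].is_free && (i == 0 || !recs[i - 1].is_free)) {
            count++;
        }
    }
    return count;
}

/* Hypothetical repack rule: repack once the list reaches the limit. */
static int repack_needed(size_t freelist_size, size_t repack_limit)
{
    return freelist_size >= repack_limit;
}

/* Layout (U = used, F = free): F F U F F F U F */
static const struct phys_rec demo_layout[] = {
    { 20, 1 }, { 30, 1 }, { 10, 0 }, { 40, 1 },
    { 50, 1 }, { 60, 1 }, { 10, 0 }, { 20, 1 },
};
#define DEMO_NRECS (sizeof(demo_layout) / sizeof(demo_layout[0]))
```

With this layout the unmerged count is 6 and the merged count is 3, so
with a made-up limit of 5 the first would trigger a repack and the
second would not.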
> Review / comment / push appreciated.
> The full code can also be seen in my "master-tdb-freelist" branch
> in git://git.samba.org/obnox/samba/samba-obnox.git,
> (please ignore the few top WIP-commits where I experiment with
> fixes for some build problems)
> Cheers - Michael
The changes look good, but I haven't finished reviewing them yet.
One issue is that the ctdb changes are not compatible with older tdb
versions. Are we deciding to drop compatibility with older versions
of tdb?