A preliminary design for an external DB for rsync

Sun Sep 23 18:44:21 GMT 2007

I've put some thought into adding DB support to rsync (in a future
release).  This would allow it to maintain some extra information about
files and to look up that information rapidly.  It would support
things like caching checksum information, finding files to hard-link
with, and saving file attributes separately from the files (allowing
non-root preservation of full file attributes as well as multiple
sets of attributes per inode).

I imagine adding a single option that specifies a DB config file.  This
option would be available in a daemon's config too (with no ability for
the remote user to affect a daemon's setting) and would probably also
have an environment variable equivalent (to allow all rsync commands to
be affected).  The config file would contain info on which DB accessor
to use, connection info, and what sort of information you wish to store
in the DB.

I was thinking about the following table structure:

-----

TABLE: disk
 disk_id int32 auto_increment
 devno int64
 comment varchar(64)
PRIMARY KEY: disk_id (unique)
KEY: devno (non-unique)

This table is auto-populated with devno information, as needed.  The
extra indirection allows someone to unmount a disk, set the unmounted
disk's devno to 0 in the table, mount a new disk, and then either update
the devno of a disk that was mounted before, or let the new disk get an
auto-generated disk_id (even if it ends up with the same devno that the
just-unmounted disk had).  We might also want an option that disallows
auto-generated disk_ids, to avoid a mount race condition (the DB
routines would instead sleep and look up the devno again).
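
As a rough sketch of the auto-population idea, here is how the disk
table might behave, using Python's sqlite3 purely as a stand-in for
whatever DB accessor ends up being used.  The function names
(create_disk_table, disk_id_for_devno, retire_disk) and the allow_auto
flag are made up for illustration; raising an error when auto-generation
is disallowed is just one way to let the caller sleep and retry.

```python
import sqlite3

def create_disk_table(conn):
    # AUTOINCREMENT guarantees a retired disk_id is never handed out again.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS disk (
               disk_id INTEGER PRIMARY KEY AUTOINCREMENT,
               devno   INTEGER NOT NULL,
               comment TEXT
           )"""
    )
    # Non-unique key: several retired disks may share devno 0.
    conn.execute("CREATE INDEX IF NOT EXISTS disk_devno ON disk (devno)")

def disk_id_for_devno(conn, devno, allow_auto=True):
    """Return the disk_id for a mounted devno, adding a row on first sight."""
    row = conn.execute(
        "SELECT disk_id FROM disk WHERE devno = ? ORDER BY disk_id DESC LIMIT 1",
        (devno,),
    ).fetchone()
    if row is not None:
        return row[0]
    if not allow_auto:
        # Hypothetical behaviour: the caller may sleep and look up again.
        raise LookupError("unknown devno %d" % devno)
    cur = conn.execute("INSERT INTO disk (devno, comment) VALUES (?, '')", (devno,))
    return cur.lastrowid

def retire_disk(conn, disk_id):
    """Mark a disk as unmounted by setting its devno to 0."""
    conn.execute("UPDATE disk SET devno = 0 WHERE disk_id = ?", (disk_id,))
```

With this shape, a new disk that reuses the devno of a retired one gets
a fresh disk_id, so cached inode data for the old disk is never confused
with the new one.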

-----

TABLE: inode_map
 disk_id int32
 ino int64
 size int64
 mtime int64
 ctime int64
 md4 byte(16) NULL-OK
 md5 byte(16) NULL-OK
PRIMARY KEY: disk_id + ino (unique)
KEY: size + md4 (non-unique)
KEY: size + md5 (non-unique)
KEY: size + mtime (non-unique)

This table facilitates the caching of extra info by inode.  It can also
be used to look up an inode matching certain requirements, which allows
a link-by-hash algorithm as well as the finding of alternate basis files.
The checksum keys are not unique because there may be identical files
that aren't hard-linked together (depending on options and hard-link
limitations).
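
To make the link-by-hash lookup concrete, here is a minimal sketch of
the inode_map table and a query over the size + md5 key, again with
sqlite3 as a stand-in DB.  The helper names are hypothetical, and
restricting candidates to the same disk_id is an assumption of the
sketch (a hard link cannot span filesystems).

```python
import sqlite3

def create_inode_map(conn):
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS inode_map (
            disk_id INTEGER NOT NULL,
            ino     INTEGER NOT NULL,
            size    INTEGER NOT NULL,
            mtime   INTEGER NOT NULL,
            ctime   INTEGER NOT NULL,
            md4     BLOB,              -- NULL-OK
            md5     BLOB,              -- NULL-OK
            PRIMARY KEY (disk_id, ino)
        );
        CREATE INDEX IF NOT EXISTS inode_size_md4   ON inode_map (size, md4);
        CREATE INDEX IF NOT EXISTS inode_size_md5   ON inode_map (size, md5);
        CREATE INDEX IF NOT EXISTS inode_size_mtime ON inode_map (size, mtime);
    """)

def link_candidates(conn, disk_id, size, md5):
    """Return inode numbers on one disk whose size and MD5 both match."""
    rows = conn.execute(
        "SELECT ino FROM inode_map"
        " WHERE disk_id = ? AND size = ? AND md5 = ? ORDER BY ino",
        (disk_id, size, md5),
    )
    return [r[0] for r in rows]
```

Several inodes can come back from one lookup, which is exactly why the
checksum keys have to be non-unique.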

-----

TABLE: name_map
 name_md5 byte(16) (DB-specific)
 name text
 disk_id int32
 ino int64
 mtime int64
 ctime int64
 mode int16
 uid int32
 gid int32
 acls_id int64 NULL-OK (omit?)
 xattr_id int64 NULL-OK (omit?)
PRIMARY KEY: name_md5 (or name) (unique)
KEY: disk_id + ino (non-unique)

This table allows the caching of file information based on name,
allowing an inode to have multiple instances with differing file
attributes (which is why some of the data duplicates info in the
inode_map table).  The use of a name_md5 field will be DB-specific,
depending on whether the database can handle a primary key on a really
long name efficiently.  If it cannot, the DB accessor routine will create
an MD5 checksum of the name and use that as the primary key.  A
database implementation may even choose to store the name in a
separate table with a unique id if that is more efficient for it.
If ACL and extended attribute information is included, it will be
stored as an ID reference to separate tables.
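
Here is a small sketch of the name_md5 variant, where the accessor
hashes the name itself and uses the 16-byte digest as the primary key.
It uses sqlite3 as a stand-in, the helper names are invented for the
example, and it uses a plain (non-unique) index on disk_id + ino so that
two hard-linked names with different attributes can both be stored.

```python
import hashlib
import sqlite3

def name_key(name: bytes) -> bytes:
    """16-byte MD5 of the full name, used as a fixed-width primary key."""
    return hashlib.md5(name).digest()

def create_name_map(conn):
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS name_map (
            name_md5 BLOB PRIMARY KEY,
            name     BLOB NOT NULL,
            disk_id  INTEGER NOT NULL,
            ino      INTEGER NOT NULL,
            mtime    INTEGER NOT NULL,
            ctime    INTEGER NOT NULL,
            mode     INTEGER NOT NULL,
            uid      INTEGER NOT NULL,
            gid      INTEGER NOT NULL,
            acls_id  INTEGER,          -- NULL-OK
            xattr_id INTEGER           -- NULL-OK
        );
        CREATE INDEX IF NOT EXISTS name_inode ON name_map (disk_id, ino);
    """)

def store_name(conn, name, disk_id, ino, mtime, ctime, mode, uid, gid):
    """Insert or overwrite the attributes cached for one name."""
    conn.execute(
        "INSERT OR REPLACE INTO name_map VALUES (?,?,?,?,?,?,?,?,?,NULL,NULL)",
        (name_key(name), name, disk_id, ino, mtime, ctime, mode, uid, gid),
    )

def names_for_inode(conn, disk_id, ino):
    """Return (name, mode) for every name cached against one inode."""
    rows = conn.execute(
        "SELECT name, mode FROM name_map"
        " WHERE disk_id = ? AND ino = ? ORDER BY name",
        (disk_id, ino),
    )
    return rows.fetchall()
```

Names are kept as raw bytes here because filenames are not guaranteed to
be valid text in any particular encoding.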

-----

Imagined calls that rsync would use:

db_open(CONFIG_FILENAME_PTR, CHROOT_PATH_PTR, FLAGS);
# CHROOT_PATH_PTR: can be NULL.
# FLAGS: active-checksum-type, incl-acl-info, incl-xattr-info, etc.

The chroot path is used to map incoming filenames into a global DB
context and to strip returned filenames back down so they work inside
the chroot (it also ensures that no filenames outside the chroot are
returned).
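
The two directions of that mapping could look something like the
following sketch.  The function names are hypothetical, names passed in
are assumed to be relative to the chroot's root, and symlinks are not
considered at all.

```python
import posixpath

def to_db_name(chroot, name):
    """Map a name seen inside the chroot to the global name kept in the DB."""
    if chroot is None:
        return name
    # Anchor at "/" first so that leading ".." components cannot climb
    # out of the chroot, matching how ".." behaves at a real root.
    inner = posixpath.normpath("/" + name.lstrip("/"))
    root = posixpath.normpath(chroot).rstrip("/")
    if inner == "/":
        return root or "/"
    return root + inner

def from_db_name(chroot, db_name):
    """Map a global DB name back into the chroot, or None if it is outside."""
    if chroot is None:
        return db_name
    root = posixpath.normpath(chroot)
    if db_name == root:
        return "/"
    prefix = root.rstrip("/") + "/"
    if not db_name.startswith(prefix):
        return None          # never hand back a name outside the chroot
    return "/" + db_name[len(prefix):]
```

Comparing against the prefix with a trailing slash matters: without it,
a chroot of /srv/jail would wrongly accept names under /srv/jail2.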

db_stat(FILENAME_PTR, STATX_STRUCT_PTR, CHKSUM_PTR, FLAGS);
# CHKSUM_PTR: can be NULL.  Will be returned if enabled in db_open().
# FLAGS: lstat/stat, use-checksum-for-stat

The stat info is used during the lookup, and then updated.  db_stat()
would try to handle renamed files by using both filename and inode info,
checking the result for accuracy, and updating the DB if a rename had
occurred.  (It would not be able to handle a renamed file that had also
been modified.)
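
To illustrate the rename handling, here is a toy sketch in which a dict
stands in for the DB and a linear scan stands in for the disk_id + ino
key.  The class name, the returned status strings, and the choice of
(dev, ino, size, mtime) as the identity check are all assumptions of the
sketch, not part of the design.

```python
import os

class StatCache:
    """Toy stand-in for the DB: name -> (dev, ino, size, mtime_ns)."""

    def __init__(self):
        self.by_name = {}

    def db_stat(self, name):
        st = os.lstat(name)
        key = (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns)
        cached = self.by_name.get(name)
        if cached == key:
            return "hit"
        if cached is None:
            # Unknown name: see if the same unmodified inode is cached
            # under an old name that no longer exists on disk.
            for old, old_key in list(self.by_name.items()):
                if old_key == key and not os.path.lexists(old):
                    del self.by_name[old]
                    self.by_name[name] = key
                    return "renamed"
            self.by_name[name] = key
            return "added"
        # Known name whose stat info changed: refresh the cached entry.
        self.by_name[name] = key
        return "updated"
```

Because the size and mtime are part of the comparison, a file that was
both renamed and modified shows up as a brand-new name, which is the
limitation noted above.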

FILENAME_PTR = db_find(PATH_PTR, CHKSUM_PTR, FLAGS, STATX_STRUCT_PTR);
# PATH_PTR: can be NULL, or can specify a desired path prefix.
# CHKSUM_PTR: can be NULL.  Type matches db_open() flags.
# FLAGS: find-any-match, find-a-match-for-hard-linking, require-prefix.

The stat info is used to find a good match, and then updated.  E.g.
this could be used by an inc_recurse transfer to find an existing
hard-link somewhere in the destination hierarchy.  It could also be used
to try to find a decent basis file or a renamed file.  We may want some
kind of a fuzzy matching option.
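
One possible reading of the PATH_PTR and require-prefix behavior, as a
small sketch: given the names the DB found for a checksum, prefer one
under the requested prefix, and only fall back to a name elsewhere when
the prefix is not required.  The function name and arguments are
hypothetical.

```python
def pick_match(candidates, prefix=None, require_prefix=False):
    """Choose one name from DB matches, preferring those under prefix."""
    if prefix is not None:
        # Compare with a trailing slash so "/dst/b" does not match "/dst/bb".
        want = prefix.rstrip("/") + "/"
        under = [c for c in candidates if c.startswith(want)]
        if under:
            return under[0]
        if require_prefix:
            return None
    return candidates[0] if candidates else None
```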

db_update(FILENAME_PTR, CHKSUM_PTR, FLAGS, STATX_STRUCT_PTR);
# CHKSUM_PTR: can be NULL if doing MD4 checksum w/o --checksum.

db_delete(FILENAME_PTR);

Removes a name from the DB.  I assume that inode information would be
pruned when no names remain that reference the inode.  Deletions would
also happen internally when the code discovered that a file it was
looking up no longer exists.
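
The pruning rule could be sketched like this, with sqlite3 as a stand-in
and with both tables trimmed down to the columns the example needs (the
create_tables helper and the trimmed schema are only for illustration):

```python
import sqlite3

def create_tables(conn):
    conn.executescript("""
        CREATE TABLE inode_map (disk_id INTEGER, ino INTEGER, size INTEGER,
                                PRIMARY KEY (disk_id, ino));
        CREATE TABLE name_map (name TEXT PRIMARY KEY,
                               disk_id INTEGER, ino INTEGER);
        CREATE INDEX name_inode ON name_map (disk_id, ino);
    """)

def db_delete(conn, name):
    """Remove a name; drop its inode row once no other name refers to it."""
    row = conn.execute(
        "SELECT disk_id, ino FROM name_map WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return False
    conn.execute("DELETE FROM name_map WHERE name = ?", (name,))
    left = conn.execute(
        "SELECT COUNT(*) FROM name_map WHERE disk_id = ? AND ino = ?", row
    ).fetchone()[0]
    if left == 0:
        conn.execute(
            "DELETE FROM inode_map WHERE disk_id = ? AND ino = ?", row
        )
    return True
```

A real accessor would presumably want the delete and the prune to happen
in one transaction, so that a crash between them cannot leave an
orphaned inode row behind.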

db_close();

-----

The routines would need to be resilient enough to handle cases where
the DB information is out of date with respect to the filesystem,
checking as needed, and updating appropriately.

Thoughts?

..wayne..