Storage compression patch for Rsync (unfinished)
fielker at informatik.fh-augsburg.de
Wed Jan 15 10:51:01 EST 2003
I am using Rsync for making backups of a MySQL database. The MySQL files can
be compressed about 1:10 and I want to make use of this fact.
Rsync currently doesn't support saving files in a compressed state. I
personally think this should be a feature for the filesystem (in the sense of
"synchronised files") but currently there is no such filesystem for Linux.
Here is my idea:
We will have two new options:
-X : this will specify a compress program (e.g. gzip, bzip...) - the default
compressor is "gzip"
-Z : this will activate storage file compression.
If "-Z" is enabled, every name (files, directories, links, ...) gets an
extension called ".rsc".
If we have a true file, there is a header section and a data section. The
header section will store the following attributes:
- magic number
- unpacked size
- packed size
- compress program (e.g. gzip, bzip2, ...)
- magic number
After the header section we will have the compressed file, produced by the
program the user gave us with "-X".
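
To make the layout above concrete, here is a minimal Python sketch of writing and reading such a file. It is not taken from the patch: the magic value "RSC1", the 64-bit big-endian size fields and the 16-byte program name field are all assumptions, since the proposal only names the fields. Only gzip is implemented.

```python
import gzip
import struct

# Every layout choice here is an assumption; the post only lists the fields.
MAGIC = b"RSC1"                       # hypothetical magic number
HEADER = struct.Struct(">4sQQ16s4s")  # magic, unpacked, packed, program, magic


def pack_rsc(data, program="gzip"):
    """Return header section + data section for one file's contents."""
    if program != "gzip":
        raise ValueError("this sketch only implements gzip")
    packed = gzip.compress(data)
    # "16s" pads the program name with NUL bytes up to 16 bytes.
    header = HEADER.pack(MAGIC, len(data), len(packed),
                         program.encode("ascii"), MAGIC)
    return header + packed


def unpack_rsc(blob):
    """Check the header section and return the original contents."""
    magic1, unpacked_size, packed_size, program, magic2 = HEADER.unpack_from(blob)
    if magic1 != MAGIC or magic2 != MAGIC:
        raise ValueError("not an .rsc file")
    if program.rstrip(b"\0") != b"gzip":
        raise ValueError("unsupported compress program")
    packed = blob[HEADER.size:HEADER.size + packed_size]
    if len(packed) != packed_size:
        raise ValueError("truncated data section")
    data = gzip.decompress(packed)
    if len(data) != unpacked_size:
        raise ValueError("unpacked size mismatch")
    return data
```

Having the magic number at both ends of the header lets a reader detect a header that was cut short before trusting the size fields.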
Every action in rsync will still work, with some exceptions:
1) Every file object has the extension .rsc.
2) When doing simple checks (size, etc.) on files, the file size has to be
read from the .rsc header.
3) The local file needs to be decompressed when it is accessed for reading.
4) The local file needs to be compressed after it was modified or created. A
header section needs to be added.
5) The file stats (atime/ctime/mtime) will be applied to the .rsc file. In
1) On Unix this limits file names to 255 - strlen(".rsc") characters ...
but this should be a very, very rare case; we will disable compression in this
case.
2) Rsync will need a new option for decompressing and stating the .rsc file
tree. (single file, recursive)
We should also offer options for validating .rsc files and converting a tree
to a .rsc filetree.
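
As an illustration of the naming rule and the name-length exception, here is a small Python sketch. The function names are hypothetical and the 255-byte limit is the usual Unix per-component value, assumed here rather than queried from the filesystem.

```python
import os

RSC_EXT = ".rsc"
NAME_MAX = 255  # typical Unix limit per path component (assumed)


def storage_name(name):
    """Return (stored name, compressed?) for a single path component.

    Names too long to take the extension are stored unchanged, which
    corresponds to disabling compression for that entry.
    """
    if len(os.fsencode(name)) > NAME_MAX - len(RSC_EXT):
        return name, False
    return name + RSC_EXT, True


def storage_path(path):
    """Map every component of a relative path to its stored name."""
    return "/".join(storage_name(part)[0] for part in path.split("/"))
```

For example, storage_path("mysql/db.MYD") gives "mysql.rsc/db.MYD.rsc", since directories get the extension as well.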
I am sending some compressor patches. I am very new to the rsync source, so
here is a list of what I did:
- added -X and -Z options (-Z is passed through to the server when using
user at host.foo:/directory)
- extension ".rsc" is added to every file/directory (in -Z mode)
- finish_transfer() now does the compression when in -Z mode before stating the
file. That means the compressed file has the same stat as the uncompressed
file.
I added two new functions:
- storage_decompress: this will decompress an .rsc file to a tmp file, e.g.
for calculating sums (note: a delete function is missing!)
- storage_decompress_update_stats: this will update a given stat structure
with the decompressed filesize of the rsc file.
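
A rough Python sketch of what these two functions do, for readers who do not want to open the patch. The real functions are C code inside rsync and their signatures differ; the header layout (magic "RSC1", 64-bit sizes, 16-byte program name) and the dict used in place of a stat structure are assumptions made only for this example.

```python
import gzip
import os
import struct
import tempfile

MAGIC = b"RSC1"                       # hypothetical, as is the whole layout
HEADER = struct.Struct(">4sQQ16s4s")  # magic, unpacked, packed, program, magic


def read_header(f):
    """Read and check the header; return (unpacked_size, packed_size)."""
    raw = f.read(HEADER.size)
    if len(raw) != HEADER.size:
        raise ValueError("file too short for an .rsc header")
    magic1, unpacked, packed, _program, magic2 = HEADER.unpack(raw)
    if magic1 != MAGIC or magic2 != MAGIC:
        raise ValueError("bad magic number")
    return unpacked, packed


def storage_decompress(rsc_path):
    """Decompress rsc_path into a temp file and return that file's path.

    The caller has to delete the temp file (e.g. with os.remove).
    """
    with open(rsc_path, "rb") as f:
        _unpacked, packed = read_header(f)
        data = gzip.decompress(f.read(packed))
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp")
    with os.fdopen(fd, "wb") as out:
        out.write(data)
    return tmp_path


def storage_decompress_update_stats(rsc_path, stats):
    """Overwrite stats["st_size"] with the unpacked size from the header."""
    with open(rsc_path, "rb") as f:
        stats["st_size"], _packed = read_header(f)
    return stats
```

Only the header is read to update the size, so a size check does not need to decompress the data section.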
Currently transferring new files and compressing works. But the receiver
doesn't make use of the stats that storage_decompress_update_stats provides. I
don't know if I am calling it at the right place. I also don't know if the sum
is always calculated for a file. If this is the case we need to store the md4
sum in the .rsc header.
Email: fielker at informatik.fh-augsburg.de ICQ: #15582696
A cool os: www.linux.org
PGP Finger-print: C2 8F 7B 55 7B 9B 8C 7E 48 35 48 21 8A DF 01 66
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 4991 bytes
Desc: not available
Url : http://lists.samba.org/archive/rsync/attachments/20030115/894a8ee3/rsync-compress-2.5.5.patch.bin