DO NOT REPLY [Bug 5482] apply the rsync comparison algorithm
specially to .mov and .mp4 files
samba-bugs at samba.org
Sat May 24 22:46:19 GMT 2008
https://bugzilla.samba.org/show_bug.cgi?id=5482
jamie at shareable.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamie at shareable.org
------- Comment #2 from jamie at shareable.org 2008-05-24 17:46 CST -------
Matt,
Although it's true that the delta-transfer algorithm detects the small changed
regions, and block boundaries are not an issue, even the delta-transfer
algorithm is slow for _very_ large files with small changes.
Think about a 4GB media file with 10k of changes near the start. In this case,
the delta-transfer algorithm transmits a lot of block checksum data just to do
the comparisons - enough to be substantially affected by network bandwidth.
Alternatively, if the blocks are made large to reduce the number of checksums,
transmitting even a single block of data is significant.
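
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch.
The 20 bytes per block (a 4-byte rolling checksum plus a 16-byte strong
checksum) and the block sizes tried are assumptions for illustration only; the
sizes a real rsync run picks depend on its version and options.

```python
import math

def checksum_traffic(file_size, block_size, bytes_per_block_sum=20):
    """Bytes of per-block checksum data sent for one file.

    bytes_per_block_sum is an assumed figure, not taken from rsync itself.
    """
    blocks = math.ceil(file_size / block_size)
    return blocks * bytes_per_block_sum

FILE_SIZE = 4 * 1024**3  # the 4GB media file from the example above

for block_size in (8 * 1024, 64 * 1024, 1024 * 1024):
    sums = checksum_traffic(FILE_SIZE, block_size)
    # With one small change, roughly one whole block is resent as literal data.
    total = sums + block_size
    print(f"block={block_size:>8}  checksums={sums:>10}  total={total:>10}")
```

Small blocks push the cost into checksum data, large blocks push it into the
one retransmitted block, and either way the total is far more than the 10k
that actually changed.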
I doubt that the scenario described in this bug report is all that common. How
often do you change the header of a huge video file without transcoding the
contents as well? However, if it is common, can the delta-transfer algorithm be
tuned better for it by using smaller blocks near the start of the file?
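
One way to picture "smaller blocks near the start" is a block schedule whose
block length grows with the offset. This is a hypothetical sketch, not
something rsync does: the function name, the doubling rule and the `first` and
`cap` values are all made up here, and it ignores how the sender's
rolling-checksum search would cope with mixed block sizes.

```python
def block_schedule(file_size, first=1024, cap=1024 * 1024):
    """Yield (offset, length) pairs; block length doubles from `first` up to `cap`.

    Hypothetical illustration only: small blocks cover the start of the file
    (where a header edit would land), large blocks cover the rest.
    """
    offset, size = 0, first
    while offset < file_size:
        length = min(size, file_size - offset)  # last block may be short
        yield offset, length
        offset += length
        size = min(size * 2, cap)

# Example: show how a small file would be split.
for offset, length in block_schedule(10000, first=1024, cap=4096):
    print(offset, length)
```

A change in the first few kilobytes then costs one small block of literal
data, while the bulk of the file still needs only one checksum per large
block.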
(More generally, a hierarchical delta algorithm (checksums of blocks of
checksums of blocks, in a tree structure, though 2 or 3 levels may be plenty)
would solve this in a general way for a number of cases involving small
changes in very large files. If you concatenate all the files and metadata to
make a single structured data stream to be delta-transferred, it may also be a
good optimisation for data sets consisting of large numbers of files with only
a few changed. The logical extreme is transferring a single checksum, which is
enough to compare the whole data set in just a few bytes, followed by a
top-down breadth-first traversal of the checksum tree. This is what I am
attempting in a project of mine.)
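
A minimal sketch of the checksum-tree idea, under simplifying assumptions: both
sides hold data of the same length, blocks are compared at fixed offsets (so
inserted or shifted data is not handled, unlike rsync's rolling checksum), and
SHA-1 with a fanout of 4 is an arbitrary choice. The function names are
illustrative and do not come from any existing project.

```python
import hashlib

def build_tree(data, block_size=16, fanout=4):
    """Return a list of hash levels: levels[0] are leaf hashes, levels[-1] is [root]."""
    level = [hashlib.sha1(data[i:i + block_size]).digest()
             for i in range(0, len(data), block_size)]
    if not level:
        level = [hashlib.sha1(b"").digest()]  # empty input still gets a root
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of up to `fanout` concatenated child hashes.
        level = [hashlib.sha1(b"".join(level[i:i + fanout])).digest()
                 for i in range(0, len(level), fanout)]
        levels.append(level)
    return levels

def changed_blocks(old, new, fanout=4):
    """Compare two same-shaped trees top-down, one level at a time.

    Returns (indices of differing leaf blocks, number of hashes compared).
    """
    compared = 1
    if old[-1][0] == new[-1][0]:
        return [], compared  # a single checksum settles the whole data set
    suspects = [0]
    for depth in range(len(old) - 2, -1, -1):
        next_suspects = []
        for parent in suspects:
            start = parent * fanout
            for i in range(start, min(start + fanout, len(old[depth]))):
                compared += 1
                if old[depth][i] != new[depth][i]:
                    next_suspects.append(i)
        suspects = next_suspects
    return suspects, compared

original = bytes(1024)
modified = bytearray(original)
modified[10] = 1  # one small change near the start
blocks, compared = changed_blocks(build_tree(original), build_tree(bytes(modified)))
print(blocks, compared)
```

In this toy case the single changed block is found after comparing 13 hashes,
where a flat list of block checksums would need all 64.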