DO NOT REPLY [Bug 5482] apply the rsync comparison algorithm specially to .mov and .mp4 files

samba-bugs at samba.org samba-bugs at samba.org
Sat May 24 22:46:19 GMT 2008


https://bugzilla.samba.org/show_bug.cgi?id=5482


jamie at shareable.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamie at shareable.org




------- Comment #2 from jamie at shareable.org  2008-05-24 17:46 CST -------
Matt,

Although it's true that the delta-transfer algorithm detects the small changed
regions, and that block boundaries are not an issue, even the delta-transfer
algorithm is slow for _very_ large files with small changes.

Think about a 4GB media file with 10k of changes near the start.  In this
case, the delta-transfer algorithm transmits a lot of block checksum data just
to do the comparisons - enough to be substantially affected by network
bandwidth.  Alternatively, if the blocks are made large to reduce the number
of checksums, then transmitting even a single block of data is significant.
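
To put rough numbers on that trade-off, here is a back-of-envelope sketch in
Python.  It is not rsync's actual accounting: the function name is made up,
the figure of 20 bytes per block checksum (a 4-byte rolling sum plus a 16-byte
strong sum) is an assumption, and the changed region is assumed to be
block-aligned.

```python
def estimate_traffic(file_size, block_size, changed_bytes, sum_bytes=20):
    """Return (checksum_bytes, literal_bytes) for one delta transfer.

    sum_bytes is an assumed per-block checksum size; real rsync may send
    fewer bytes per block.  The changed region is assumed block-aligned.
    """
    blocks = -(-file_size // block_size)            # ceiling division
    changed_blocks = -(-changed_bytes // block_size)
    checksum_bytes = blocks * sum_bytes             # sent to drive the comparison
    literal_bytes = changed_blocks * block_size     # resent as whole blocks
    return checksum_bytes, literal_bytes


four_gib = 4 * 2**30
for block_size in (1024, 128 * 1024):
    sums, literal = estimate_traffic(four_gib, block_size, 10 * 1024)
    print(f"block={block_size:>7}  checksums={sums:>10}  literal={literal:>7}")
```

With 1 KiB blocks the checksums alone come to about 80 MiB for 10 KiB of
changed data; with 128 KiB blocks the checksums shrink to 640 KiB, but the
changed data costs a whole 128 KiB block.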

I doubt that the scenario described in this bug report is all that common.
How often do you change the header of a huge video file without transcoding
the contents as well?  However, if it is common, could the delta-transfer
algorithm be tuned for it by using smaller blocks near the start of the file?
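
As a sketch of what "smaller blocks near the start" could look like, the
Python below lays out blocks that start small and double in size up to a cap.
This is purely hypothetical: rsync has no such option that I know of, and the
function name, the doubling rule and the default sizes are all assumptions
chosen for illustration.

```python
def graded_blocks(file_size, first=1024, largest=128 * 1024):
    """Yield (offset, length) pairs covering the file.

    Hypothetical layout: block size starts at `first` and doubles after
    each block until it reaches `largest`, so the head of the file is
    covered by fine-grained blocks and the rest by coarse ones.
    """
    offset, size = 0, first
    while offset < file_size:
        length = min(size, file_size - offset)  # last block may be short
        yield offset, length
        offset += length
        size = min(size * 2, largest)


for offset, length in graded_blocks(10000, first=1000, largest=4000):
    print(offset, length)
```

Doubling after every block reaches the cap very quickly, so a real tuning
would probably want a slower growth rate; the point is only that the layout
has to be computable identically on both ends.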

(More generally, a hierarchical delta algorithm, using checksums of blocks of
checksums of blocks in a tree structure (2 or 3 levels may be plenty), would
solve this in a general way for a number of cases involving small changes in
very large files.  If you concatenate all the files and metadata to make a
single structured data stream to be delta-transferred, it may also be a good
optimisation for data sets consisting of large numbers of files with only a
few changed.  At the logical extreme, a single checksum of just a few bytes is
enough to compare the whole data set, followed by a top-down, breadth-first
traversal of the checksum tree where the checksums differ.  This is what I am
attempting in a project of mine.)
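
A minimal sketch of the checksum-tree idea in Python follows.  It is an
illustration only, not rsync's algorithm and not the code from the project
mentioned above: the function names, the use of SHA-1, the tiny block size and
the fanout of 2 are all assumptions.  It also compares blocks at fixed offsets
in two inputs of equal length, so unlike the rolling checksum it would not
cope with insertions that shift the data.

```python
import hashlib


def build_tree(data, block_size=4, fanout=2):
    """Return levels of digests: levels[0] are leaves, levels[-1] is [root]."""
    level = [hashlib.sha1(data[i:i + block_size]).digest()
             for i in range(0, len(data), block_size)]
    level = level or [hashlib.sha1(b"").digest()]   # empty input: one leaf
    levels = [level]
    while len(level) > 1:
        # Each parent is the checksum of its children's checksums.
        level = [hashlib.sha1(b"".join(level[i:i + fanout])).digest()
                 for i in range(0, len(level), fanout)]
        levels.append(level)
    return levels


def changed_blocks(old_levels, new_levels, fanout=2):
    """Top-down, breadth-first comparison of two trees of the same shape.

    Returns (indices of differing leaf blocks, number of digests compared).
    """
    assert [len(l) for l in old_levels] == [len(l) for l in new_levels]
    frontier = [0]          # start with the single root checksum
    compared = 0
    for depth in range(len(new_levels) - 1, -1, -1):
        old, new = old_levels[depth], new_levels[depth]
        compared += len(frontier)
        differing = [i for i in frontier if old[i] != new[i]]
        if depth == 0:
            return differing, compared
        # Descend only into the children of nodes whose checksums differ.
        below = len(new_levels[depth - 1])
        frontier = [c for i in differing
                    for c in range(i * fanout, min((i + 1) * fanout, below))]


old = bytes(64)
new = bytearray(old)
new[5] ^= 1                 # one changed byte, inside leaf block 1
print(changed_blocks(build_tree(old), build_tree(bytes(new))))
print(changed_blocks(build_tree(old), build_tree(old)))
```

Identical inputs are settled by comparing the root alone, and a single changed
block among 16 is found after comparing 9 digests instead of all 16 leaves;
the saving grows with the number of blocks.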

