[Bug 10074] New: rsync reorders --from-files alphabetically

samba-bugs at samba.org samba-bugs at samba.org
Thu Aug 8 13:57:59 MDT 2013


https://bugzilla.samba.org/show_bug.cgi?id=10074

           Summary: rsync reorders --from-files alphabetically
           Product: rsync
           Version: 3.0.6
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: core
        AssignedTo: wayned at samba.org
        ReportedBy: nicholas.man at gmail.com
         QAContact: rsync-qa at samba.org


This is a request to add a flag that stops rsync from automatically sorting the
--files-from list alphabetically, motivated by its use with LTFS media.

LTFS runs on tape, a linear medium with exceedingly slow seek times compared
with its raw read/write speed. To read efficiently, files must be read off the
tape in the order they were written. Each file has an extended attribute
(ltfs.startblock) that returns its exact location (start block) on the tape, so
you can easily obtain a list of files in the optimal read order: sorted
ascending by start block.
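
A minimal sketch of building that ordered list in Python. The attribute name
"user.ltfs.startblock" (the "user." namespace prefix on Linux) and the decimal
text encoding of its value are assumptions, not something this report states;
the function names are hypothetical. os.getxattr is only available on Linux.

```python
import os
from typing import Callable, Iterable, List

# Assumption: on Linux the attribute lives in the "user." xattr namespace.
STARTBLOCK_ATTR = "user.ltfs.startblock"


def read_startblock(path: str) -> int:
    """Return the tape start block recorded in the file's extended attribute."""
    raw = os.getxattr(path, STARTBLOCK_ATTR)  # Linux-only call
    # Assumption: the value is a decimal number stored as text.
    return int(raw.decode().strip())


def sort_by_startblock(
    paths: Iterable[str],
    get_startblock: Callable[[str], int] = read_startblock,
) -> List[str]:
    """Return paths sorted ascending by start block (optimal read order)."""
    return sorted(paths, key=get_startblock)
```

Passing get_startblock as a parameter lets the ordering be tried without a
tape mounted, for example with a dictionary lookup in place of the xattr read.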

rsync's behavior of copying all files on and off in alphabetical order works
fine if all you ever do is one write, as the order is consistent. If you do
anything else to that volume, however (for example, using rsync -u to copy
additional files), each pass writes its files in alphabetical order again. You
end up with n separate a-z runs of writes distributed across 6 minutes of seek
time.

Trying to restore alphabetically after more than one copy results in one of the
least optimal restore paths imaginable. In my case, average read rates dropped
by 50% compared with copying the files one at a time using rsync, and I was
copying fairly large files. Smaller files, a wider distribution of files across
the tape, or both will decimate performance. You could easily create a file
structure for which rsync would blindly shoot back and forth across the entire
tape for every single file.

Basically: batch file copying with rsync, if it always restores alphabetically,
is useless for LTFS volumes.

While I can write a Python script to execute rsync on individual files, it
would be nice to be able to pass an argument to rsync (for example,
--maintain-file-order) that prevents it from taking my optimally ordered list
and butchering the LTFS restore with an alphabetical sort.
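
The per-file workaround could look roughly like the sketch below: one rsync
invocation per file, issued in the caller's order, so rsync never gets a chance
to sort anything. The function names, the choice of -a, and the example paths
are illustrative assumptions. The "/./" marker combined with --relative is
existing rsync behavior that keeps only the path after the marker at the
destination.

```python
import subprocess
from typing import Iterable, List


def build_rsync_commands(
    ordered_files: Iterable[str], src_root: str, dest: str
) -> List[List[str]]:
    """Build one rsync argv per file, preserving the given order."""
    root = src_root.rstrip("/")
    commands = []
    for rel in ordered_files:
        # "/./" tells --relative to recreate only the part after the marker.
        source = f"{root}/./{rel}"
        commands.append(["rsync", "-a", "--relative", source, dest])
    return commands


def restore_in_order(ordered_files: Iterable[str], src_root: str, dest: str) -> None:
    """Run the transfers one at a time; stop at the first failure."""
    for argv in build_rsync_commands(ordered_files, src_root, dest):
        subprocess.run(argv, check=True)
```

For example, restore_in_order(["z/late.bin", "a/early.bin"], "/mnt/ltfs",
"/restore/") would copy z/late.bin before a/early.bin. The cost is one rsync
start-up per file, which is the overhead a built-in flag would avoid.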

--Nicholas Andre

(Reposted from http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=640492)
