rsync to iSCSI over WAN

Matthias Schniedermeyer ms at citd.de
Fri Jul 23 13:25:17 MDT 2010


On 23.07.2010 10:29, Tom Christensen wrote:
> 
> 
> 
> > Date: Fri, 23 Jul 2010 10:02:26 +0200
> > From: ms at citd.de
> > To: pavera at live.com
> > CC: rsync at lists.samba.org
> > Subject: Re: rsync to iSCSI over WAN
> > 
> > On 23.07.2010 00:30, Tom Christensen wrote:
> > > 
> > > I am running rsync in Cygwin on Windows.  I am attempting to back up a 
> > > somewhat large data store (750GB) to a remote site.  As it's Windows 
> > > and preserving permissions exactly is important, I have an iSCSI drive 
> > > mounted on the local system across a somewhat slow WAN link (i.e., it 
> > > would take about 3 months to copy the datastore over it).  
> > > Unfortunately, since this appears as a "local" copy to rsync, it 
> > > always copies whole files.  Even though it is a "local" copy, I want 
> > > to only send diffs, as we have large files that have small changes 
> > > daily.  Reading the man page and everything I can find on the net, I 
> > > don't see an option to force diffs only/rsync protocol.  Is this 
> > > possible?
> > 
> > You have an abstraction error here: an iSCSI device is just a bit-bucket 
> > like any locally connected HDD, in that all filesystem and data processing 
> > is done on the local side and raw block data is sent over the WAN. You 
> > would have to run the other side of rsync on the machine that provides 
> > the iSCSI device. For that to work the remote machine would have to 
> > mount the filesystem locally, which in most cases means you would have to 
> > unmount it from your Windows machine. This is because, except for 
> > cluster filesystems or a filesystem that is mounted read-only (on every 
> > machine!), a given filesystem can only be mounted on 1 machine at a time.
> 
> I am aware of that fact; it is precisely because the filesystem 
> is exposed directly to Windows that I need iSCSI.  Mounting an ext3 
> partition shared via smb/cifs (another option on this NAS device) does 
> not provide me with permissions fidelity (as ext3 does not support all 
> of the NTFS permissions).  I was under the impression that rsync would 
> calculate the diffs between the "local" copy and the "remote" (iSCSI) 
> copy, and then only "send" the diffs, i.e. only the diffs would be 
> written to disk (and therefore only the diffs would be sent via 
> block-io over the WAN to the iSCSI).  But that very well could be a 
> misunderstanding on my part of the way rsync functions at that low 
> level.  Maybe on the remote side it reads through the whole file, 
> writing it out to disk and inserting the diffs it receives?  In which 
> case no matter what the whole file would be sent over block-io?  This 
> functionality would also mean that even at the "network share" level 
> (smb/cifs) at the remote side, you would gain nothing from 
> --no-whole-file because the remote side would do the same thing, i.e. 
> write the whole file to the share anew, resulting in the same amount of 
> network traffic.
> 
> At any rate, --no-whole-file appeared to improve performance on the 
> backup last night: running the backup with --no-whole-file resulted in 
> about 75% less data copied (according to the report at the end) and 
> the backup ran in about half the time it was taking previously.  
> Granted, it could have just been a slow day in the office and maybe not 
> that many files were changed... a sample size of 1 is not 
> representative.

For calculating the difference rsync still has to read the complete 
files over iSCSI, but only has to write back the changes. Reads with 
long latency hurt much less than writes, because when you write (with a 
journaling filesystem) you have to wait until a change is actually 
written, which results in several round-trips over the WAN to complete a 
write operation, whereas a read needs only 1 round-trip to complete.

Otherwise you wouldn't get better performance: with "--whole-file" you 
just write all files without reading them, and with "--no-whole-file" 
you read the files and write back the changes. That results in more 
data being transferred in total in the --no-whole-file case, but with a 
much smaller latency/round-trip problem. 
IOW: Reads are much cheaper than writes.
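
To put rough numbers on that argument, here is a small back-of-the-envelope
model in Python. It is only a sketch and everything in it is an assumption of
mine, not a measurement: the link speed, the round-trip time, the block size,
the "3 round-trips per written block", and the simplification that every block
is transferred strictly one after another. It also assumes that only the
changed blocks are written back (as with rsync's --inplace option); without
that, rsync normally builds a new temporary copy of the file. The names Link,
io_time, whole_file_cost and delta_cost are made up for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical WAN link between the Windows box and the iSCSI target."""
    rtt_s: float        # round-trip time in seconds
    bytes_per_s: float  # usable bandwidth in bytes per second


def io_time(nbytes: int, block: int, trips_per_block: int, link: Link) -> float:
    """Seconds to move nbytes in block-sized requests, strictly one at a time.

    Cost = time on the wire + latency paid for every round-trip of every block.
    """
    blocks = math.ceil(nbytes / block)
    return nbytes / link.bytes_per_s + blocks * trips_per_block * link.rtt_s


def whole_file_cost(size: int, block: int, write_trips: int, link: Link):
    """--whole-file: write everything, read nothing. Returns (bytes, seconds)."""
    return size, io_time(size, block, write_trips, link)


def delta_cost(size: int, changed: int, block: int, write_trips: int, link: Link):
    """--no-whole-file: read the whole file (1 round-trip per block),
    then write back only the changed bytes. Returns (bytes, seconds)."""
    read = io_time(size, block, 1, link)
    write = io_time(changed, block, write_trips, link)
    return size + changed, read + write


if __name__ == "__main__":
    MIB = 1024 * 1024
    link = Link(rtt_s=0.05, bytes_per_s=1 * MIB)  # assumed: 50 ms RTT, 1 MiB/s
    size, changed, block = 100 * MIB, 1 * MIB, 64 * 1024

    for name, (nbytes, secs) in {
        "whole-file": whole_file_cost(size, block, 3, link),
        "no-whole-file": delta_cost(size, changed, block, 3, link),
    }.items():
        print(f"{name:14s} {nbytes / MIB:6.1f} MiB moved, {secs:6.1f} s")
```

With these made-up numbers the --no-whole-file case moves slightly more data
over the wire but finishes sooner, because the expensive multi-round-trip
writes are limited to the changed blocks. Change the assumptions (for example
a low-latency link, or mostly-changed files) and the advantage shrinks or
disappears.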


Have you tried "--fake-super" for storing the "extra" privileges on the 
ext3 filesystem? At least the man page reads like it is what you want, 
although I personally have not needed it.




See you

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


