User controlled i/o block size?

Greg Freemyer greg.freemyer at gmail.com
Mon Apr 11 23:33:09 UTC 2016


I'm just doing a local copy:

rsync -avp --progress <source_dir> <dest_dir>

The source and dest are on different spindles.

Some of my copies are a TB or more (I just started one that is 1.5 TB).

It is my assumption (possibly faulty) that rsync is more robust for
handling any aborted copies that have to get restarted after the copy
failed, thus my preference for rsync.

Here are three performance numbers, all with the exact same drives.
They are USB 3 drives, and I'm moving them between a Windows computer
and a Linux computer.

- Robocopy on a beefy Windows box - 105 MB/sec
- rsync on the Windows box - 70 MB/sec
- rsync on an old linux laptop - 90 MB/sec

It seems to me rsync could run faster on both boxes, but 70 MB/sec is
particularly bad.
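
To put those rates in perspective, here is a rough calculation of how
long the 1.5 TB copy would take at each measured speed. It assumes
decimal units (1 TB = 1,000,000 MB) and a sustained rate for the whole
copy; both are assumptions, and the function name is only for
illustration.

```python
def copy_hours(size_tb, rate_mb_per_s):
    """Hours needed to copy size_tb terabytes at a sustained MB/sec rate.

    Assumes decimal units (1 TB = 1,000,000 MB), which is a guess.
    """
    seconds = size_tb * 1_000_000 / rate_mb_per_s
    return seconds / 3600


# Rates measured above, in MB/sec.
rates = {
    "Robocopy, Windows": 105,
    "rsync, Windows": 70,
    "rsync, Linux laptop": 90,
}

for label, rate in rates.items():
    print(f"{label}: {copy_hours(1.5, rate):.1f} hours")
```

Under those assumptions the difference between 70 MB/sec and 105 MB/sec
is roughly two hours on a single 1.5 TB copy.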

> Anyways, smaller reads and writes are usually better handled by the
> OS's caches than really big ones.

Exactly.  Watching resource manager in Windows made me think rsync was
reading in the full 1.5 GB file before writing anything.  Maybe it is
just some weird Windows kernel behavior?
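
A simple model shows why that pattern would hurt. If the copy strictly
alternates between a read phase and a write phase, the two times add up
instead of overlapping. The sketch below uses a hypothetical 100 MB/sec
for both drives, which is an assumption rather than a measurement, and
the function names are made up for illustration.

```python
def serial_rate(read_mb_s, write_mb_s):
    """Effective MB/sec when data is read, then written, with no overlap."""
    # Time per MB is the read time plus the write time.
    return 1.0 / (1.0 / read_mb_s + 1.0 / write_mb_s)


def overlapped_rate(read_mb_s, write_mb_s):
    """Effective MB/sec when reads and writes fully overlap."""
    # The slower of the two drives sets the pace.
    return min(read_mb_s, write_mb_s)


# Hypothetical example: both drives sustain 100 MB/sec.
print(serial_rate(100, 100))      # about half the drive speed
print(overlapped_rate(100, 100))  # the full drive speed
```

With equal drive speeds the strictly serial pattern lands at half the
overlapped rate, which matches the 50% figure in the original question
quoted below.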

===
As a test, in Linux I started up two rsyncs running in parallel.

Different source media, but the same destination (It's a faster drive
than the source media).

I got 120 MB/sec write speeds to the destination in that mode.  Both
of the source drives slowed down to 60 MB/sec to compensate.

I was very pleased with the parallel rsync test.
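
For anyone wanting to repeat the test, here is a minimal sketch of
starting several copies side by side from Python. The helper names are
made up for illustration, and it simply reuses the rsync options from
the command at the top of this message.

```python
import subprocess


def rsync_command(source_dir, dest_dir):
    """Build the same rsync invocation used earlier in this message."""
    return ["rsync", "-avp", "--progress", source_dir, dest_dir]


def run_parallel(commands):
    """Start every command at once, wait for all, return the exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]
```

Usage would be something like run_parallel([rsync_command(src1, dest),
rsync_command(src2, dest)]), where src1, src2 and dest are placeholder
paths for two source drives and one destination drive.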

Greg
--
Greg Freemyer
www.IntelligentAvatar.net


On Mon, Apr 11, 2016 at 7:05 PM, Kevin Korb <kmk at sanitarium.net> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> You didn't say if you were networking or what features of rsync you
> are using but if you aren't networking and aren't doing anything fancy
> you are probably better off with cp -au which is essentially the same
> as rsync -au except faster.
>
> Anyways, smaller reads and writes are usually better handled by the
> OS's caches than really big ones.
>
> On 04/11/2016 07:00 PM, Greg Freemyer wrote:
>> All,
>>
>> One big thing I failed to mention is I was running rsync inside a
>> cygwin windows 8.1 setup.
>>
>> I moved it to a linux box and the behavior is much better.  I get
>> a nice smooth 85-90 MB/sec.  That might be the max speed of the
>> source drive.
>>
>> I'd still like a way to improve rsync's performance in cygwin, but
>> I can understand it is a low priority.
>>
>> Thanks
>> Greg
>> --
>> Greg Freemyer
>> www.IntelligentAvatar.net
>>
>>
>> On Mon, Apr 11, 2016 at 4:08 PM, Greg Freemyer
>> <greg.freemyer at gmail.com> wrote:
>>> I hope this isn't a FAQ.
>>>
>>> Per the man page I see ways to control the blocksize for hash
>>> comparison reasons, but no way to control it for i/o performance
>>> reasons.
>>>
>>> I'm using rsync to copy folder trees full of large files and I'd
>>> like to have control of how much data is read / written at a
>>> time.  Maybe read 10 MB, write 10 MB, etc.
>>>
>>> Is there an existing way to do that?
>>>
>>> == details ==
>>>
>>> When copying a bunch of 1.5 GB files with rsync, I'm only seeing
>>> 50% of the throughput I hope to see.
>>>
>>> I haven't looked at the code, or even run strace, but it seems
>>> like the code is doing something like:
>>>
>>> while (files) {
>>>     read 1.5 GB file to ram
>>>     write 1.5 GB file from ram
>>>     fsync()  ensure 1.5 GB file is on disk
>>> } endwhile
>>>
>>> I say that because I see several seconds of high-speed reading,
>>> then no reads.
>>>
>>> When the reads stop, I see writes kick in, then they stop and
>>> reads start up again.
>>>
>>> The end result is I'm only using 50% of the available bandwidth.
>>>
>>> Note that I'm copying my source folder tree to a newly created
>>> folder tree, so there is no need to read the destination.
>>> My ideal would be something like:
>>>
>>> while (files) {
>>>     while (data_in_file) {
>>>         read user_defined_blocksize to ram from file
>>>         write user_defined_blocksize from ram to file
>>>     }
>>>     fsync()  ensure 1.5 GB file is on disk
>>> } endwhile
>>>
>>> Thanks
>>> Greg
>>> --
>>> Greg Freemyer
>>> www.IntelligentAvatar.net
>>
>
> - --
> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
>         Kevin Korb                      Phone:    (407) 252-6853
>         Systems Administrator           Internet:
>         FutureQuest, Inc.               Kevin at FutureQuest.net  (work)
>         Orlando, Florida                kmk at sanitarium.net (personal)
>         Web page:                       http://www.sanitarium.net/
>         PGP public key available on web site.
> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2
>
> iEYEARECAAYFAlcMLaIACgkQVKC1jlbQAQcRfACgkhNohkizfd1zm502bXjX0cN9
> BnwAn1sMsWRg3er1aiynU4koDEYEiI91
> =/1Va
> -----END PGP SIGNATURE-----
>
> --
> Please use reply-all for most replies to avoid omitting the mailing list.
> To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
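
The block-at-a-time loop proposed in the original question could look
roughly like the following Python sketch. It only illustrates the idea
and is not how rsync works internally. The function name is made up,
and the 10 MB default comes from the "read 10 MB, write 10 MB" example
in the question.

```python
import os


def copy_in_blocks(src_path, dst_path, block_size=10 * 1024 * 1024):
    """Copy one file, reading and writing block_size bytes at a time."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # an empty read means end of file
                break
            dst.write(block)
            copied += len(block)
        dst.flush()             # push Python's buffer to the OS
        os.fsync(dst.fileno())  # ask the OS to put the file on disk
    return copied
```

Only one block is held in memory at a time, and the single fsync at the
end matches the per-file fsync in the pseudocode.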


