Server-side copy with sendfile system call

Fri May 23 19:01:03 MDT 2014

On Fri, May 23, 2014 at 5:57 AM, David Disseldorp <ddiss at suse.de> wrote:
> On Fri, 23 May 2014 14:29:00 +0200, David Disseldorp wrote:
>
>> On Fri, 23 May 2014 16:46:36 +0800, Teng-Feng Yang wrote:
>>
>> > I made the function fall back to the read/write path when src=dest,
>> > and it now passes the smb2.ioctl smbtorture test.
>>
>> Great!
>>
>> > However, I can still reproduce the performance issue mentioned before
>> > with the following steps.
>> >
>> > 1. Create a 10G test file by dd:
>> > dd if=/dev/zero of=zero_10G.img bs=1G count=10
>>
>> ...
>>
>> > 3. Copy this file again into the same directory from a remote Windows
>> > server. From the second server-side copy onward, the throughput
>> > bounces up and down between 120MB/s and 0MB/s repeatedly, as mentioned
>> > in the previous message. The copy_chunk_slow.log was captured during
>> > this copy.
>>
>> The wire traffic looks okay, but the FSCTL_SRV_COPYCHUNK round-trip time
>> is huge for some of the requests. I've attached a copy-chunk IOPS graph
>> that shows the stalls captured in copy_chunk_slow.log. Each of the flat
>> ~0 IOPS sections is your client waiting for the server to respond to
>> a FSCTL_SRV_COPYCHUNK request.
>>
>> Looking at the first slow IO (frame 209), the round-trip time for the
>> request is 6.64 seconds! It requests a transfer of 16MB from the src
>> to dest in 1MB chunks, and aside from the offsets it doesn't differ
>> from the request immediately prior (frame 206) that completed in 0.13
>> seconds.
>
> Actually, looking again at the graph, the drop in IOPS is suspiciously
> consistent, occurring every 10-12 seconds. There's a good chance that
> it coincides with a flush of the page-cache out to disk, in which case
> you could try playing with your pdflush or IO-scheduler settings.
> Also, Jeremy was looking at converting the code-path to use asynchronous
> IO at one point. This would likely offer a significant increase in
> performance for this workload, as it'd allow both client and server to
> keep the IO pipeline full.

I'm not sure I understand your reasoning for this last claim.

Surely, even with AIO, once the page cache is full, you will stall
until some pages become free?
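
To make that concrete, here is a toy model in Python. It is only a
sketch under my own assumptions, not Samba or kernel code: the function
name simulate, the one-second tick, and the numbers (120 MB/s incoming,
60 MB/s disk, 600 MB dirty limit) are all hypothetical. The model has no
notion of how requests are submitted, synchronous or asynchronous; it
only tracks how much dirty data the cache holds and how fast the disk
drains it.

```python
def simulate(seconds, write_mb_s, disk_mb_s, dirty_limit_mb):
    """Return MB accepted into the (toy) page cache in each one-second tick."""
    dirty = 0.0  # MB of dirty pages currently held in the cache
    accepted_per_tick = []
    for _ in range(seconds):
        # Writeback drains dirty pages at the disk rate.
        dirty = max(0.0, dirty - disk_mb_s)
        # The writer may only dirty as many pages as there is room for.
        accepted = min(write_mb_s, dirty_limit_mb - dirty)
        dirty += accepted
        accepted_per_tick.append(accepted)
    return accepted_per_tick


if __name__ == "__main__":
    # Hypothetical numbers: 120 MB/s incoming, 60 MB/s disk, 600 MB limit.
    for tick, mb in enumerate(simulate(15, 120, 60, 600), start=1):
        print(f"t={tick:2d}s accepted {mb:5.1f} MB")
```

With these numbers the writer runs at the full 120 MB/s for the first
nine ticks while the cache fills, and from the tenth tick on it is held
to the 60 MB/s the disk can drain. A real kernel flushes in bursts
rather than this smoothly, but the steady-state bound is the same
however the writes are issued.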

-- 
Regards,
Richard Sharpe
(What can dispel sorrow? Only Du Kang. --Cao Cao)