Direct I/O support (patches included)

Linda Walsh rsync at tlinx.org
Mon Feb 18 05:00:08 MST 2013


Hi Dag, I really appreciate your working on this,
but it is really annoyingly hard and tedious.

_I_ am not certain about all the requirements of Direct I/O;
i.e., I would have to research it (Google/kernel source... etc.).

It may be different on different platforms. I _vaguely_ remember
'talking' (email) with someone working on 'dd', and they were telling
me how they had to compensate for a change in the kernel, which used
to handle the buffering of partial sector reads/writes for those
who did direct I/O on a device.  Then they decided that much hand-holding
was wrong because, IMO, they basically wanted people to use the buffer
cache -- since for most people, and most things, it's a good thing.

But when you move gigabytes to terabytes of data, it's not always so great.

I'm speculating on what they do with the last sector (or any short read).

It might be that you attempt a full 512- (or 4096-) byte read, and the kernel
returns a status telling you it only read 'x' bytes (the actual size of the
last partial sector).  But one thing that has to change in the code is to
read things in minimum sizes of 4096.  I say 4096 because using 512 would
shut out the 4096-byte-sector devices and require another update later,
so we might as well use the larger alignment to start with.
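
To make that concrete, here is a rough sketch of the kind of read I mean --
not rsync's actual code; the path, sizes, and names are just for illustration:

    /* Sketch: an O_DIRECT read with a 4096-aligned buffer.  O_DIRECT
     * requires the buffer (and usually the offset/length) to be aligned;
     * a read at end-of-file simply returns however many bytes were
     * actually there. */
    #define _GNU_SOURCE             /* for O_DIRECT on Linux */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/testfile", O_RDONLY | O_DIRECT); /* made-up path */
        if (fd < 0) { perror("open"); return 1; }

        void *buf;
        if (posix_memalign(&buf, 4096, 4096) != 0) { close(fd); return 1; }

        /* Always ask for a full 4096 bytes; on the last partial sector
         * the kernel just hands back a short count in 'n'. */
        ssize_t n = read(fd, buf, 4096);
        if (n < 0)
            perror("read");
        else
            printf("read %zd bytes\n", n);

        free(buf);
        close(fd);
        return 0;
    }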

And direct I/O to a pipe?  I don't know what direct I/O even means there,
or what sizes it needs.  The point of failure in my message said
'unbuffered write of 4 bytes to *socket* failed'.  I didn't know sockets
had a minimum I/O size, but maybe it was an error from the other end of
the *!socket!* (I did a disk-to-disk copy, and it's going through a
socket?!?), where it actually read/wrote a disk device.

But the message looks a bit odd.

Sorry about my example below -- I'd already replaced the --direct-io in
the shell script with --drop-cache, which I'd forgotten I already had
in the script (my memory for these things is completely gone!)...

But it really did have --direct-io in the command before I changed it
to --drop-cache.

The version I applied your patch to was from the openSUSE 12.1 source
RPMs.  Drop-cache is one of the patches they include; I'd forgotten
about it... There's a bunch of them.  I don't know if the patches ship
with the original source to be used as wanted; they looked like
something separate.

If you want to grab their source RPM,
http://download.opensuse.org/source/distribution/12.1/repo/oss/suse/src/rsync-3.0.9-9.1.2.src.rpm
is the one I looked at and applied your patch to...

Anyway, the first thing I'd want to find out is: why is it writing to a socket
for a local file copy?

It's going to be hard for direct I/O to make a difference (if it is workable
at all).  The fact that rsync moves a 'window' over the source, emulating a
memory-mapped file, isn't really helpful for lining memory up with sectors.
But at minimum, we need a read size of 4096 (I'm pretty sure we have to do
that even on short files, and the kernel will just tell us we got less).
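
I.e., something like this bit of arithmetic (the variable name is made up,
just to show the rounding):

    /* Round the wanted length up to the next multiple of 4096 so the
     * O_DIRECT request is always block-sized; the kernel's return
     * value then tells us how many bytes we really got. */
    size_t aligned_len = (wanted_len + 4096 - 1) & ~(size_t)(4096 - 1);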

The 2nd thing we need to make sure of is a source of memory-aligned buffers.

I'm about 90% certain that it would have to be at least 64-byte aligned for
a Core 2 type chip, as I believe that to be the Core 2's cache-line size.

But the safest approach (though it might waste some memory) would be to make
sure the buffer is sector-aligned (in this case, 4096).

Say I want a 4096-byte buffer that is 4096-aligned.  I allocate 8K
(you could probably get away with 8K minus 32 or 64 bytes, but I'd be
conservative and just use 8K allocations).

Say that goes in void *buff = malloc(8*1024);.  To align that, with 'B'
as the buffer size (4096):

    void *aligned = (void *)(((uintptr_t)buff + B - 1) & ~(uintptr_t)(B - 1));

(if I remember my C... embarrassment, it's been so long since I've done C).
Basically, you add B-1 to the address, then use a bitwise AND (&) with the
one's complement of 4096-1.  4096-1 has the low 12 bits set; its complement
is a mask that clears those bits, rounding the summed address down to a
4096-byte boundary.
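
For completeness, here is a self-contained sketch of that manual trick next
to posix_memalign(), which is the standard POSIX way to get an aligned block
(names and sizes here are illustrative):

    #define _POSIX_C_SOURCE 200112L /* for posix_memalign() */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define B 4096                  /* alignment/sector size we want */

    int main(void)
    {
        /* Manual trick: over-allocate (8K to be safe) and round the
         * address up to the next multiple of B. */
        void *raw = malloc(2 * B);
        if (!raw) return 1;
        void *aligned = (void *)(((uintptr_t)raw + B - 1) & ~(uintptr_t)(B - 1));

        /* Standard way: posix_memalign() hands back an aligned block
         * that can be passed straight to free(). */
        void *buf = NULL;
        if (posix_memalign(&buf, B, B) != 0) { free(raw); return 1; }

        printf("manual: %p  posix_memalign: %p\n", aligned, buf);
        free(raw);                  /* free the original pointer, not 'aligned' */
        free(buf);
        return 0;
    }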

Is anything I'm saying making any sense or am I sounding like a complete
nutcase?  ;-)...

The hard part is integrating the theory of direct-io into the existing
rolling-window that rsync uses.

I don't know how much benefit it will be if all the I/O has to go through
a socket first, nor do I know what the alignment requirements are for
a socket (might be the native word size).  I'm on a 64-bit machine, so
8 bytes might be the minimum -- and that's why 4 would fail...

just a guess though...

Sorry I don't have time to rewrite rsync right now... I'd like to...
I use it and it is slow!

But it DOES do the job.

I use it to create snapshot volumes every night, and it easily takes 90-150 minutes.

Linda

>> cmd:
>> cd /Media && {
>>      sudo rsync --archive --hard-links --acls --xattrs --whole-file \
>>                 --drop-cache --one-file-system --drop-cache \
>>                 --8-bit-output --human-readable --progress \
>>                 --inplace --del . /backups/Media/.
>> }
>> ----
>>
>> first it deleted a bunch of stuff (~30 files)
>> deleting Library /
>> rsync: delete_file: unlink(MediaBack) failed: Operation not permitted (1)
>> .recycle/
>> .recycle/SDT27D6.tmp
>>           0 100%    0.00kB/s    0:00:00 (xfer#1, to-check=1006/1019)
>> .recycle/Library/
>> .recycle/Library/[Commie] Kotoura - 04 [FC6C5497].mkv
>>      32.77K   0%    0.00kB/s    0:00:00
>> rsync: writefd_unbuffered failed to write 4 bytes to socket 
>> [sender]:Broken pipe (32)
>>
>> Then a bunch more deletes...(~17)
>> Then:
>> rsync: write failed on "/backups/Media/./.recycle/Library/[Commie] 
>> Kotoura - 04 [FC6C5497].mkv": Invalid argument (22)
>> rsync: connection unexpectedly closed (57 bytes received so far) [sender]
>> rsync error: error in rsync protocol data stream (code 12) at 
>> io.c(605) [sender=3.0.9]
>>
>>
>> ???
> 
> Since you didn't use --direct-io, my patch was not doing anything. Since 
> you were using --drop-cache (twice!) this is not a vanilla rsync either.
> 
> What was it you were trying to do?
> 

