[LSF/MM TOPIC] Enhancing Copy Tools for Linux FS

Steve French smfrench at gmail.com
Mon Feb 11 17:43:58 UTC 2019


On Mon, Feb 11, 2019 at 2:32 AM Andreas Dilger <adilger at dilger.ca> wrote:
>
> On Feb 8, 2019, at 4:56 PM, Steve French <smfrench at gmail.com> wrote:
> >
> > On Fri, Feb 8, 2019 at 5:03 PM Steve French <smfrench at gmail.com> wrote:
> >>
> >> On Fri, Feb 8, 2019 at 4:37 PM Andreas Dilger <adilger at dilger.ca> wrote:
> >>>
> >>> On Feb 8, 2019, at 8:19 AM, Steve French <smfrench at gmail.com> wrote:
<snip>
> > I did some experiments changing the block size returned from 1K to 64K to 1MB
> > and see no difference in the copy size used by cp (it was always 128K in all
> > the cases when caching is disabled)

I figured out the problem - I read your note as meaning s_blocksize (which is
not st_blksize), i.e. the block size in the superblock, not the one on the file.

Changing st_blksize (stat->blksize) to 4MB did lead to better performance
(and large I/O matching the block size) for uncached cp.
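
For anyone wanting to check what a given mount reports, here is a minimal
sketch in Python. The function names are made up for illustration. It reads
the per-file st_blksize (the value that mattered here) and, for contrast, the
filesystem-wide f_bsize from statvfs(), which may or may not mirror the
kernel's s_blocksize depending on the filesystem.

```python
import os


def preferred_io_size(path):
    """Return st_blksize, the per-file preferred I/O size reported by stat()."""
    return os.stat(path).st_blksize


def fs_block_size(path):
    """Return f_bsize from statvfs(): a filesystem-wide value, not per-file.

    Assumption: this is only a rough userspace contrast to s_blocksize;
    a filesystem is free to report something different here.
    """
    return os.statvfs(path).f_bsize


if __name__ == "__main__":
    # "." is just an example path; point it at a file on the mount under test.
    print("st_blksize:", preferred_io_size("."))
    print("f_bsize:   ", fs_block_size("."))
```

Running it against a file on the mount under test and one on a local
filesystem should make it obvious which of the two values changed.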


> Strange.  I just re-tested this on Lustre, in case something had changed in
> GNU fileutils that I didn't notice, and it worked fine for me, using both
> "cp --version = 8.4" on RHEL and "cp --version = 8.26" on Ubuntu:
>
> $ dd if=/dev/urandom of=/tmp/foo bs=1M count=12
> $ strace -v cp /tmp/foo /testfs/tmp
> :
> open("/tmp/foo", O_RDONLY)              = 3
> fstat(3, {... st_blksize=4096, st_blocks=24576, st_size=12582912, ...}) = 0
> open("/testfs/tmp/foo", O_WRONLY|O_CREAT|O_EXCL, 0664) = 4
> fstat(4, { ... st_blksize=4194304, st_blocks=0, st_size=0, ...}) = 0
> read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
> write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
> :
>
> Note the "st_blksize=4194304" for the target file returned by Lustre matches
> the read and write buffer size used by "cp".  The same is true if Lustre is
> the source file and not the target, so it probably picks the maximum of both:
>
> open("/testfs/tmp/foo", O_RDONLY)     = 3
> fstat(3, {... st_blksize=4194304, st_blocks=24576, st_size=12582912 ...}) = 0
> open("/tmp/bar", O_WRONLY|O_TRUNC)      = 4
> fstat(4, {... st_blksize=4096, st_blocks=0, st_size=0 ...}) = 0
> read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
> write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 4194304) = 4194304
> :
>
> Running the same command with /tmp as the target uses a smaller buffer size
> matching the "st_blocks=32768" and correspondingly more read/write calls:
>
> $ strace -v cp /tmp/foo /tmp/baz
> :
> open("/tmp/baz", O_WRONLY|O_CREAT|O_EXCL, 0664) = 4
> fstat(4, {... st_blksize=4096, st_blocks=0, st_size=0, ...}) = 0
> read(3, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 32768) = 32768
> write(4, "h\230#`\2\223\273\3423W\24\222:\2113w\327"..., 32768) = 32768
> :
>
> In this case, cp probably has some minimum buffer size it uses to avoid the
> poor performance of using 4KB blocks.

Yes - although the code is a little hard to follow, the minimum buffer size
looks like 128K in my system's version of cp (Ubuntu).
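
The behaviour seen in the traces can be approximated as "take the larger
st_blksize of source and target, but never go below some floor". Below is a
rough Python sketch of that idea. It is not the actual coreutils logic; the
function names are hypothetical and the 128K default floor is simply the value
I observed, which evidently differs between cp versions (the older cp in the
trace above used 32768).

```python
import os

# Assumed floor: 128K as observed with one Ubuntu cp; other versions differ.
MIN_BUF = 128 * 1024


def choose_copy_buffer_size(src_blksize, dst_blksize, floor=MIN_BUF):
    """Pick the largest of the two st_blksize values and the floor."""
    return max(src_blksize, dst_blksize, floor)


def copy_file(src_path, dst_path, floor=MIN_BUF):
    """Copy src to dst using a buffer sized from fstat(); return that size."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = choose_copy_buffer_size(
            os.fstat(src.fileno()).st_blksize,
            os.fstat(dst.fileno()).st_blksize,
            floor,
        )
        while True:
            chunk = src.read(size)
            if not chunk:  # empty read means end of file
                break
            dst.write(chunk)
    return size
```

With a 4096-byte source and a 4194304-byte target st_blksize this picks 4MB
reads and writes, and with 4096 on both sides it falls back to the floor.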


-- 
Thanks,

Steve



More information about the samba-technical mailing list