[ccache] [RFC PATCH] Use fast (copy-on-write) copies on btrfs

Egon Alter egon.alter at gmx.net
Wed Sep 17 05:09:11 MDT 2014


Hi Tobias,

thanks for the patch!

On Tuesday, 16 September 2014, 22:53:28, Tobias Geerinckx-Rice wrote:
> > this is a feature request. I think it would make sense to add CoW
> > (Copy-on-Write) support for ccache (instead of using dangerous
> > hard-links)
> 
> I think so too.
> 
> > for filesystems which support it (zfs, btrfs, more to come?).
> 
> I've hacked up a simple patch to do this unconditionally on btrfs on
> Linux, silently falling back to a regular copy if the ioctl fails.
> The implementation is btrfs-specific; ZFS doesn't seem to support
> file-level CoW at the moment and there is no cross-fs kernel
> interface (yet).
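
To make the clone-then-fall-back idea above concrete, here is a minimal
standalone sketch outside of ccache. The helper names (try_clone,
copy_bytes, clone_or_copy) are made up for illustration and are not
ccache functions; the ioctl request number is the one hard-coded in the
patch below, and everything else is an assumption about how one might
wire it up, not the patch's actual code path.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

#ifdef __linux__
/* Assumption: same request number the patch hard-codes for btrfs. */
#define SKETCH_IOC_CLONE _IOW(0x94, 9, int)
#endif

/* Try a CoW clone of src_fd into dest_fd. Returns 0 on success, -1 otherwise. */
static int
try_clone(int src_fd, int dest_fd)
{
#ifdef __linux__
	return ioctl(dest_fd, SKETCH_IOC_CLONE, src_fd) == 0 ? 0 : -1;
#else
	(void) src_fd;
	(void) dest_fd;
	return -1;
#endif
}

/* Plain read/write copy, used when cloning is not possible. */
static int
copy_bytes(int src_fd, int dest_fd)
{
	char buf[65536];
	ssize_t n;

	while ((n = read(src_fd, buf, sizeof(buf))) != 0) {
		char *p = buf;
		if (n < 0) {
			if (errno == EINTR) continue;
			return -1;
		}
		while (n > 0) {
			ssize_t w = write(dest_fd, p, (size_t) n);
			if (w < 0) {
				if (errno == EINTR) continue;
				return -1;
			}
			p += w;
			n -= w;
		}
	}
	return 0;
}

/* Returns 1 if dest was cloned, 0 if it was copied, -1 on error. */
int
clone_or_copy(const char *src, const char *dest)
{
	int result = 1;
	int in, out;

	in = open(src, O_RDONLY);
	if (in < 0) return -1;
	out = open(dest, O_WRONLY | O_CREAT | O_TRUNC, 0666);
	if (out < 0) {
		close(in);
		return -1;
	}
	if (try_clone(in, out) != 0) {
		/* Clone unsupported or failed: silently fall back. */
		result = copy_bytes(in, out) == 0 ? 0 : -1;
	}
	close(in);
	if (close(out) != 0) result = -1;
	return result;
}
```

A caller would use it as clone_or_copy("cached.o", "out.o") and could log
the return value to see how often the clone path is actually taken.
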
> 
> Notes/questions:
> 
>   - It's been tested only by running a few 'make check's on both my
>     native btrfs system and an ext4 loopback image. Caveat so much
>     emptor.
> 
>   - I've done no performance testing at all. I hope to do so eventually,
>     but free time is scarce. Egon? Anyone? :-)

I did some measurements compiling the openSUSE kernel:

git clean, dropped caches

no ccache:         5391.75user 639.29system 31:26.27elapsed 319%CPU
with ccache:        887.53user 289.11system 15:07.42elapsed 129%CPU
with ccache + cow:  933.62user 318.38system 18:56.00elapsed 110%CPU

Surprisingly, ccache + CoW is slower than ccache without CoW: 18:56
versus 15:07 elapsed, roughly 25% more wall-clock time. I can only guess
that the compilation is I/O-limited in both cases and that using a
reflink instead of a copy doesn't reduce the I/O significantly for some
reason (many small files, maybe?). The filesystem was using compression
(lzo), by the way.

Egon

>   - I really don't think this ought to be run-time configurable. It's
>     a transparent optimisation that either always works, or always falls
>     back to a full copy.
>     On non-Linux platforms, this should compile away. On Linux systems,
>     I'm reasonably assuming the cost of a failed attempt to be less than
>     that of a configuration check, without the ugliness of the latter.
> 
>   - I'm not too fond of the resulting copy_file(), what with the FIXME
>     and the goto (even if goto seems to be used liberally elsewhere).
>     IMHO, copy_file tries to be too transparent, leading to convoluted
>     code for little benefit.
>     I'd like to split it up into separate copy_{to,from}_cache that only
>     do what is needed, but I'm too afraid of having missed something.
> 
>   - So please: tell me why I'm stupid and wrong and save me some work!
> 
> Regards,
> 
> T G-R
> 
> From a1e00d82ab1eae1bc39122482b35954bf2304af0 Mon Sep 17 00:00:00 2001
> From: Tobias Geerinckx-Rice <tobias.geerinckx.rice at gmail.com>
> Date: Tue, 16 Sep 2014 20:23:14 +0200
> Subject: [RFC PATCH] Use fast (copy-on-write) copies on btrfs
> 
> Modern file systems like btrfs and ZFS support metadata-only file
> copies. These are similar to hard links, but without the risks:
> if either the source or destination files are later modified, the
> contents of the other file(s) do not change.
> 
> (Currently uses hard-coded #ifdef/#defines which should go away
> as soon as all this is properly supported in the kernel.)
> ---
>  util.c | 44 +++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 37 insertions(+), 7 deletions(-)
> 
> diff --git a/util.c b/util.c
> index 156c0be..b5e5ae8 100644
> --- a/util.c
> +++ b/util.c
> @@ -19,6 +19,7 @@
> 
>  #include "ccache.h"
> 
> +#include <sys/ioctl.h>
>  #include <zlib.h>
> 
>  #ifdef HAVE_PWD_H
> @@ -224,6 +225,26 @@ mkstemp(char *template)
>  #endif
> 
>  /*
> + * Copy the contents of src_fd to dest_fd in one efficient operation.
> + * Currently only supports btrfs on Linux, returns -1 on failure.
> + */
> +int
> +clone_file(int src_fd, int dest_fd)
> +{
> +#ifdef __linux__
> +#undef BTRFS_IOCTL_MAGIC
> +#define BTRFS_IOCTL_MAGIC 0x94
> +#undef BTRFS_IOC_CLONE
> +#define BTRFS_IOC_CLONE _IOW (BTRFS_IOCTL_MAGIC, 9, int)
> +	return ioctl(dest_fd, BTRFS_IOC_CLONE, src_fd);
> +#else	/* __linux__ */
> +	(void) src_fd;
> +	(void) dest_fd;
> +	return -1;
> +#endif	/* __linux__ */
> +}
> +
> +/*
>   * Copy src to dest, decompressing src if needed. compress_level > 0 decides
>   * whether dest will be compressed, and with which compression level.
>   */
> @@ -260,13 +281,6 @@ copy_file(const char *src, const char *dest, int compress_level)
>  		goto error;
>  	}
> 
> -	gz_in = gzdopen(fd_in, "rb");
> -	if (!gz_in) {
> -		cc_log("gzdopen(src) error: %s", strerror(errno));
> -		close(fd_in);
> -		goto error;
> -	}
> -
>  	if (compress_level > 0) {
>  		/*
>  		 * A gzip file occupies at least 20 bytes, so it will always
> @@ -289,6 +303,21 @@ copy_file(const char *src, const char *dest, int compress_level)
>  			goto error;
>  		}
>  		gzsetparams(gz_out, compress_level, Z_DEFAULT_STRATEGY);
> +	} else {
> +		/* FIXME: this introduces unnecessary open/close overhead */
> +		if (!file_is_compressed(src)) {
> +			/* try to create a fast CoW copy first */
> +			if (clone_file(fd_in, fd_out) == 0) {
> +				goto out;
> +			}
> +		}
> +	}
> +
> +	gz_in = gzdopen(fd_in, "rb");
> +	if (!gz_in) {
> +		cc_log("gzdopen(src) error: %s", strerror(errno));
> +		close(fd_in);
> +		goto error;
>  	}
> 
>  	while ((n = gzread(gz_in, buf, sizeof(buf))) > 0) {
> @@ -335,6 +364,7 @@ copy_file(const char *src, const char *dest, int compress_level)
>  		return -1;
>  	}
> 
> +out:
>  	gzclose(gz_in);
>  	gz_in = NULL;
>  	if (gz_out) {


