[PATCH] vfs_recycle: printf-style templates + more

Peter Stamfest peter at stamfest.at
Sat Jan 28 18:41:48 GMT 2006


Hi List,


Here is a patch that adds some features to the vfs_recycle module that I
have wanted for a long time (I have been using a similar patch since 3.0.10):

  - Flexible template for the naming of files moved into the recycle
    bin if versioning is turned on. This is intended to be used to
    allow for simpler human parsing of directory contents than the
    default file name for files moved into the recycle directory.

  - In-tree recycle directories (a .recycle directory inside the
    directory a file gets deleted from)

  - Maximum number of tries to find a proper new name. This helps to
    avoid problems with improper templates and, more importantly,
    excessive tries to find a name and excessive numbers of versions in
    the recycle bin.

For this, the patch introduces some new configuration settings for
vfs_recycle:

  - "template": string. Takes a printf-style format string with up to
    four conversion specifications, one per argument. If the underlying
    libc allows it (glibc does - a configure check for this is included
    in the patch), the template is checked for the correct number and
    types of arguments.

    The following four arguments are available to the template on the
    indicated positions:

        1 integer - the "version" of the recycle copy
        2 string  - the file name up to, but excluding, the first dot
        3 string  - the file name from the first dot (including) up to
                    the last dot (excluding) - dubbed the "middle
                    extension(s)"
        4 string  - the file name from the last dot (including) up to
                    the end

    To make full use of this you need to know about reorderable
    arguments in printf-style format strings: every conversion
    specification can name the argument to the *printf call it should
    output by putting <1-based-index>$ directly after the leading
    percent sign, as in %2$s.

    The four arguments thus allow many naming schemes. The following
    examples use the file name "abc.tar.gz" and assume the first
    versioned copy (that is, a file with the same name gets deleted
    again from the same directory):

    smb.conf:
         recycle:versions = Yes
         recycle:template = <template>

    <template>			result
    --------------------------------------------------------------
    Copy #%d of %s%s%s		Copy #1 of abc.tar.gz
    Copy #%1$d of %2$s%3$s%4$s	Copy #1 of abc.tar.gz
    %2$s%3$s%4$s;%1$d		abc.tar.gz;1
    %2$s%3$s%4$s;%1$03d		abc.tar.gz;001
    %2$s%3$s-%1$03d%4$s		abc.tar-001.gz
    %2$s-%1$03d%3$s%4$s		abc-001.tar.gz

    The first two examples show the current default.

    NOTE: On architectures _without a parse_printf_format function_, an
    improperly set template string will most likely lead to crashes and in
    the worst case may result in a buffer-overflow leading to a remote root
    compromise. So great care must be taken to get the template string
    right. A correctly set template string _will not_ lead to such
    problems.
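
To make the four arguments more concrete, here is a minimal standalone
sketch of the splitting and formatting step. It is not the code from the
patch: format_recycle_name is a hypothetical helper, the fixed 256-byte
buffers are an arbitrary choice, and the numbered %n$ form assumes a libc
with POSIX-style positional arguments (glibc has them).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: split `name` into the
 * three string arguments described above and format them, together
 * with `version`, using the printf-style template `tmpl`.
 * Returns the snprintf() result, or -1 if `name` is too long. */
static int format_recycle_name(char *out, size_t outlen, const char *tmpl,
			       int version, const char *name)
{
	char base[256];		/* argument 2: up to the first dot */
	char mid[256] = "";	/* argument 3: middle extension(s) */
	const char *fin = "";	/* argument 4: last extension */
	char *dot;

	if (strlen(name) >= sizeof(base)) {
		return -1;
	}
	strcpy(base, name);

	/* The last dot starts the final extension. */
	dot = strrchr(base, '.');
	if (dot != NULL) {
		fin = name + (dot - base);
		*dot = '\0';
	}

	/* The first remaining dot starts the middle extension(s). */
	dot = strchr(base, '.');
	if (dot != NULL) {
		strcpy(mid, dot);
		*dot = '\0';
	}

	/* Only pass templates you trust: a wrong conversion type here is
	 * undefined behaviour, which is what the NOTE above warns about. */
	return snprintf(out, outlen, tmpl, version, base, mid, fin);
}
```

Called with the template "%2$s-%1$03d%3$s%4$s", version 1 and the name
"abc.tar.gz", this fills the buffer with "abc-001.tar.gz", matching the
last row of the table above.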

  - "intree": boolean. If set to a true value, the directory named by
    the repository parameter is placed inside the directory the deleted
    file resides in. (Note: for this to work, the keeptree flag must be
    false)

    Example:

       With these settings:
         recycle:intree = Yes
         recycle:repository = .recycle

       The file /share-root/path/to/file-to-be-deleted.ext will be
       moved into the directory /share-root/path/to/.recycle/ instead
       of /share-root/.recycle
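
As a rough illustration of the intree behaviour, the following sketch
computes the recycle directory for a path relative to the share root and
also shows the per-component test needed to recognise files that already
live in some in-tree bin. Both function names are hypothetical, and
returning the plain repository name for files at the share root is a
choice made for this sketch, not something taken from the patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the recycle directory for `file_name`
 * (relative to the share root) into `out`. */
static int recycle_target_dir(char *out, size_t outlen, const char *file_name,
			      const char *repository, int intree)
{
	const char *slash = strrchr(file_name, '/');

	if (!intree || slash == NULL) {
		/* One bin at the share root, or the file lives there. */
		return snprintf(out, outlen, "%s", repository);
	}
	/* Directory part of file_name followed by the repository name. */
	return snprintf(out, outlen, "%.*s/%s",
			(int)(slash - file_name), file_name, repository);
}

/* Hypothetical helper: return 1 if some complete component of `path`
 * equals `name`, 0 otherwise. A match only counts when it is bounded
 * by '/' or by the start or end of the string. */
static int path_has_component(const char *path, const char *name)
{
	size_t len = strlen(name);
	const char *p = path;

	if (len == 0) {
		return 0;
	}
	while ((p = strstr(p, name)) != NULL) {
		int starts = (p == path) || (p[-1] == '/');
		int ends = (p[len] == '/') || (p[len] == '\0');
		if (starts && ends) {
			return 1;
		}
		p++;	/* partial match such as "a.recycler", keep looking */
	}
	return 0;
}
```

With repository ".recycle", the path "path/to/file-to-be-deleted.ext"
maps to "path/to/.recycle" when intree is set and to ".recycle"
otherwise. The component test accepts "bla/a.recycler/blub/.recycle" but
rejects "bla/a.recycler/blub".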

  - "maxiter": integer - Indicates the number of iterations to try when
    searching for a new name for a file to be moved into the recycle
    bin. The iteration counter (which corresponds to the first argument
    for the template) runs from 1 to the maxiter value. A value of 0
    means no limit - this is the default for backward compatibility.

    The implementation of this feature is not overly clever - once the
    maximum number has been reached, the newly deleted file overwrites
    the version deleted just before it. This is actually a Good
    Thing(r): we should not just delete the file when the user expects
    the recycle bin to work...
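
The effect of maxiter on the name search can be sketched outside of
Samba as follows. This is a simplified model, not the patch itself: the
exists_fn callback and the NULL-terminated name list stand in for the
real file-existence check, and the template is fixed to the default
"Copy #%d of %s" to keep the example short.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the module's file-existence check. */
typedef int (*exists_fn)(const char *name, void *ctx);

/* Example callback: look `name` up in a NULL-terminated string array. */
static int in_list(const char *name, void *ctx)
{
	const char *const *list = ctx;

	for (; *list != NULL; list++) {
		if (strcmp(*list, name) == 0) {
			return 1;
		}
	}
	return 0;
}

/* Try `name` first, then versioned names 1, 2, ... until a free name
 * is found or `maxiter` versions have been tried (0 means no limit).
 * Returns the highest version number tried; `out` holds the chosen
 * name, which may still exist if the limit was hit. */
static int pick_recycle_name(char *out, size_t outlen, const char *name,
			     int maxiter, exists_fn exists, void *ctx)
{
	int i = 1;

	snprintf(out, outlen, "%s", name);
	while (exists(out, ctx) && (maxiter == 0 || i <= maxiter)) {
		snprintf(out, outlen, "Copy #%d of %s", i++, name);
	}
	return i - 1;
}
```

If the bin already holds "a.txt", "Copy #1 of a.txt" and "Copy #2 of
a.txt", a maxiter of 2 stops at "Copy #2 of a.txt", so the subsequent
move replaces that version, while a maxiter of 0 carries on to the free
name "Copy #3 of a.txt".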


Here is the patch against 3.0.21a. It would be great to get this included.


diff -ur org/samba-3.0.21a/source/configure.in samba-3.0.21a/source/configure.in
--- org/samba-3.0.21a/source/configure.in	Wed Dec 14 13:45:51 2005
+++ samba-3.0.21a/source/configure.in	Sat Jan 28 08:12:25 2006
@@ -746,6 +746,7 @@
  AC_CHECK_HEADERS(sys/sysmacros.h security/_pam_macros.h dlfcn.h)
  AC_CHECK_HEADERS(sys/syslog.h syslog.h execinfo.h)
  AC_CHECK_HEADERS(langinfo.h locale.h)
+AC_CHECK_HEADERS(printf.h)

  AC_CHECK_HEADERS(rpcsvc/yp_prot.h,,,[[
  #if HAVE_RPC_RPC_H
@@ -1959,6 +1960,29 @@
  fi
  # end utmp details

+AC_CACHE_CHECK([for reorderable format specifiers],samba_cv_have_reorderable_format,[
+AC_TRY_RUN([
+#include <stdio.h>
+#ifdef HAVE_STRING_H
+#include <string.h>
+#endif
+
+int
+main ()
+{
+    char c[80] = "";
+    sprintf(c, "%2\$d %1\$s", "1", 2);
+    return strcmp(c, "2 1") == 0 ? 0 : 1;
+}
+],
+samba_cv_have_reorderable_format=yes,samba_cv_have_reorderable_format=no,samba_cv_have_reorderable_format=cross)
+])
+
+if test x"$samba_cv_have_reorderable_format" = x"yes"; then
+    AC_DEFINE(HAVE_REORDERABLE_FORMAT,1,[Whether the host libc supports reorderable format specifiers])
+fi
+
+AC_CHECK_FUNCS(parse_printf_format)

  ICONV_LOCATION=standard
  LOOK_DIRS="/usr /usr/local /sw /opt"
diff -ur org/samba-3.0.21a/source/modules/vfs_recycle.c samba-3.0.21a/source/modules/vfs_recycle.c
--- org/samba-3.0.21a/source/modules/vfs_recycle.c	Tue Dec 20 16:28:38 2005
+++ samba-3.0.21a/source/modules/vfs_recycle.c	Sat Jan 28 16:23:41 2006
@@ -7,6 +7,8 @@
   * Copyright (C) 2002, Juergen Hasch - added some options.
   * Copyright (C) 2002, Simo Sorce
   * Copyright (C) 2002, Stefan (metze) Metzmacher
+ * Copyright (C) 2005-2006, Peter Stamfest - added template support and 
+ *                     per-directory recycle bins
   *
   * This program is free software; you can redistribute it and/or modify
   * it under the terms of the GNU General Public License as published by
@@ -24,6 +26,11 @@
   */

  #include "includes.h"
+#ifdef HAVE_PARSE_PRINTF_FORMAT
+#ifdef HAVE_PRINTF_H
+#include "printf.h"
+#endif
+#endif

  #define ALLOC_CHECK(ptr, label) do { if ((ptr) == NULL) { DEBUG(0, ("recycle.bin: out of memory!\n")); errno = ENOMEM; goto label; } } while(0)

@@ -32,6 +39,13 @@
  #undef DBGC_CLASS
  #define DBGC_CLASS vfs_recycle_debug_level

+/* PSt: This is actually redundant - keep it as an example */
+#ifdef HAVE_REORDERABLE_FORMAT
+#define DEFAULT_TEMPLATE "Copy #%1$d of %2$s%3$s%4$s"
+#else
+#define DEFAULT_TEMPLATE "Copy #%d of %s%s%s"
+#endif
+
  static int recycle_connect(vfs_handle_struct *handle, connection_struct *conn, const char *service, const char *user);
  static void recycle_disconnect(vfs_handle_struct *handle, connection_struct *conn);
  static int recycle_unlink(vfs_handle_struct *handle, connection_struct *conn, const char *name);
@@ -76,6 +90,57 @@
  	return tmp_str;
  }

+
+static const char *recycle_template(vfs_handle_struct *handle)
+{
+	const char *template_str = NULL;
+	size_t r;
+	int argtypes[6]; /* at least one more than the number of
+			    format specifiers we support */
+
+	template_str = lp_parm_const_string(SNUM(handle->conn), "recycle", "template", DEFAULT_TEMPLATE);
+
+	DEBUG(10, ("recycle: template = %s\n", template_str));
+
+#ifdef HAVE_PARSE_PRINTF_FORMAT
+	r = parse_printf_format(template_str, sizeof(argtypes) / sizeof(argtypes[0]),
+			        argtypes);
+	if ((r > 0 && argtypes[0] != PA_INT) ||
+	    (r > 1 && argtypes[1] != PA_STRING) ||
+	    (r > 2 && argtypes[2] != PA_STRING) ||
+	    (r > 3 && argtypes[3] != PA_STRING) ||
+	    (r > 4)) {
+		DEBUG(0, ("recycle: invalid template (more than 4 arguments or wrong argument types in printf-style format string '%s' - using default '%s')\n",
+			   template_str, DEFAULT_TEMPLATE));
+		return DEFAULT_TEMPLATE;
+	}
+#endif
+ 
+	return template_str;
+}
+
+static BOOL recycle_in_tree(vfs_handle_struct *handle)
+{
+	BOOL ret;
+ 
+	ret = lp_parm_bool(SNUM(handle->conn), "recycle", "intree", False);
+
+	DEBUG(10, ("recycle_bin: intree = %s\n", ret?"True":"False"));
+ 
+	return ret;
+}
+
+static int recycle_maxiter(vfs_handle_struct *handle)
+{
+	int maxiter;
+ 
+	maxiter = lp_parm_int(SNUM(handle->conn), "recycle", "maxiter", 0);
+
+	DEBUG(10, ("recycle: maxiter = %d\n", maxiter));
+ 
+	return maxiter;
+}
+
  static BOOL recycle_keep_dir_tree(vfs_handle_struct *handle)
  {
  	BOOL ret;
@@ -181,6 +246,7 @@
  	return dirmode;
  }

+
  static BOOL recycle_directory_exist(vfs_handle_struct *handle, const char *dname)
  {
  	SMB_STRUCT_STAT st;
@@ -357,12 +423,18 @@
  	char *path_name = NULL;
         	char *temp_name = NULL;
  	char *final_name = NULL;
-	const char *base;
+	const char *base = NULL;
+	char *basebase = NULL;
+	char *midext = NULL;
+	char *finext = NULL;
+	char *c;
  	char *repository = NULL;
+	const char *template_str = NULL;
  	int i = 1;
  	int maxsize;
+	int maxiter;
  	SMB_OFF_T file_size; /* space_avail;	*/
-	BOOL exist;
+	BOOL exist, flag;
  	int rc = -1;

  	repository = alloc_sub_conn(conn, recycle_repository(handle));
@@ -377,13 +449,6 @@
  		goto done;
  	}

-	/* we don't recycle the recycle bin... */
-	if (strncmp(file_name, repository, strlen(repository)) == 0) {
-		DEBUG(3, ("recycle: File is within recycling bin, unlinking ...\n"));
-		rc = SMB_VFS_NEXT_UNLINK(handle, conn, file_name);
-		goto done;
-	}
-
  	file_size = recycle_get_file_size(handle, file_name);
  	/* it is wrong to purge filenames only because they are empty imho
  	 *   --- simo
@@ -418,10 +483,11 @@
  	}
  	 */

-	/* extract filename and path */
+	/* extract filename, path and other parts */
  	base = strrchr(file_name, '/');
  	if (base == NULL) {
-		base = file_name;
+		basebase = SMB_STRDUP(file_name);
+		ALLOC_CHECK(basebase, done);
  		path_name = SMB_STRDUP("/");
  		ALLOC_CHECK(path_name, done);
  	}
@@ -430,6 +496,8 @@
  		ALLOC_CHECK(path_name, done);
  		path_name[base - file_name] = '\0';
  		base++;
+		basebase = SMB_STRDUP(base);
+		ALLOC_CHECK(basebase, done);
  	}

  	DEBUG(10, ("recycle: fname = %s\n", file_name));	/* original filename with path */
@@ -452,13 +520,92 @@
  		goto done;
  	}

+	/* we don't recycle the recycle bin... */
+
+	flag = False;
+	if (recycle_in_tree(handle) == True) {
+		/* With in-tree recycling, the repository path would
+		   be a sub-path of the entire path_name. This can
+		   mean several things:
+
+		   - repository is equal to the path_name
+		   - repository matches at the beginning of path_name
+		   - repository matches at the end of path_name
+		   - repository matches somewhere in the middle of path_name
+
+		   For the last three cases we must also check that a
+		   directory separator comes immediately before and/or after
+		   the matched portion of path_name.
+
+		   NOTE: We have to do this in a loop, as it could be
+		   the case that we have several matches for the same
+		   repository, but for the first match the additional
+		   checks would not be true.
+
+		   Example: repository = ".recycle"
+ 
+			bla/a.recycler/blub/xy -> no recycle bin
+			bla/a.recycler/blub/.recycle/xy -> recycle bin
+ 
+		   FIXME: I doubt that this works for Multi-Byte characters
+		*/
+
+		c = path_name;
+		while (c != NULL) {
+			char *d;
+
+			c = strstr(c, repository);
+			if (c == NULL) {
+				break;
+			}
+			d = c + strlen(repository);
+			if ( ((c == path_name) || ((c != path_name) && *(c - 1) == '/')) &&
+			     (*d == '/' || *d == 0)) {
+				flag = True;
+				break;
+			}
+			c++;
+		}
+	} else {
+		if (strncmp(path_name, repository, strlen(repository)) == 0) {
+			c = path_name + strlen(repository);
+			flag = (*c == 0 || *c == '/');
+		}
+	}
+
+	if (flag) {
+		DEBUG(3, ("recycle: File is within recycling bin, unlinking ...\n"));
+		rc = SMB_VFS_NEXT_UNLINK(handle, conn, file_name);
+		goto done;
+	}
+
  	if (recycle_keep_dir_tree(handle) == True) {
  		asprintf(&temp_name, "%s/%s", repository, path_name);
+	} else if (recycle_in_tree(handle) == True) {
+		asprintf(&temp_name, "%s/%s", strcmp(path_name, "/") == 0 ? "." : path_name, repository); /* avoid "//repository" for files in the share root */
  	} else {
  		temp_name = SMB_STRDUP(repository);
  	}
  	ALLOC_CHECK(temp_name, done);

+	c = strrchr(basebase, '.');
+	finext = SMB_STRDUP(c == NULL ? "" : c);
+	ALLOC_CHECK(finext, done);
+	if (c) {
+		*c = 0;	/* this actually sets the end of midext */
+	}
+
+	c = strchr(basebase, '.');
+	midext = SMB_STRDUP(c == NULL ? "" : c);
+	ALLOC_CHECK(midext, done);
+	if (c) {
+		*c = 0; /* this finally sets basebase */
+	}
+
+	DEBUG(10, ("recycle: basebase = %s\n", basebase));	/* base without extensions */
+	DEBUG(10, ("recycle: midext = %s\n", midext));		/* middle extension(s) */
+	DEBUG(10, ("recycle: finext = %s\n", finext));		/* last extension */
+
  	exist = recycle_directory_exist(handle, temp_name);
  	if (exist) {
  		DEBUG(10, ("recycle: Directory already exists\n"));
@@ -485,11 +632,32 @@
  		}
  	}

+	template_str = recycle_template(handle);
+
  	/* rename file we move to recycle bin */
  	i = 1;
-	while (recycle_file_exist(handle, final_name)) {
+	maxiter = recycle_maxiter(handle);
+	while (recycle_file_exist(handle, final_name) && 
+	       (maxiter == 0 || i <= maxiter)) {
+		char *t = NULL;
+
+		/* variable  position	format/type	meaning */
+		/* i		1	integer		running number */
+		/* basebase	2	string		The file name up to, but excluding, the first dot */
+		/* midext	3	string		The part from the first dot (including) up to the last dot (excluding) */
+		/* finext	4	string		The part from the last dot (including) up to the end */
+
+		/* NOTE: If more or other argument-types get
+		   introduced, then recycle_template must also be
+		   changed to check for proper types and number of
+		   arguments */
+		asprintf(&t, template_str, i++, basebase, midext, finext);
+		ALLOC_CHECK(t, done);
+
  		SAFE_FREE(final_name);
-		asprintf(&final_name, "%s/Copy #%d of %s", temp_name, i++, base);
+		asprintf(&final_name, "%s/%s", temp_name, t);
+		SAFE_FREE(t);
+		ALLOC_CHECK(final_name, done);
  	}

  	DEBUG(10, ("recycle: Moving %s to %s\n", file_name, final_name));
@@ -505,6 +673,9 @@
  		recycle_do_touch(handle, final_name, recycle_touch_mtime(handle));

  done:
+	SAFE_FREE(finext);
+	SAFE_FREE(midext);
+	SAFE_FREE(basebase);
  	SAFE_FREE(path_name);
  	SAFE_FREE(temp_name);
  	SAFE_FREE(final_name);



peter

