[PATCH] shadow_copy2: Allow configurable prefix for snapshots

Bryan Burgin bburgin at microsoft.com
Wed Sep 7 16:39:52 UTC 2016


Rajesh or Uri,

At Jeremy's suggestion, I'm looking into this from a Microsoft angle to see if we should make a comment of this in [MS-SMB] and or take other action.  Do either of you have any network traces that show this behavior?

TIA.

BTW: my colleague Obaid and I will be at SNIA/SDC in a few weeks.  We look forward to seeing some of you there!

Bryan

-----Original Message-----
From: samba-technical [mailto:samba-technical-bounces at lists.samba.org] On Behalf Of Rajesh Joseph
Sent: Thursday, July 21, 2016 3:54 AM
To: Uri Simchoni <uri at samba.org>
Cc: Ira Cooper <ira at wakeful.net>; samba-technical at lists.samba.org; Jeremy Allison <jra at samba.org>
Subject: Re: [PATCH] shadow_copy2: Allow configurable prefix for snapshots

On Wed, Jul 20, 2016 at 4:01 AM, Uri Simchoni <uri at samba.org> wrote:

> On 07/14/2016 10:12 AM, Rajesh Joseph wrote:
> > On Tue, Jul 12, 2016 at 7:56 PM, Michael Adam <obnox at samba.org> wrote:
> >
> >> Update:
> >>
> >> Had a longer discussion with Rajesh.
> >> The overall logic looks good to me.
> >> Rajesh will follow up with an updated patchset for (final?) review.
> >>
> >> Cheers- Michael
> >>
> >
> > I have split the patch into smaller ones and addressed Michael's
> comments.
> >
> > Thanks & Regards,
> > Rajesh
> >
>
> Sorry for taking so long to respond, been busy with stuff but that's 
> no excuse.
>
> I've gone through the first 4 patches (still need to review the tests 
> and the manpage). The splitting of the patch set into refactoring and 
> truly new code made reviewing much simpler. Good work! Some comments 
> follow.
>
> General comment 1:
> It would help if the commit message (and possibly the man page) would 
> explain why this is useful - why are we adding non-trivial 
> functionality that we have to maintain later. What snapshot system 
> would benefit from that.
>

> General comment 2:
> This module already has a mix of old-style and new-style debug 
> statements (DBG_XXX vs DEBUG(xx, ()) - My fault I guess... But since 
> there are already new-style statements, let's use them in new code, esp.
> in new functions.
>
> In [4/6]:
>
> Indentation of struct shadow_copy2_snapentry and struct 
> shadow_copy2_snaplist_info, struct shadow_copy2_private - haven't seen 
> any reference to it in a style guide so I'm going to be careful, but I 
> think we use a single space between the type and the variable.
>
>
Since there was no clear guide I chose a style which looks more readable to me.
Anyway changed it single space.


>
> > +static ssize_t shadow_copy2_saved_snapname(struct 
> > +shadow_copy2_private
> *priv,
> > +                                         struct tm *timestamp,
> > +                                         char *snap_str, size_t 
> > +len) {
> > +       ssize_t snaptime_len = -1;
> > +       struct shadow_copy2_snapentry *entry = NULL;
> > +
> > +       snaptime_len = strftime(snap_str, len, GMT_FORMAT, timestamp);
> > +       if (snaptime_len == 0) {
> > +               DEBUG(10, ("strftime failed\n"));
> > +               return -1;
> > +       }
> Should be DBG_ERR, we don't expect this to happen.
>
> > +static bool shadow_copy2_update_snaplist(struct vfs_handle_struct
> *handle,
> > +               time_t snap_time)
> > +{
> > +       int ret = -1;
> > +       bool snaplist_updated = false;
> > +       struct files_struct fsp = {0};
> > +       struct smb_filename smb_fname = {0};
> > +       double seconds = 0.0;
> > +       struct shadow_copy2_private *priv = NULL;
> > +
> > +       SMB_VFS_HANDLE_GET_DATA(handle, priv, struct
> shadow_copy2_private,
> > +                               return false);
> > +
> > +       seconds = difftime(snap_time, priv->snaps->fetch_time);
> I'm suspicious towards use of double in networking/storage-related code.
> In essence time_t can be trivially compared. We have some more opaque 
> stuff in lib/util/time.h, but that would require changing to timeval 
> or timespec, or converting on-the fly. difftime() is hardly used in 
> the project.
>
>
Unless we are storing it or putting it on wire do you see any issues in using double, especially here? I don't think it would be right to compare two time_t variables directly.
Just for my understanding what problems do you foresee in using difftime?



>
> > +       if (seconds > 0 || (priv->snaps->snaplist == NULL)) {
> > +               smb_fname.base_name = ".";
> > +               fsp.fsp_name = &smb_fname;
> > +
> > +               ret = shadow_copy2_get_shadow_copy_data(handle, &fsp,
> > +                                                       NULL, false);
> > +               if (ret == 0) {
> > +                       snaplist_updated = true;
> > +               } else {
> > +                       DEBUG(0,("Failed to get shadow copy data\n"));
> > +               }
> > +
> > +       }
> The assumption that "." is the share's root made me a little edgy.
> That's not always the case. After thinking long and hard, at least in 
> current code base, for this code path, I think it's safe to assume so 
> (because the pattern where we're not in the share's root is where we 
> CD into a directory and do some operation on "." or on a file within 
> the dir. In that case the pathname does not include the @GMT marker 
> and passes through this module, even if we're now inside a snapshot dir).
>
> We really need a more generic reliable way to retrieve the original 
> connect path, but until we do, can you at least put in a comment of 
> why it's safe, so that other module writers don't willy-nilly use this 
> as a reference to make that assumption?
>

in shadow_copy2_get_shadow_copy_data function smb_fname is only used to get the snapshot directory. If snapdirseverywhere is false then we directly use snapshot_basepath else we use the connectpath + base_path. Since here we only want the snapshot list I assumed the connectpath will have the snapshot dir.



>
> > @@ -1360,12 +1528,38 @@ static bool
> shadow_copy2_snapshot_to_gmt(vfs_handle_struct *handle,
> >
> >         fmt = config->gmt_format;
> >
> > +       /* If regex is provided then we will have to parse the
> > +        * filename which will contain both the prefix and the time
> format.
> > +        * e.g. <prefix><delimiter><time_format>*/
> > +       if (priv->snaps->regex != NULL) {
> > +               tmpstr = talloc_strdup(NULL, name);
> We prefer talloc_tos() as talloc context, even if we take care to free 
> the memory.
>
> >         if (config->use_sscanf) {
> >                 if (sscanf(name, fmt, &timestamp_long) != 1) {
> >                         DEBUG(10, ("shadow_copy2_snapshot_to_gmt: "
> >                                    "no sscanf match %s: %s\n",
> >                                    fmt, name));
> > +                       talloc_free(tmpstr);
> >                         return false;
> >                 }
> >                 timestamp_t = timestamp_long; @@ -1375,6 +1569,7 @@ 
> > static bool
> shadow_copy2_snapshot_to_gmt(vfs_handle_struct *handle,
> >                         DEBUG(10, ("shadow_copy2_snapshot_to_gmt: "
> >                                    "no match %s: %s\n",
> >                                    fmt, name));
> > +                       talloc_free(tmpstr);
> >                         return false;
> >                 }
> >                 DEBUG(10, ("shadow_copy2_snapshot_to_gmt: match %s:
> %s\n",
> > @@ -1388,6 +1583,7 @@ static bool
> shadow_copy2_snapshot_to_gmt(vfs_handle_struct *handle,
> >         }
> >
> >         strftime(gmt, gmt_len, GMT_FORMAT, &timestamp);
> > +       talloc_free(tmpstr);
> Instead of all those talloc_free(), goto the end and do one
> TALLOC_FREE() - that's the common style. There's much more 
> TALLOC_FREE() than talloc_free().
>
> > @@ -1485,8 +1684,22 @@ static int shadow_copy2_get_shadow_copy_data(
> >                 return -1;
> >         }
> >
> > -       shadow_copy2_data->num_volumes = 0;
> > -       shadow_copy2_data->labels      = NULL;
> > +       if (shadow_copy2_data != NULL) {
> > +               shadow_copy2_data->num_volumes = 0;
> > +               shadow_copy2_data->labels      = NULL;
> > +       }
> > +
> > +       SMB_VFS_HANDLE_GET_DATA(handle, priv, struct
> shadow_copy2_private,
> > +                               return -1);
> > +
> > +       if ((priv->snaps->regex != NULL) && (labels || 
> > + shadow_copy2_data
> == NULL)) {
> Why does "labels" affect the decision to refresh the snapshot list?
>

Normally shadow_copy2_get_shadow_copy_data function get called twice. First with label  = false and then label = true. In first case we just want to get the number of snapshots (or volume). In the second time we allocate space for that many labels and then call this function.
Therefore to avoid creation of the list multiple times I added this check.


> Why isn't it sufficient to look at regex as a condition for refreshing 
> the snapshot list? The shadow_copy2_data == NULL condition is 
> confusing
> - the meaning of this parameter is "this is a buffer for shadow copy 
> data". Adding meaning of "if it's NULL then refresh snapshot list 
> regardless of labels" is confusing.
>

With this change shadow_copy2_get_shadow_copy_data function is overloaded to perform two operations - first get the list of labels, and second update the snapshot list. If we just want to update the snapshot list then we do not want shadow_copy2_data and hence the check.


>
> > @@ -1507,6 +1720,21 @@ static int shadow_copy2_get_shadow_copy_data(
> >                 DEBUG(6,("shadow_copy2_get_shadow_copy_data: %s -> %s\n",
> >                          d->d_name, snapshot));
> >
> > +               if (get_snaplist) {
> > +                       /* Create a snap entry for each successful
> > +                        * pattern match */
> > +                       tmpentry = shadow_copy2_create_snapentry(priv);
> > +                       if (tmpentry == NULL) {
> > +                               DEBUG(0, ("talloc_zero() failed\n"));
> > +                               return -1;
> > +                       }
> tmp_ctx needs to be freed here (refactor with goto or add it).
>

thanks for catching this. Added the talloc_free statement and sent another patch to refactor it to use goto.


>
> > @@ -2117,6 +2354,37 @@ static int shadow_copy2_connect(struct
> vfs_handle_struct *handle,
> >                 return -1;
> >         }
> >
> > +       snapprefix = lp_parm_const_string(SNUM(handle->conn),
> > +                                      "shadow", "snapprefix",
> > +                                      NULL);
> > +       if (snapprefix != NULL) {
> > +               priv->snaps->regex = talloc_zero(priv->snaps, regex_t);
> > +               if (priv->snaps->regex == NULL) {
> > +                       DEBUG(0, ("talloc_zero() failed\n"));
> > +                       errno = ENOMEM;
> > +                       return -1;
> > +               }
> > +
> > +               /* pre-compute regex rule for matching pattern later */
> > +               ret = regcomp(priv->snaps->regex, snapprefix, 0);
> > +               if (ret) {
> > +                       DEBUG(0, ("Failed to create regex object\n"));
> > +                       return -1;
> > +               }
> > +       }
> > +
> > +       delimiter = lp_parm_const_string(SNUM(handle->conn),
> > +                                      "shadow", "delimiter",
> > +                                      "_GMT");
> > +       if (delimiter != NULL) {
> What if delimiter *is* NULL? this is an error condition.
>

We are passing default value ("_GMT") to this function therefore it will not return NULL.

Thanks & Regards,
Rajesh


More information about the samba-technical mailing list