[SCM] Samba Shared Repository - branch master updated

Noel Power npower at samba.org
Thu Oct 14 10:21:01 UTC 2021


The branch, master has been updated
       via  71cef2fa1dd docs: document new Spotlight Elasticsearch options
       via  8e3372eceab mdssvc: add options to allow ignoring attribute and type mapping errors
       via  c6743237891 mdssvc: prepare for ignore attribute and type mapping errors
       via  232146775bb selftest: add a test ignored spotlight/elasticsearch mapping failures
      from  8ab0238abd1 .gitlab-ci: Avoid duplicate CI on all merge requests

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 71cef2fa1ddbe05d29e7ce571a35888ef4663b22
Author: Ralph Boehme <slow at samba.org>
Date:   Wed Oct 13 19:16:10 2021 +0200

    docs: document new Spotlight Elasticsearch options
    
      elasticsearch:ignore unknown attribute = yes | no (default: no)
      elasticsearch:ignore unknown type = yes | no (default: no)
    
    Signed-off-by: Ralph Boehme <slow at samba.org>
    Reviewed-by: Noel Power <npower at samba.org>
    
    Autobuild-User(master): Noel Power <npower at samba.org>
    Autobuild-Date(master): Thu Oct 14 10:20:27 UTC 2021 on sn-devel-184

commit 8e3372eceab1bc7ad8ac813b12d654c979e20769
Author: Ralph Boehme <slow at samba.org>
Date:   Sat Oct 9 18:51:14 2021 +0200

    mdssvc: add options to allow ignoring attribute and type mapping errors
    
    This adds two options that are used by the Spotlight query parser to optionally
    ignore unknown attributes or types in a query.
    
        elasticsearch:ignore unknown attribute = yes | no (default: no)
        elasticsearch:ignore unknown type = yes | no (default: no)
    
    Example Spotlight query with unknown attributes and type:
    
        kMDItemContentType=="public.calendar-event"||kMDItemSubject=="Kalender*"cdw||
        kMDItemTitle=="Kalender*"cdw||kMDItemTopic=="Kalender*"cdw||
        kMDItemTextContent=="Kalender*"cd||*=="Kalender*"cdw||
        kMDItemTextContent=="Kalender*"cdw
    
    The unknown attributes are "kMDItemTopic" and "kMDItemSubject". The unkown type
    is "public.calendar-event".
    
    Currently the parser will outright fail to parse the query and the search will
    enter an error state.
    
    To give users some control over the mapping the above options can be used to
    tell the parser to simply ignore such unknown attributes and types.
    
      (meta.title:Kalender* OR content:Kalender* OR Kalender* OR content:Kalender*)
    
    Signed-off-by: Ralph Boehme <slow at samba.org>
    Reviewed-by: Noel Power <npower at samba.org>

commit c67432378910691456f1deec3d5a8a73a6080887
Author: Ralph Boehme <slow at samba.org>
Date:   Sat Oct 9 18:50:02 2021 +0200

    mdssvc: prepare for ignore attribute and type mapping errors
    
    Lower the debug levels to debug from error. No change in behaviour.
    
    Signed-off-by: Ralph Boehme <slow at samba.org>
    Reviewed-by: Noel Power <npower at samba.org>

commit 232146775bb00769a3c208441ad0fa28bfe7f42f
Author: Ralph Boehme <slow at samba.org>
Date:   Sat Oct 9 16:44:25 2021 +0200

    selftest: add a test ignored spotlight/elasticsearch mapping failures
    
    Signed-off-by: Ralph Boehme <slow at samba.org>
    Reviewed-by: Noel Power <npower at samba.org>

-----------------------------------------------------------------------

Summary of changes:
 .../misc/elasticsearchignoreunknownattribute.xml   |  19 ++++
 .../misc/elasticsearchignoreunknowntype.xml        |  19 ++++
 selftest/tests.py                                  |   6 ++
 source3/rpc_server/mdssvc/es_mapping.c             |   2 +-
 source3/rpc_server/mdssvc/es_parser.y              | 102 ++++++++++++++++-----
 source3/rpc_server/mdssvc/test_mdsparser_es.c      |  54 +++++++++++
 6 files changed, 180 insertions(+), 22 deletions(-)
 create mode 100644 docs-xml/smbdotconf/misc/elasticsearchignoreunknownattribute.xml
 create mode 100644 docs-xml/smbdotconf/misc/elasticsearchignoreunknowntype.xml


Changeset truncated at 500 lines:

diff --git a/docs-xml/smbdotconf/misc/elasticsearchignoreunknownattribute.xml b/docs-xml/smbdotconf/misc/elasticsearchignoreunknownattribute.xml
new file mode 100644
index 00000000000..86368d30e58
--- /dev/null
+++ b/docs-xml/smbdotconf/misc/elasticsearchignoreunknownattribute.xml
@@ -0,0 +1,19 @@
+<samba:parameter name="elasticsearch:ignore unknown attribute"
+                 context="G"
+                 type="boolean"
+                 xmlns:samba="http://www.samba.org/samba/DTD/samba-doc">
+  <description>
+    <para>
+      Ignore unknown Spotlight attributes in search queries. An example query
+      using the unsupported attribute
+      <literal>"kMDItemTopic"</literal> would be
+      <literal>kMDItemTopic=="hotstuff"</literal>. By
+      default any query using such a type would completely fail. By enabling
+      this option, if the type match is a subexpression of a larger expression,
+      then this subexpression is just ignored.
+    </para>
+  </description>
+
+  <value type="default">no</value>
+  <value type="example">yes</value>
+</samba:parameter>
diff --git a/docs-xml/smbdotconf/misc/elasticsearchignoreunknowntype.xml b/docs-xml/smbdotconf/misc/elasticsearchignoreunknowntype.xml
new file mode 100644
index 00000000000..ca1f873adac
--- /dev/null
+++ b/docs-xml/smbdotconf/misc/elasticsearchignoreunknowntype.xml
@@ -0,0 +1,19 @@
+<samba:parameter name="elasticsearch:ignore unknown type"
+                 context="G"
+                 type="boolean"
+                 xmlns:samba="http://www.samba.org/samba/DTD/samba-doc">
+  <description>
+    <para>
+      Ignore unknown Spotlight types in search queries. An example query using
+      the unsupported type <literal>"public.calendar-event"</literal>
+      would be
+      <literal>kMDItemContentType=="public.calendar-event"</literal>. By
+      default any query using such a type would completely fail. By enabling
+      this option, if the type match is a subexpression of a larger expression,
+      then this subexpression is just ignored.
+    </para>
+  </description>
+
+  <value type="default">no</value>
+  <value type="example">yes</value>
+</samba:parameter>
diff --git a/selftest/tests.py b/selftest/tests.py
index 128374210c4..8ff635f2777 100644
--- a/selftest/tests.py
+++ b/selftest/tests.py
@@ -435,6 +435,12 @@ plantestsuite("samba.unittests.test_oLschema2ldif", "none",
 if with_elasticsearch_backend:
     plantestsuite("samba.unittests.mdsparser_es", "none",
                   [os.path.join(bindir(), "default/source3/test_mdsparser_es")] + [configuration])
+    plantestsuite("samba.unittests.mdsparser_es_failures", "none",
+                  [os.path.join(bindir(), "default/source3/test_mdsparser_es"),
+                  " --option=elasticsearch:testmappingfailures=yes",
+                  " --option=elasticsearch:ignoreunknownattribute=yes",
+                  " --option=elasticsearch:ignoreunknowntype=yes"] +
+                  [configuration])
 plantestsuite("samba.unittests.credentials", "none",
               [os.path.join(bindir(), "default/auth/credentials/test_creds")])
 plantestsuite("samba.unittests.tsocket_bsd_addr", "none",
diff --git a/source3/rpc_server/mdssvc/es_mapping.c b/source3/rpc_server/mdssvc/es_mapping.c
index 5c71e503bf5..577fc38e912 100644
--- a/source3/rpc_server/mdssvc/es_mapping.c
+++ b/source3/rpc_server/mdssvc/es_mapping.c
@@ -179,7 +179,7 @@ struct es_attr_map *es_map_sl_attr(TALLOC_CTX *mem_ctx,
 			  "type",
 			  &typestr);
 	if (ret != 0) {
-		DBG_ERR("No JSON type mapping for [%s]\n", sl_attr);
+		DBG_DEBUG("No JSON type mapping for [%s]\n", sl_attr);
 		return NULL;
 	}
 
diff --git a/source3/rpc_server/mdssvc/es_parser.y b/source3/rpc_server/mdssvc/es_parser.y
index 764f4bc5c20..c154dd660fc 100644
--- a/source3/rpc_server/mdssvc/es_parser.y
+++ b/source3/rpc_server/mdssvc/es_parser.y
@@ -65,6 +65,9 @@
 		TALLOC_CTX *frame;
 		json_t *kmd_map;
 		json_t *mime_map;
+		bool ignore_unknown_attribute;
+		bool ignore_unknown_type;
+		bool type_error;
 		YY_BUFFER_STATE s;
 		const char *result;
 	} *global_es_parser_state;
@@ -115,23 +118,48 @@ input:
 
 line:
 expr {
+	if ($1 == NULL) {
+		YYABORT;
+	}
+	if (global_es_parser_state->type_error) {
+		YYABORT;
+	}
 	global_es_parser_state->result = $1;
 }
 ;
 
 expr:
 OBRACE expr CBRACE {
-	if ($2 == NULL) YYABORT;
-	$$ = talloc_asprintf(talloc_tos(), "(%s)", $2);
-	if ($$ == NULL) YYABORT;
+	if ($2 == NULL) {
+		$$ = NULL;
+	} else {
+		$$ = talloc_asprintf(talloc_tos(), "(%s)", $2);
+		if ($$ == NULL) YYABORT;
+	}
 }
 | expr AND expr {
-	$$ = talloc_asprintf(talloc_tos(), "(%s) AND (%s)", $1, $3);
-	if ($$ == NULL) YYABORT;
+	if ($1 == NULL && $3 == NULL) {
+		$$ = NULL;
+	} else if ($1 == NULL) {
+		$$ = $3;
+	} else if ($3 == NULL) {
+		$$ = $1;
+	} else {
+		$$ = talloc_asprintf(talloc_tos(), "(%s) AND (%s)", $1, $3);
+		if ($$ == NULL) YYABORT;
+	}
 }
 | expr OR expr {
-	$$ = talloc_asprintf(talloc_tos(), "%s OR %s", $1, $3);
-	if ($$ == NULL) YYABORT;
+	if ($1 == NULL && $3 == NULL) {
+		$$ = NULL;
+	} else if ($1 == NULL) {
+		$$ = $3;
+	} else if ($3 == NULL) {
+		$$ = $1;
+	} else {
+		$$ = talloc_asprintf(talloc_tos(), "%s OR %s", $1, $3);
+		if ($$ == NULL) YYABORT;
+	}
 }
 | match {
 	$$ = $1;
@@ -150,20 +178,32 @@ OBRACE expr CBRACE {
 
 match:
 attribute EQUAL value {
-	$$ = map_expr($1, '=', $3, NULL);
-	if ($$ == NULL) YYABORT;
+	if ($1 == NULL) {
+		$$ = NULL;
+	} else {
+		$$ = map_expr($1, '=', $3, NULL);
+	}
 }
 | attribute UNEQUAL value {
-	$$ = map_expr($1, '!', $3, NULL);
-	if ($$ == NULL) YYABORT;
+	if ($1 == NULL) {
+		$$ = NULL;
+	} else {
+		$$ = map_expr($1, '!', $3, NULL);
+	}
 }
 | attribute LT value {
-	$$ = map_expr($1, '<', $3, NULL);
-	if ($$ == NULL) YYABORT;
+	if ($1 == NULL) {
+		$$ = NULL;
+	} else {
+		$$ = map_expr($1, '<', $3, NULL);
+	}
 }
 | attribute GT value {
-	$$ = map_expr($1, '>', $3, NULL);
-	if ($$ == NULL) YYABORT;
+	if ($1 == NULL) {
+		$$ = NULL;
+	} else {
+		$$ = map_expr($1, '>', $3, NULL);
+	}
 }
 | function {
 	$$ = $1;
@@ -174,8 +214,11 @@ attribute EQUAL value {
 
 function:
 FUNC_INRANGE OBRACE attribute COMMA WORD COMMA WORD CBRACE {
-	$$ = map_expr($3, '~', $5, $7);
-	if ($$ == NULL) YYABORT;
+	if ($3 == NULL) {
+		$$ = NULL;
+	} else {
+		$$ = map_expr($3, '~', $5, $7);
+	}
 };
 
 attribute:
@@ -183,7 +226,11 @@ WORD {
 	$$ = es_map_sl_attr(global_es_parser_state->frame,
 			    global_es_parser_state->kmd_map,
 			    $1);
-	if ($$ == NULL) YYABORT;
+	if ($$ == NULL &&
+	    !global_es_parser_state->ignore_unknown_attribute)
+	{
+		YYABORT;
+	}
 };
 
 value:
@@ -247,7 +294,10 @@ static char *map_type(const struct es_attr_map *attr,
 
 	mime_type_list = es_map_sl_type(s->mime_map, val);
 	if (mime_type_list == NULL) {
-		DBG_ERR("Mapping type [%s] failed\n", val);
+		DBG_DEBUG("Mapping type [%s] failed\n", val);
+		if (!s->ignore_unknown_type) {
+			s->type_error = true;
+		}
 		return NULL;
 	}
 
@@ -553,8 +603,8 @@ static char *map_expr(const struct es_attr_map *attr,
 		break;
 	}
 	if (es == NULL) {
-		DBG_ERR("Mapping [%s %c %s (%s)] failed\n",
-			attr->name, op, val1, val2 ? val2 : "");
+		DBG_DEBUG("Mapping [%s %c %s (%s)] failed\n",
+			  attr->name, op, val1, val2 ? val2 : "");
 		return NULL;
 	}
 
@@ -603,6 +653,16 @@ bool map_spotlight_to_es_query(TALLOC_CTX *mem_ctx,
 		TALLOC_FREE(s.frame);
 		return false;
 	}
+
+	s.ignore_unknown_attribute = lp_parm_bool(GLOBAL_SECTION_SNUM,
+						  "elasticsearch",
+						  "ignore unknown attribute",
+						  false);
+	s.ignore_unknown_type = lp_parm_bool(GLOBAL_SECTION_SNUM,
+					     "elasticsearch",
+					     "ignore unknown type",
+					     false);
+
 	global_es_parser_state = &s;
 	result = mdsyylparse();
 	global_es_parser_state = NULL;
diff --git a/source3/rpc_server/mdssvc/test_mdsparser_es.c b/source3/rpc_server/mdssvc/test_mdsparser_es.c
index b5cb86db9a7..af2b8e64840 100644
--- a/source3/rpc_server/mdssvc/test_mdsparser_es.c
+++ b/source3/rpc_server/mdssvc/test_mdsparser_es.c
@@ -161,6 +161,40 @@ static struct {
 	}
 };
 
+static struct {
+	const char *mds;
+	const char *es;
+} map_ignore_failures[] = {
+	{
+		"*==\"Samba\"||foo==\"bar\"",
+		"(Samba)" PATH_QUERY_SUBEXPR
+	}, {
+		"*==\"Samba\"&&foo==\"bar\"",
+		"(Samba)" PATH_QUERY_SUBEXPR
+	}, {
+		"*==\"Samba\"||kMDItemContentType==\"666\"",
+		"(Samba)" PATH_QUERY_SUBEXPR
+	}, {
+		"*==\"Samba\"&&kMDItemContentType==\"666\"",
+		"(Samba)" PATH_QUERY_SUBEXPR
+	}, {
+		"*==\"Samba\"||foo==\"bar\"||kMDItemContentType==\"666\"",
+		"(Samba)" PATH_QUERY_SUBEXPR
+	}, {
+		"*==\"Samba\"&&foo==\"bar\"&&kMDItemContentType==\"666\"",
+		"(Samba)" PATH_QUERY_SUBEXPR
+	}, {
+		"foo==\"bar\"||kMDItemContentType==\"666\"||*==\"Samba\"||x!=\"6\"",
+		"(Samba)" PATH_QUERY_SUBEXPR
+	}, {
+		"*==\"Samba\"||InRange(foo,1,2)",
+		"(Samba)" PATH_QUERY_SUBEXPR
+	}, {
+		"*==\"Samba\"||foo==$time.iso(2018-10-01T10:00:00Z)",
+		"(Samba)" PATH_QUERY_SUBEXPR
+	}
+};
+
 static void test_mdsparser_es(void **state)
 {
 	TALLOC_CTX *frame = talloc_stackframe();
@@ -192,6 +226,26 @@ static void test_mdsparser_es(void **state)
 		assert_string_equal(es_query, map[i].es);
 	}
 
+	if (!lp_parm_bool(GLOBAL_SECTION_SNUM,
+			  "elasticsearch",
+			  "test mapping failures",
+			  false))
+	{
+		goto done;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(map_ignore_failures); i++) {
+		DBG_DEBUG("Mapping: %s\n", map_ignore_failures[i].mds);
+		ok = map_spotlight_to_es_query(frame,
+					       mappings,
+					       path_scope,
+					       map_ignore_failures[i].mds,
+					       &es_query);
+		assert_true(ok);
+		assert_string_equal(es_query, map_ignore_failures[i].es);
+	}
+
+done:
 	json_decref(mappings);
 	TALLOC_FREE(frame);
 }


-- 
Samba Shared Repository



More information about the samba-cvs mailing list