[SCM] Samba Shared Repository - branch master updated

Andrew Bartlett abartlet at samba.org
Fri Dec 3 18:54:01 UTC 2021


The branch, master has been updated
       via  dab828f63c0 pytest/source_char: check for mixed direction text
       via  0f7e58b0e29 samba-tool domain backup: backup but do not follow symlinks
       via  697abc15ea5 samba-tool domain backup: cope better with dangling symlinks
      from  5e3df5f9ee6 smbd: s3-dsgetdcname: handle num_ips == 0

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit dab828f63c0a6bf0bb96920fd36383f6cbe43179
Author: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
Date:   Wed Nov 17 20:17:53 2021 +0000

    pytest/source_char: check for mixed direction text
    
    As pointed out in https://lwn.net/Articles/875964, forbidding bidi
    marker characters is not always going to be enough to avoid
    right-to-left vs left-to-right confusion. Consider this:
    
    $ python -c's = "b = x  # 2 * n * m"; print(s); print(s.replace("x", "א").replace("n", "ח"))'
    
    b = x  # 2 * n * m
    b = א  # 2 * ח * m
    
    Those two lines are semantically the same, with the Hebrew letters
    "א" and "ח" replacing "x" and "n". But they look like they mean
    different things.
    
    It is not enough to say we only allow these scripts (or indeed
    non-ascii) in strings and comments, as demonstrated in this example:
    
    $ python -c's = "b = \"x#\"  #  n"; print(s); print(s.replace("x", "א").replace("n", "ח"))'
    
    b = "x#"  #  n
    b = "א#"  #  ח
    
    where the second line is visually disordered but looks valid. Any series
    of neutral characters between two RTL characters will be reversed (and
    possibly mirrored).
    
    In practice this affects one file, which is a text file for testing
    unicode normalisation.
    
    I think, for the reasons shown above, we are unlikely to see legitimate
    RTL code outside perhaps of documentation files — but if we do, we can
    add those files to the allow-list.
    
    Signed-off-by: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
    Reviewed-by: Andrew Bartlett <abartlet at samba.org>
    
    Autobuild-User(master): Andrew Bartlett <abartlet at samba.org>
    Autobuild-Date(master): Fri Dec  3 18:53:43 UTC 2021 on sn-devel-184
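
A minimal standalone sketch of the direction check described in the commit
above. It is an illustration, not the test itself: the helper names
bidi_classes and mixes_directions are hypothetical, and the Hebrew letters
are written as escapes so the example displays in a single direction.

```python
import unicodedata


def bidi_classes(text):
    """Return the set of Unicode bidirectional classes used in text."""
    return {unicodedata.bidirectional(c) for c in set(text)}


def mixes_directions(text):
    """True if text has both strong LTR ('L') and strong RTL ('R') characters."""
    classes = bidi_classes(text)
    return 'L' in classes and 'R' in classes


# The first example from the commit message: "x" and "n" replaced by
# HEBREW LETTER ALEF (U+05D0) and HEBREW LETTER HET (U+05D7).
ltr_only = 'b = x  # 2 * n * m'
mixed = 'b = \u05d0  # 2 * \u05d7 * m'

print(mixes_directions(ltr_only))
print(mixes_directions(mixed))
```

Like the check in the diff, this only looks at the 'L' and 'R' classes.
Arabic letters have the class 'AL', so a file mixing only Latin and Arabic
text would not be flagged by this sketch.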

commit 0f7e58b0e29778711d3385adbba957c175c3bdef
Author: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
Date:   Wed Dec 1 10:20:48 2021 +1300

    samba-tool domain backup: backup but do not follow symlinks
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14918
    
    Signed-off-by: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
    Reviewed-by: Andrew Bartlett <abartlet at samba.org>

commit 697abc15ea50e9069eb483fdd734588281bae123
Author: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
Date:   Thu Nov 25 09:26:54 2021 +1300

    samba-tool domain backup: cope better with dangling symlinks
    
    Our previous behaviour was to try to os.stat() the non-existent
    target.
    
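    The new code calls os.stat() with follow_symlinks=False and looks up
    the (st_ino, st_dev) pair in a set, rather than calling
    os.path.samefile() against every file already collected.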
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14918
    
    Signed-off-by: Douglas Bagnall <douglas.bagnall at catalyst.net.nz>
    Reviewed-by: Andrew Bartlett <abartlet at samba.org>
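
A rough sketch of the de-duplication approach used in the two backup
commits, outside of samba-tool. The function name unique_paths and the
temporary file names are hypothetical, and the example assumes a POSIX
filesystem that supports hard links and symlinks. Unlike the real code it
skips missing files silently instead of logging a warning.

```python
import os
import tempfile


def unique_paths(paths):
    """Return paths with hardlink duplicates and missing files dropped."""
    seen = set()
    unique = []
    for path in paths:
        try:
            # Stat the link itself, so a dangling symlink does not raise.
            st = os.stat(path, follow_symlinks=False)
        except FileNotFoundError:
            continue
        key = (st.st_ino, st.st_dev)
        if key in seen:
            continue
        seen.add(key)
        unique.append(path)
    return unique


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, 'target')
    hardlink = os.path.join(d, 'hardlink')
    dangling = os.path.join(d, 'dangling')
    missing = os.path.join(d, 'missing')
    open(target, 'w').close()
    os.link(target, hardlink)                          # same inode as target
    os.symlink(os.path.join(d, 'nowhere'), dangling)   # target does not exist
    result = unique_paths([target, hardlink, dangling, missing])
    names = [os.path.basename(p) for p in result]
    print(names)
```

The set lookup is constant time per file, whereas comparing each new path
against every earlier one with os.path.samefile() grows with the number of
files already collected.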

-----------------------------------------------------------------------

Summary of changes:
 python/samba/netcmd/domain_backup.py | 10 +++++++++-
 python/samba/tests/source_chars.py   | 29 +++++++++++++++++++++++++++++
 testdata/source-chars-bidi.py        | 24 ++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 1 deletion(-)
 create mode 100644 testdata/source-chars-bidi.py


Changeset truncated at 500 lines:

diff --git a/python/samba/netcmd/domain_backup.py b/python/samba/netcmd/domain_backup.py
index 81738196385..6cb0e512595 100644
--- a/python/samba/netcmd/domain_backup.py
+++ b/python/samba/netcmd/domain_backup.py
@@ -1109,6 +1109,7 @@ class cmd_domain_backup_offline(samba.netcmd.Command):
 
         # Recursively get all file paths in the backup directories
         all_files = []
+        all_stats = set()
         for backup_dir in backup_dirs:
             for (working_dir, _, filenames) in os.walk(backup_dir):
                 if working_dir.startswith(paths.sysvol):
@@ -1126,7 +1127,13 @@ class cmd_domain_backup_offline(samba.netcmd.Command):
                     # Ignore files that have already been added. This prevents
                     # duplicates if one backup dir is a subdirectory of another,
                     # or if backup dirs contain hardlinks.
-                    if any(os.path.samefile(full_path, file) for file in all_files):
+                    try:
+                        s = os.stat(full_path, follow_symlinks=False)
+                    except FileNotFoundError:
+                        logger.warning(f"{full_path} does not exist!")
+                        continue
+
+                    if (s.st_ino, s.st_dev) in all_stats:
                         continue
 
                     # Assume existing backup files are from a previous backup.
@@ -1140,6 +1147,7 @@ class cmd_domain_backup_offline(samba.netcmd.Command):
                         continue
 
                     all_files.append(full_path)
+                    all_stats.add((s.st_ino, s.st_dev))
 
         # We would prefer to open with FLG_RDONLY but then we can't
         # start a transaction which is the strong isolation we want
diff --git a/python/samba/tests/source_chars.py b/python/samba/tests/source_chars.py
index db7f131d815..093d7318cb0 100644
--- a/python/samba/tests/source_chars.py
+++ b/python/samba/tests/source_chars.py
@@ -94,6 +94,14 @@ SAFE_FORMAT_CHARS = {
     '\ufeff'
 }
 
+# These files legitimately mix left-to-right and right-to-left text.
+# In the real world mixing directions would be normal in bilingual
+# documents, but it is rare in Samba source code.
+BIDI_FILES = {
+    'source4/heimdal/lib/wind/NormalizationTest.txt',
+    'testdata/source-chars-bidi.py',
+}
+
 
 def get_git_files():
     try:
@@ -196,9 +204,15 @@ class CharacterTests(TestCase):
                 else:
                     self.fail(f"could not decode {name}: {e}")
 
+            dirs = set()
             for c in set(s):
                 if is_bad_char(c):
                     self.fail(f"{name} has potentially bad format characters!")
+                dirs.add(u.bidirectional(c))
+
+            if 'L' in dirs and 'R' in dirs:
+                if name not in BIDI_FILES:
+                    self.fail(f"{name} has LTR and RTL text ({dirs})")
 
     def test_unexpected_format_chars_do_fail(self):
         """Test the test"""
@@ -212,6 +226,21 @@ class CharacterTests(TestCase):
             bad_chars = [c for c in chars if is_bad_char(c)]
             self.assertEqual(len(bad_chars), n_bad)
 
+    def test_unexpected_bidi_fails(self):
+        """Test the test"""
+        for name in [
+                'testdata/source-chars-bidi.py'
+        ]:
+            fullname = os.path.join(ROOT, name)
+            with open(fullname) as f:
+                s = f.read()
+
+            dirs = set()
+            for c in set(s):
+                dirs.add(u.bidirectional(c))
+            self.assertIn('L', dirs)
+            self.assertIn('R', dirs)
+
 
 def check_file_text():
     """If called directly as a script, count the found characters."""
diff --git a/testdata/source-chars-bidi.py b/testdata/source-chars-bidi.py
new file mode 100644
index 00000000000..d728da503da
--- /dev/null
+++ b/testdata/source-chars-bidi.py
@@ -0,0 +1,24 @@
+# Used in samba.tests.source_chars to ensure bi-directional text is
+# caught. (make test TESTS=samba.tests.source_chars)
+
+x = א =2
+ח = n = 3
+
+a = x  # 2 * n * m
+b = א  # 2 * ח * m
+c = "x#"  #  n
+d = "א#"  #  ח
+e = f"x{x}n{n}"
+f = f"א{א}ח{ח}"
+
+print(a)
+print(b)
+print(c)
+print(d)
+print(e)
+print(f)
+
+assert a == b
+assert c == d.replace("א", "x")
+assert e[1] == f[1]
+assert e[3] == f[3]


-- 
Samba Shared Repository


