samba_dnsupdate timeouts (was Re: [PATCH] python indent bugfix in dns_hub.py)
Tim Beale
timbeale at catalyst.net.nz
Tue Feb 5 04:27:57 UTC 2019
Yeah, it looks like changing the process model was enough to push the CI
runners over the edge fairly reliably.
I tried out a couple of ideas: reducing the prefork workers, reordering
the tests, tweaking the timeouts, etc. But it seems like the swap
problem just manifests itself elsewhere.
Attached is a patch that splits up the autobuild job. I wasn't sure if
we needed a py2 target as well, or whether we were dropping py2 support
from 4.11.
CI links:
- https://gitlab.com/samba-team/devel/samba/pipelines/46145771 (result
looks OK, but I had forgotten to update builddirs in the patch)
- https://gitlab.com/samba-team/devel/samba/pipelines/46152211 (just
kicked it off again now)
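
For reference, the arithmetic behind the `free` output quoted below can be checked with a small sketch. The numbers are copied from that output (assumed to be in KiB, the default for `free`); the helper names are made up for illustration and are not part of any Samba script.

```python
# Values from the system-info.txt `free` output quoted in this thread (KiB assumed).
MEM_TOTAL = 3784700
MEM_USED = 685648
MEM_FREE = 3099052
BUFFERS = 8636
CACHED = 307940
SWAP_USED = 166420


def used_excluding_cache(used, buffers, cached):
    """Memory held by processes once reclaimable buffers/cache are subtracted."""
    return used - buffers - cached


def free_including_cache(free, buffers, cached):
    """Memory that could be made available by dropping buffers/cache."""
    return free + buffers + cached


if __name__ == "__main__":
    # These two figures correspond to the "-/+ buffers/cache" row.
    print(used_excluding_cache(MEM_USED, BUFFERS, CACHED))
    print(free_including_cache(MEM_FREE, BUFFERS, CACHED))
    # Non-zero swap use on a < 4GB machine is the hint that RAM ran out earlier.
    print(SWAP_USED > 0)
```

Note this snapshot was taken after the run finished, so it only shows the leftover swap usage, not the peak.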
On 2/02/19 7:42 AM, Andrew Bartlett wrote:
> On Fri, 2019-02-01 at 16:20 +0100, Stefan Metzmacher wrote:
>> Hi,
>> Can this be a really busy gitlab shared runner that is swapping?
> I was thinking something like that.
>
> Some of the shared runners are quite resource constrained, and any
> python script that imports Samba's 'samba' python package will load
> quite a bit of code due to the size of our bindings.
>
> Looking at the system-info.txt:
>
> ### free
>              total       used       free     shared    buffers     cached
> Mem:       3784700     685648    3099052     169564       8636     307940
> -/+ buffers/cache:     369072    3415628
> Swap:      8388600     166420    8222180
>
>
> This is after it finishes, but I would suggest that having pushed into
> swap at all it is likely that the < 4GB of ram has already been
> consumed.
>
>> Does anyone have any idea?
> I asked Joe to see if the imports could be made faster.
>
>> Maybe giving samba_dnsupdate a 60 second timeout would be a workaround?
> That could work, but I think the swap storm is likely to just hit
> something else.
>
> We can also split up samba-ad-dc-2 into samba-ad-dc-3 by putting the
> backup and restore DCs tests in their own autobuild jobs. This would
> make sn-devel more busy (more compiles) but be cheap on the shared
> runners (free CI).
>
> Or we could rework selftest.pl to sort jobs by environment and shut
> down the environment after the last job that needs it (as determined by
> the dep tree we now have).
>
> How does that sound for medium and long-term plans?
>
> Finally, I did merge Tim's patch to change the default process model,
> so that will have changed the memory behaviour a lot.
>
> Andrew Bartlett
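
The longer-term idea above (sort jobs by environment, then shut each environment down after the last job that needs it) might look roughly like the sketch below. This is only an illustration in Python, not how selftest.pl works today: the function names, the `(test, env)` job tuples and the `deps` mapping are all assumptions, and the dependency of restoredc on backupfromdc in the example is used purely for demonstration.

```python
def transitive_deps(env, deps):
    """Return env plus every environment it depends on, directly or indirectly."""
    seen = set()
    stack = [env]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(deps.get(current, ()))
    return seen


def plan(jobs, deps):
    """Order (test, env) jobs by environment and add ("stop", env) steps.

    An environment is stopped right after the last job that needs it,
    either directly or because the job's environment depends on it.
    """
    # sorted() is stable, so tests within one environment keep their order.
    ordered = sorted(jobs, key=lambda job: job[1])

    # Index of the last job that (transitively) needs each environment.
    last_use = {}
    for index, (_test, env) in enumerate(ordered):
        for needed in transitive_deps(env, deps):
            last_use[needed] = index

    steps = []
    for index, (test, env) in enumerate(ordered):
        steps.append(("run", test, env))
        for done in sorted(e for e, last in last_use.items() if last == index):
            steps.append(("stop", done))
    return steps


if __name__ == "__main__":
    example_jobs = [("t1", "restoredc"), ("t2", "ad_dc"),
                    ("t3", "backupfromdc"), ("t4", "ad_dc")]
    example_deps = {"restoredc": ["backupfromdc"]}  # hypothetical dep tree
    for step in plan(example_jobs, example_deps):
        print(step)
```

In the example, ad_dc is stopped before the backup/restore environments start, so fewer DCs are resident at the same time.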
-------------- next part --------------
From 758d13c53bdee4c7ead741a2c115553fc5db284e Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Tue, 5 Feb 2019 15:17:03 +1300
Subject: [PATCH] autobuild: Split backup/restore testenvs out into separate
job
The samba-ad-dc-2 job was reaching its limits with the number of
testenvs and what the resource-limited CI machines can handle.
Samba processes were getting swapped out of memory, causing CI runs
to fail.
This patch splits the backup/restore testenv targets into a separate
autobuild job: samba-ad-dc-backup.
Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
.gitlab-ci.yml | 5 +++++
script/autobuild.py | 12 ++++++++++++
2 files changed, 17 insertions(+)
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 5cc2103..908c29e 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -67,6 +67,11 @@ build_samba_ad_dc_2:
# this one takes about 1 hours to finish
- script/autobuild.py samba-ad-dc-2 --verbose --nocleanup --keeplogs --tail --testbase /tmp/samba-testbase
+build_samba_ad_dc_backup:
+ <<: *shared_template
+ script:
+ - script/autobuild.py samba-ad-dc-backup --verbose --nocleanup --keeplogs --tail --testbase /tmp/samba-testbase
+
build_samba_ad_dc_2_py2:
<<: *shared_template
script:
diff --git a/script/autobuild.py b/script/autobuild.py
index 2ea9e55..00f0d22 100755
--- a/script/autobuild.py
+++ b/script/autobuild.py
@@ -51,6 +51,7 @@ builddirs = {
"samba-ad-dc-py2": ".",
"samba-ad-dc-2": ".",
"samba-ad-dc-2-py2": ".",
+ "samba-ad-dc-backup": ".",
"samba-systemkrb5": ".",
"samba-nopython": ".",
"samba-buildpy2-only": ".",
@@ -166,6 +167,17 @@ tasks = {
"--include-env=vampire_2000_dc "
"--include-env=fl2000dc "
"--include-env=ad_dc_no_nss "
+ "'",
+ "text/plain"),
+ ("check-clean-tree", "script/clean-source-tree.sh", "text/plain")],
+
+ # run the backup/restore testenvs separately as they're fairly standalone
+ # (and CI seems to max out at ~8 different DCs running at once)
+ "samba-ad-dc-backup": [("random-sleep", "script/random-sleep.sh 60 600", "text/plain"),
+ ("configure", "./configure.developer --with-selftest-prefix=./bin/ab" + samba_configure_params, "text/plain"),
+ ("make", "make -j", "text/plain"),
+ ("test", "make test FAIL_IMMEDIATELY=1 "
+ "TESTS='${PY3_ONLY}"
"--include-env=backupfromdc "
"--include-env=restoredc "
"--include-env=renamedc "
--
2.7.4