100% CPU load

Rusty Russell rusty at rustcorp.com.au
Mon Dec 21 01:53:02 MST 2009


On Sat, 19 Dec 2009 02:49:51 am Stefan (metze) Metzmacher wrote:
> Hi Ronnie,
> 
> I've found another problem with ctdbd spinning
> with 100% CPU load reading 0 bytes from the log child.
> 
> ctdb_log_handler()
> 
> gets n = 0 from read and log is log_state.
> 
> I think it's wrong that we don't test for n <= 0 and not error out.
> 
> I don't understand the code enough to fix it for the log == log_state.
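
To make the failure mode concrete: once every writer on a pipe has closed
its end, read() returns 0 and the fd polls as readable forever, so a handler
that ignores n == 0 gets called in a tight loop. Below is a minimal sketch of
the check Metze is asking for. It is not the ctdb code: log_fd_read_once()
and its parameters are hypothetical names made up for illustration, and what
ctdb_log_handler() should do for log == log_state is left open.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper, not part of ctdb: do one read from a log pipe.
 * Returns true if the fd should stay registered with the event loop,
 * false if the caller should stop polling it.  The raw read() result
 * is stored in *out_n. */
static bool log_fd_read_once(int fd, char *buf, size_t len, ssize_t *out_n)
{
	ssize_t n;

	assert(out_n != NULL);

	n = read(fd, buf, len);
	*out_n = n;

	if (n > 0)
		return true;	/* got data; keep watching the fd */
	if (n < 0 && (errno == EINTR || errno == EAGAIN))
		return true;	/* transient; retry on the next wakeup */

	/* n == 0 is EOF: every writer has closed its end.  The fd will
	 * poll as readable forever, so keeping it registered means
	 * spinning.  Any other n < 0 is a hard error. */
	return false;
}
```

The caller would drop the fd from its event loop when this returns false,
instead of being woken again immediately.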

The log_state fd is connected to stdout and stderr, so it should *never* be closed.
Hence Ronnie's patch description:

commit bcf494b81f4277dc75f05faccf0c446bd15f6e2b
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Tue Dec 15 19:04:52 2009 +1100

    This is a dodgy patch.
    
    I saw once where the master ctdbd logging structure was talloc freed
    which caused issues.
    So only free the structure if it is NOT the master structure.
    
    This needs to be looked into in more detail.

I've added a hack to log any close of fd 0, 1 or 2, and it doesn't trigger in
normal running (the parent closes them once to set up logging, and that's it):

commit 6bcf19397e33cc7e4c63bdd3688003ae3a3add5c
Author: Rusty Russell <rusty at rustcorp.com.au>
Date:   Mon Dec 21 15:34:28 2009 +1030

    patch who-is-closing-stderr.patch

diff --git a/lib/replace/replace.c b/lib/replace/replace.c
index cec158b..3b8dac8 100644
--- a/lib/replace/replace.c
+++ b/lib/replace/replace.c
@@ -621,3 +621,25 @@ int rep_socketpair(int d, int type, int protocol, int sv[2])
 	return pipe(sv);
 }
 #endif
+
+#undef close
+int _close(int fd)
+{
+	if (fd == 0 || fd == 1 || fd == 2) {
+		static int errorfd = -1;
+		char msg[100];
+		if (errorfd == -1) {
+			int lowfd = open("/tmp/close-errors.log",
+					 O_WRONLY|O_APPEND|O_CREAT, 0600);
+			if (lowfd < 0)
+				abort();
+			if (dup2(lowfd, 1000) < 0)
+				abort();
+			close(lowfd);
+			errorfd = 1000;
+		}
+		sprintf(msg, "Warning: %i closing fd %i!\n", getpid(), fd);
+		write(errorfd, msg, strlen(msg));
+	}
+	return close(fd);
+}
diff --git a/lib/replace/replace.h b/lib/replace/replace.h
index f8a89a7..ad471c4 100644
--- a/lib/replace/replace.h
+++ b/lib/replace/replace.h
@@ -546,4 +546,6 @@ typedef int bool;
 #define QSORT_CAST (int (*)(const void *, const void *))
 #endif
 
+#define close(x) _close(x)
+
 #endif /* _LIBREPLACE_REPLACE_H */
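
A possible variation on the hack above, as a standalone sketch rather than
what was committed: dup2(lowfd, 1000) fails if the fd limit is 1000 or lower,
and fcntl(F_DUPFD, 3) is one way to keep the log fd clear of 0-2 without
picking a fixed number. logged_close() and its logpath parameter are
hypothetical; unlike the patch, this version skips the log line instead of
calling abort() when the log file can't be opened.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical standalone variant of the close() wrapper in the patch.
 * Logs a warning to logpath whenever fd 0, 1 or 2 is closed, then
 * performs the real close() and returns its result. */
static int logged_close(int fd, const char *logpath)
{
	static int errorfd = -1;	/* cached log fd, opened on first use */

	assert(logpath != NULL);

	if (fd >= 0 && fd <= 2) {
		char msg[100];
		int len;

		if (errorfd == -1) {
			int lowfd = open(logpath,
					 O_WRONLY | O_APPEND | O_CREAT, 0600);
			if (lowfd >= 0) {
				/* F_DUPFD returns the lowest free fd >= 3, so
				 * the log can never land on a stdio fd. */
				errorfd = fcntl(lowfd, F_DUPFD, 3);
				close(lowfd);
			}
		}

		len = snprintf(msg, sizeof(msg), "Warning: %i closing fd %i!\n",
			       (int)getpid(), fd);
		if (errorfd >= 0 && len > 0) {
			if (write(errorfd, msg, (size_t)len) < 0) {
				/* best effort: nothing useful to do here */
			}
		}
	}
	return close(fd);
}
```

Swapping this in for _close() would only need the log path hardcoded again,
since the close(x) macro in replace.h passes a single argument.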

