[distcc] patch for distcc linux2.2 FIN_WAIT1 kernel bug?

Martin Pool mbp at samba.org
Mon Oct 21 11:03:00 GMT 2002


The attached patch is supposed to fix the problem with sockets getting
hung in FIN_WAIT1 when using distcc on Linux 2.2.x kernels.  

I'd very much appreciate it if anyone who can reproduce the problem
could try applying this kernel patch and letting me know how it
goes.
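
To check whether a machine is affected before and after patching, one
option is to count the sockets sitting in FIN_WAIT1.  The following is
only a sketch, not part of the patch: it assumes the usual
/proc/net/tcp layout, where the fourth field ("st") is the socket state
in hex and 04 means FIN_WAIT1.  The function names tcp_line_state and
count_fin_wait1 are made up for illustration.

```c
#include <stdio.h>

/* Assumption: state code printed in the "st" column of /proc/net/tcp. */
#define ST_FIN_WAIT1 0x04

/* Return the state from one /proc/net/tcp line, or -1 if the line
 * does not parse (for example, the header line). */
int tcp_line_state(const char *line)
{
    unsigned int local_port, remote_port, state;
    char local_addr[33], remote_addr[33];

    if (sscanf(line, " %*d: %32[0-9A-Fa-f]:%x %32[0-9A-Fa-f]:%x %x",
               local_addr, &local_port,
               remote_addr, &remote_port, &state) != 5)
        return -1;
    return (int)state;
}

/* Count sockets currently in FIN_WAIT1.
 * Returns -1 if the file cannot be opened. */
int count_fin_wait1(const char *path)
{
    FILE *f = fopen(path, "r");
    char line[512];
    int n = 0;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f))
        if (tcp_line_state(line) == ST_FIN_WAIT1)
            n++;
    fclose(f);
    return n;
}
```

Calling count_fin_wait1("/proc/net/tcp") a few times during a build
should show whether the number keeps growing or drains back to zero.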

Thanks,
-- 
Martin 

----- Forwarded message from James Morris <jmorris at intercode.com.au> -----

From: James Morris <jmorris at intercode.com.au>
Subject: Re: FIN_WAIT1 / TCP_CORK / 2.2 -- reproducible bug and test case
Date: Fri, 11 Oct 2002 02:36:41 +1000 (EST)
To: Martin Pool <mbp at samba.org>
Cc: netdev at oss.sgi.com

Martin,

Below is a simplified version of your patch, which maintains the 
(skb->len < mss_now) check per Alexey's comments.

Note that I've not been able to reproduce the problem with network
configurations from 10-1000Mbps and various combinations of hosts, kernels
and 'server' applications.  I did run into what looked like stalled
ESTABLISHED/FIN_WAIT1 connections, although these turned out to be the
(valid) result of the server advertising zero windows after it ran out of
resources.  The FIN_WAIT1 states on the client were running timers and
performing zero window probes.  A further check showed that it didn't
matter whether the client sockets were corked or not.

Would you please give this patch some testing?  (It would be really great
if any of the RH 6.2 distcc users who reported problems could test it 
also).
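
For anyone putting together a quick test, the client-side pattern under
discussion looks roughly like the sketch below: cork the socket, queue
a write smaller than one MSS, and close without uncorking.  This is an
illustration under that assumption, not the original test case, and the
helper name corked_send_and_close is hypothetical.  TCP_CORK is
Linux-specific.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Cork the socket, queue one write, then close without uncorking.
 * Returns 0 on success, -1 on any error.  The descriptor is closed
 * in either case. */
int corked_send_and_close(int fd, const void *buf, size_t len)
{
    int on = 1;
    int rc = 0;

    if (setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof on) < 0)
        rc = -1;
    else if (write(fd, buf, len) != (ssize_t)len)
        rc = -1;

    /* close() queues the FIN behind whatever data is still pending. */
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

On a healthy kernel the peer should receive the data followed by EOF;
on an affected one the client socket would be expected to linger in
FIN_WAIT1 with the data unsent.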


Thanks,

- James
-- 
James Morris
<jmorris at intercode.com.au>

diff -urN -X dontdiff linux-2.2.22.orig/net/ipv4/tcp_output.c linux-2.2.22.corkfix/net/ipv4/tcp_output.c
--- linux-2.2.22.orig/net/ipv4/tcp_output.c	Fri May 10 12:10:08 2002
+++ linux-2.2.22.corkfix/net/ipv4/tcp_output.c	Fri Oct 11 01:22:33 2002
@@ -766,24 +766,7 @@
 		TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_FIN;
 		TCP_SKB_CB(skb)->end_seq++;
 		tp->write_seq++;
-
-		/* Special case to avoid Nagle bogosity.  If this
-		 * segment is the last segment, and it was queued
-		 * due to Nagle/SWS-avoidance, send it out now.
-		 */
-		if(tp->send_head == skb &&
-		   !sk->nonagle &&
-		   skb->len < (tp->mss_cache >> 1) &&
-		   tp->packets_out &&
-		   !(TCP_SKB_CB(skb)->flags & TCPCB_FLAG_URG)) {
-			update_send_head(sk);
-			TCP_SKB_CB(skb)->when = tcp_time_stamp;
-			tp->snd_nxt = TCP_SKB_CB(skb)->end_seq;
-			tp->packets_out++;
-			tcp_transmit_skb(sk, skb_clone(skb, GFP_ATOMIC));
-			if(!tcp_timer_is_set(sk, TIME_RETRANS))
-				tcp_reset_xmit_timer(sk, TIME_RETRANS, tp->rto);
-		}
+		tcp_push_pending_frames(sk, tp);
 	} else {
 		/* Socket is locked, keep trying until memory is available. */
 		for (;;) {

----- End forwarded message -----


