[Bug 2654] timeout is always triggered with 2.6.4

Wed Apr 27 09:54:42 GMT 2005

https://bugzilla.samba.org/show_bug.cgi?id=2654

------- Additional Comments From f+samba at congenio.de  2005-04-27 02:54 -------
(In reply to comment #1)
> The generator's setting of the ignore_timeout variable is what allows the I/O
> code to continue updating last_io (the old method of clearing io_timeout in the
> generator used to leave last_io unchanged, which is no longer desirable).
> 

Yup, I was wrong with my first guess. I tried modifying the source to reset
last_io when ignore_timeout is set to 1 and the error did not go away. last_io
DOES get modified, but there is more than one last_io, as I found out.

> How short is your timeout?  Are you using a delete option?  I tried some tests,
> and couldn't duplicate the problem, so I'll need more details.

The timeout was big enough (3600 seconds). There was no delete option, but a
fuzzy option was specified. The files transferred were huge (800 MByte over a
512 KBit/s link) and had been changed only in the last portion (data was
appended). The timeout value did not matter, BTW: when I used 300 (and 1200)
seconds, the timeout occurred shortly after that time, too.

To isolate the problem, I added rprintf() calls to the statements where last_io
is being modified. Since there are always two rsync client processes, each
holding its own instance of last_io, it looks like, with my kind of files, there
were long sequences of read_timeout() activity that were not interspersed with
writefd_unbuffered() calls.

I could see that last_io was set in read_timeout() over and over again and that
when check_timeout() was getting called, its instance of last_io (which
definitely had been set by writefd_unbuffered, as I could glean from the value)
was outdated.

I think this has to do with these statements, which I don't completely understand:

read_timeout():

                if (io_timeout && fd == sock_f_in)
                        last_io = time(NULL);

writefd_unbuffered():

                if (fd == sock_f_out) {
                        if (io_timeout)
                                last_io = time(NULL);
                        sleep_for_bwlimit(ret);
                }

Each "side" (i.e. process) seems to watch only one direction of the data flow?
Since both sides are separate, it seems that overly long pauses can arise in
special situations (such as mine): data flows in one direction only, and for a
given process that is the "unwatched" direction, so its last_io instance is
never refreshed.
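
The suspected mechanism can be modelled in a few lines of stand-alone C. This
is only a sketch, not rsync source: struct proc_state, note_read(),
note_write(), timed_out() and the demo function are names made up for
illustration, the descriptor numbers are arbitrary, and timed_out() is a
simplified stand-in for check_timeout(). It assumes one process only ever
writes to the socket while the other only reads from it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-process timeout state; each process has its own copy. */
struct proc_state {
    int sock_f_in;     /* watched input fd, or -1 if none (assumption) */
    int sock_f_out;    /* watched output fd, or -1 if none (assumption) */
    time_t io_timeout; /* 0 disables the timeout */
    time_t last_io;    /* last time a watched fd saw traffic */
};

/* Mirrors the quoted read_timeout() test: only reads on sock_f_in count. */
void note_read(struct proc_state *p, int fd, time_t now)
{
    assert(p != NULL);
    if (p->io_timeout && fd == p->sock_f_in)
        p->last_io = now;
}

/* Mirrors the quoted writefd_unbuffered() test: only writes on sock_f_out count. */
void note_write(struct proc_state *p, int fd, time_t now)
{
    assert(p != NULL);
    if (p->io_timeout && fd == p->sock_f_out)
        p->last_io = now;
}

/* Simplified stand-in for check_timeout(): idle time has reached io_timeout. */
bool timed_out(const struct proc_state *p, time_t now)
{
    assert(p != NULL);
    return p->io_timeout && now - p->last_io >= p->io_timeout;
}

/*
 * The writing process writes once at t0; afterwards data arrives once per
 * second in the reading process only. Returns true if the reader still looks
 * alive while the writer's private last_io has gone stale.
 */
bool writer_times_out_during_one_way_flow(time_t timeout, time_t seconds)
{
    const time_t t0 = 1000;
    struct proc_state writer = { -1, 4, timeout, t0 };
    struct proc_state reader = { 3, -1, timeout, t0 };

    note_write(&writer, 4, t0);
    for (time_t t = t0 + 1; t <= t0 + seconds; t++)
        note_read(&reader, 3, t); /* refreshes the reader's copy only */

    return !timed_out(&reader, t0 + seconds) && timed_out(&writer, t0 + seconds);
}
```

With a timeout of 300, writer_times_out_during_one_way_flow(300, 300) returns
true in this model: the connection is busy the whole time, yet the writing
process sees 300 idle seconds in its own last_io.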
