DO NOT REPLY [Bug 7757] New: with big file, rsync times out when it should not; the sender is still responsive
samba-bugs at samba.org
Tue Oct 26 23:22:37 MDT 2010
https://bugzilla.samba.org/show_bug.cgi?id=7757
Summary: with big file, rsync times out when it should not;
the sender is still responsive
Product: rsync
Version: 3.0.7
Platform: Other
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P3
Component: core
AssignedTo: wayned at samba.org
ReportedBy: tim.liim at alcatel-lucent.com
QAContact: rsync-qa at samba.org
Description of problem:
When sending a big file (e.g. > 500MB) with the --timeout=xx option, the
server side of rsync times out when it should not. The components
of rsync (generator, receiver, or server) should exchange periodic
keepalive messages so that they do not time out when the connection is
perfectly healthy.
Version-Release number of selected component (if applicable):
rsync-3.0.7-3.fc13.i686
How reproducible:
With --timeout=10: sometimes with a 500MB file; almost always with a 2GB file.
Steps to Reproduce:
1. Assume timeout=10 sec. Generate the source file, e.g.
i=/tmp/t55
time dd if=/dev/zero of=$i bs=1M count=1000
If it takes less than <timeout> x 3 (e.g. 30 sec), increase
the file size. A bigger file better illustrates this issue.
2. j=/tmp/t56
rsync --timeout=10 localhost:$i $j
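The two steps above can be wrapped in a small script. This is only a sketch: the function names (file_too_small, make_source, reproduce) are made up for illustration and are not part of rsync, and it assumes GNU dd and an rsync-capable login to localhost.

```shell
#!/bin/sh
# Hypothetical helpers for the reproduction steps; not part of rsync.

# file_too_small ELAPSED TIMEOUT
# Succeeds when creating the file took less than 3 x TIMEOUT seconds,
# i.e. the file is probably too small to show the problem.
file_too_small() {
    [ "$1" -lt $(( $2 * 3 )) ]
}

# make_source PATH SIZE_MB
# Writes SIZE_MB megabytes of zeros to PATH and prints the elapsed seconds.
make_source() {
    ms_start=$(date +%s)
    dd if=/dev/zero of="$1" bs=1M count="$2" 2>/dev/null
    ms_end=$(date +%s)
    echo $(( ms_end - ms_start ))
}

# reproduce TIMEOUT SRC DST [SIZE_MB]
# Creates SRC, warns if it was written too quickly, then runs the transfer.
reproduce() {
    rp_elapsed=$(make_source "$2" "${4:-1000}")
    if file_too_small "$rp_elapsed" "$1"; then
        echo "source written in ${rp_elapsed}s; consider a bigger file" >&2
    fi
    rsync --timeout="$1" "localhost:$2" "$3"
}
```

With these definitions, `reproduce 10 /tmp/t55 /tmp/t56` corresponds to the steps above, and `reproduce 10 /tmp/t55 /tmp/t56 2500` uses a 2.5GB file instead.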
Actual results:
One of the following errors appears, depending on the mode of failure.
#1
[sender] io timeout after 10 seconds -- exiting
rsync error: timeout in data send/receive (code 30) at io.c(140)
[sender=3.0.7]
#2
rsync: writefd_unbuffered failed to write 6 bytes to socket [generator]:
Broken pipe (32)
rsync error: timeout in data send/receive (code 30) at io.c(1530)
[generator=3.0.7]
rsync error: received SIGUSR1 (code 19) at main.c(1288) [receiver=3.0.7]
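When scripting the reproduction, it can help to turn the exit status into the label rsync prints. The function below is a hypothetical helper (the name describe_rsync_exit is invented here); the code-to-label pairs are taken only from the log lines quoted in this report, so it is not a complete list of rsync exit codes.

```shell
# describe_rsync_exit CODE
# Prints the label that the logs in this report attach to CODE.
# Hypothetical helper; covers only the codes quoted in this report.
describe_rsync_exit() {
    case "$1" in
        0)   echo "success" ;;
        12)  echo "error in rsync protocol data stream" ;;
        19)  echo "received SIGUSR1" ;;
        30)  echo "timeout in data send/receive" ;;
        255) echo "unexplained error" ;;
        *)   echo "other error ($1)" ;;
    esac
}
```

For example, running `describe_rsync_exit $?` immediately after the rsync command from step 2 reports whether the run ended in the code 30 timeout.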
Expected results:
The sync should complete without error; certainly no timeout.
Additional info:
1. The issue is more pronounced with two hosts of different speed,
e.g. h1 (dest) is much slower than h2 (src) (h1 could be a
slower machine, especially an I/O-bound one using LUKS; this can be
simulated by running strace on h1:receiver).
i=/tmp/t55 j=/tmp/t56
h2: dd if=/dev/zero of=$i bs=1M count=1000
h1: rsync --timeout 10 h2:$i $j
2. From strace, the scenario with the error message in #2:
- generator <g> reads <dest> and sends checksums to
sender <s> (--server in this case).
- <g> spawns receiver <r>.
- <r> opens the <dest> and <tmp> files and starts copying from <dest> to <tmp>.
- <s> reads the <src> file and sends messages to <r>.
- when <s> finishes, it starts to wait, for (timeout/2) sec.
- Assuming <r> is slower, <s> times out after <timeout> sec.
- <g> probably needs to send keepalive messages to <s>.
3. The strace for #1 is similar to #2:
- after <s> finishes, <r> takes more than 10 sec to finish
copying <dest> to <tmp> and then renaming <tmp> to <dest>,
so <s> times out.
4. The issue is somewhat reproducible on a single host. But
with strace on both <s> and <r>, it becomes harder to
reproduce, probably because strace slows both <s> and <r> down.
I needed to use a much bigger file (2.5GB), and to strace <r> from
the beginning but <s> only towards the end, to show this
error with strace.
5. There are several existing bugs that may be related to this
unexpected timeout issue.
bug2783
Random high loads during syncs (server side) / client stream errors
rsync: connection unexpectedly closed (2761332 bytes received so far)
[generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(365)
bug5478
rsync: writefd_unbuffered failed to write 4092 bytes [sender]:
Broken pipe (32)
rsync: writefd_unbuffered failed to write 4092 bytes [sender]:
Broken pipe (32)
io timeout after 30 seconds -- exiting
rsync error: timeout in data send/receive (code 30) at io.c(239)
[sender=3.0.2]
bug5695
improve keep-alive code to handle long-running directory scans
./io.c:void maybe_send_keepalive(void)
bug6175
write last transfer status when timeout or other error happens
rsync: writefd_unbuffered failed to write 4 bytes [sender]:
Broken pipe (32)
rsync: connection unexpectedly closed (99113 bytes received so
far) [sender]
rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.5]
bug7195
timeout reached while sending checksums for very large files
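The timing argument in items 2 and 3 of the additional info can be written down as a toy model. This is not rsync code: the function name and its parameters are hypothetical, times are whole seconds since the start of the transfer, and a "keepalive" here simply means any message that reaches <s> at a fixed interval while it waits.

```shell
# sender_times_out SENDER_DONE RECEIVER_DONE TIMEOUT KEEPALIVE
# Toy model (hypothetical, not rsync code). Succeeds when <s> would time out.
#   SENDER_DONE   second at which <s> finished sending
#   RECEIVER_DONE second at which <r> finished writing and renaming <tmp>
#   TIMEOUT       the --timeout value in seconds
#   KEEPALIVE     interval of messages reaching <s> while it waits; 0 = none
sender_times_out() {
    sto_idle=$(( $2 - $1 ))
    # Longest stretch in which <s> hears nothing from its peer.
    sto_gap=$sto_idle
    if [ "$4" -gt 0 ] && [ "$4" -lt "$sto_idle" ]; then
        sto_gap=$4
    fi
    [ "$sto_gap" -gt "$3" ]
}
```

For instance, with <s> done at 40 sec, <r> done at 55 sec and --timeout=10, `sender_times_out 40 55 10 0` succeeds (a timeout), whereas `sender_times_out 40 55 10 5` fails, because in the model a message every 5 sec keeps the silent gap below the timeout.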