rSync Issue with lots of small files

David A. Soussan das at dascomputerconsultants.com
Tue Apr 6 11:38:49 MDT 2010


Sorry about the repost. This was apparently stuck in my mail server  
and just unstuck itself when I rebooted.

On Apr 6, 2010, at 1:31 PM, "David A. Soussan" <das at dascomputerconsultants.com 
 > wrote:

> Wayne, et al:
>
> First posting, and I'm a n00b with rSync. I was having issues with  
> rSync
> from a PC (3.0.7 either via DeltaCopy or cwRsync) to a Freenas box
> (3.0.6). I've since replicated and isolated the problem attempting to
> sync PC to PC with various flavors of rSync on both sides.
>
> The executive summary is "daemon side processes are hanging up with  
> 100%
> cpu utilization when sending many small files."
>
> I can replicate this 100% of the time.
>
> What I've done:
>
> Created 2 directories with 20K files in each, 17 bytes long numbered
> 1.txt through 20000.txt. This is the source sync directory.
>
> Here was one of many iterations of copy lines, this one happened to  
> use
> DeltaCopy's version of rsync, but I can repeat it with versions of
> cwRsync as well:
>
> C:\DeltaCopy>rsync -rld -t --stats -v --timeout=360
> "/cygdrive/e/SyncTest" sassy::Sync
>
> System named Sassy is the target machine.
>
> Here is one of his lines when server side started by hand:
>
> C:\Program Files\ICW\Bin>rsync.exe --config ..\rsyncd.conf -vvvvvv
> --daemon --detach  -vvvvvvvv --log-file=e:\temp\rsync.txt
>
> Yes, I'm typing all those -vvvvv lines - trying to get some debug /
> verbosity out of the server side, which is where I believe the root of
> the problem is, and I'm not getting much in the way of debugging clues
> from the server end.
>
> I did some analysis of the network data with Wireshark on the client
> side.
>
> Started @ 12 seconds into the capture
> @ 13 seconds, started sending the file list.
> @ 27 seconds, file data starts flowing
> @ 41 seconds, I'm seeing the first TCP ZeroWindow on the sender, which
> frees up and stops a few times between this and the next line as the
> socket is full or empties on the server side.
> @ 122 seconds, the zero window happens and this time it no longer  
> frees
> up.
>
> The client's "-v" output screen shows it is sending file 13159.txt and
> is frozen until the timeout or I hit ^C.
>
> Meanwhile, back at the server side ...
>
> I'm monitoring the rsync server with ProcessMonitor from
> SysInternals.com (now Microsoft), so I can see all the file opens,
> reads, writes, etc.
>
> The last file written was 10819.txt, so he wrote 913 total files. Next
> time I'll zero-fill my file names so the counts are accurate. Sorted  
> in
> order, they go 1.txt, 10.txt, 100.txt, 1000.txt, 10000.txt, 10001.txt,
> ..., but that is why 10819 really means 913 files.
>
> The process on the server side is spinning and taking up as much CPU  
> as
> it can. I tried forcing affinity in case this was a deadlock type  
> issue,
> but that didn't fix the problem. I think one version of rSync had both
> spinning, and the one I'm looking at right now had only one of the
> processes spinning, but I can't say for sure.
>
> I can replicate this failure anytime and believe the lab setup
> replicates the failure in the production network. --bwlimit=<various
> numbers> did not improve the situation. The lowest was --bwlimit=10  
> with
> no change other than where the hang happens.
>
> Lab has been tried at 100 Mb/s, 1 Gb/s, and I'm very confident I don't
> have cable / network / switch issues as I'm seeing the same failure in
> both locations with compeltely different hardware in each world.
>
> I can get a stack trace from ProcessExplorer, and all I can tell is  
> the
> thread is "WaitForSingleObject" but not much else.
>
> Stack trace (from Process Explorer) of the spinning thread:
>
> ntkrnlpa.exe+0x8db2e
> ntkrnlpa.exe+0x29a82
> ntkrnlpa.exe+0x331a4
> ntkrnlpa.exe+0x312a9
> ntkrnlpa.exe+0x28c73
> ntkrnlpa.exe+0x29c72
> ntkrnlpa.exe+0x138d10
> ntkrnlpa.exe+0x897cc
> ntdll.dll!KiFastSystemCallRet
> kernel32.dll!WaitForSingleObject+0x12
> cygwin1.dll!cygwin_internal+0x25f3d
>
> I'd get more debug out of the server side if the -vvvv options would
> work, but they don't seem to on the server side nor can I find a way  
> to
> have the client pass more -vvvv onto the server and get more debug.
>
> Timeout or ^C will eventually kill the client, and the server's child
> processes do stop when I kill the client, but the logs on the server
> aren't chatty:
>
> Rsync.log on server:
>
> [--snip--]
> 2010/03/25 13:03:59 [3912] connect from uber.dascc2.local
> (192.168.2.148)
> [I sopped the server with ^C]
> 2010/03/25 13:23:07 [5500] [Receiver] _exit_cleanup(code=20,
> file=rsync.c, line=543): entered
> 2010/03/25 13:23:07 [5500] rsync error: received SIGINT, SIGTERM, or
> SIGHUP (code 20) at rsync.c(543) [Receiver=3.0.7]
> 2010/03/25 13:23:07 [5500] [Receiver] _exit_cleanup(code=20,
> file=rsync.c, line=543): about to call exit(20)
>
> Rsyncd.log on server:
> [--snip--]
> 2010/03/25 13:05:35 [3912] recv uber.dascc2.local [192.168.2.148] Sync
> () SyncTest/11774.txt 17
> 2010/03/25 13:05:36 [3912] recv uber.dascc2.local [192.168.2.148] Sync
> () SyncTest/11775.txt 17
> 2010/03/25 13:05:36 [3912] recv uber.dascc2.local [192.168.2.148] Sync
> () SyncTest/11776.txt 17
> [I stopped the client]
> 2010/03/25 13:22:39 [3912] rsync: writefd_unbuffered failed to write
> 4092 bytes to socket [generator]: Connection reset by peer (104)
> 2010/03/25 13:22:39 [3912] rsync error: error in rsync protocol data
> stream (code 12) at io.c(1539) [generator=3.0.7]
>
>
> I looked for this issue in the current bug list and found nothing.
>
> Short of setting up a development environment and compiling all the
> source to an exe so I can run the server side in the debugger, can you
> tell me what I might do to get more debug out of the server side and
> thus be able to give more details on this issue?
>
> Thanks!
> -----
> David Soussan
> das at dascomputerconsultants.com
> -- 
> Please use reply-all for most replies to avoid omitting the mailing  
> list.
> To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


More information about the rsync mailing list