rsync stops at system call select()

jian li lijian06nju at gmail.com
Fri Nov 12 19:51:05 MST 2010


Thanks to Mr K S Braunsdorf for the help.
The problem was caused by a pipe in the C++ program that was left open when
it should have been closed.
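
For anyone who hits the same symptom, here is a minimal sketch of one way to
keep a pipe from leaking into commands started with system(). It assumes a
POSIX system. The names make_private_pipe and ScopedPipe are hypothetical,
chosen for illustration; this is not the code from the actual scheduling
program. The strace output quoted below shows descriptors 1024 and 1025 in
use, which would be consistent with descriptors piling up over time and
eventually passing the usual select() limit of FD_SETSIZE (typically 1024 on
Linux), but that reading is an inference and not something confirmed in this
thread.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: create a pipe and mark both ends close-on-exec, so
// a shell started by system() (fork + exec) does not inherit them.
bool make_private_pipe(int fds[2]) {
    if (pipe(fds) != 0) {
        return false;
    }
    for (int i = 0; i < 2; ++i) {
        int flags = fcntl(fds[i], F_GETFD);
        if (flags == -1 || fcntl(fds[i], F_SETFD, flags | FD_CLOEXEC) == -1) {
            close(fds[0]);
            close(fds[1]);
            return false;
        }
    }
    return true;
}

// Hypothetical RAII wrapper: both ends are closed when the object goes out
// of scope, so a long-running process cannot accumulate open pipes.
class ScopedPipe {
public:
    ScopedPipe() : ok_(make_private_pipe(fds_)) {
        if (!ok_) {
            fds_[0] = fds_[1] = -1;
        }
    }
    ~ScopedPipe() {
        close_end(0);
        close_end(1);
    }
    ScopedPipe(const ScopedPipe&) = delete;
    ScopedPipe& operator=(const ScopedPipe&) = delete;

    bool ok() const { return ok_; }          // true if the pipe was created
    int read_fd() const { return fds_[0]; }
    int write_fd() const { return fds_[1]; }

    // Close one end early (0 = read end, 1 = write end); safe to call twice.
    void close_end(int i) {
        if (fds_[i] >= 0) {
            close(fds_[i]);
            fds_[i] = -1;
        }
    }

private:
    int fds_[2];
    bool ok_;
};
```

A caller would create a ScopedPipe inside the function that needs it, check
ok(), and use read_fd() and write_fd(). Both ends are closed when the object
leaves scope, and because of FD_CLOEXEC they are not passed on to a script
run with system() in the meantime.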


thanks
James Li


2010/11/11 jian li <lijian06nju at gmail.com>

> Hi all,
>        I need to back up about 50 files, whose size won't exceed 5 MB,
> to four remote machines every 10 to 15 minutes.
>
>        The backup command is in a shell script, which the scheduling
> program runs with the system() function. The scheduling program is written
> in C++.
>        The command is as follows:
>          rsync -az /home/admin/service/* admin@10.249.49.101:/news_hot_data
>
>        At first, this works just fine. But after about one or two days,
> rsync stops partway through and the whole backup process hangs.
>        The following is the output with the -vv option when the backup stops:
>          opening connection using: ssh -l admin 10.249.49.101 rsync --server -vvlogDtprze.isf . /news_hot_data
>
>        I used strace to trace the system calls and found that select() was
> called again and again, without end, until I killed the program with Ctrl+C.
> The following is the output:
>           ......
>           18477 select(1027, [255 1024], [], NULL, NULL) = 1 (in [1024])
>           18477 read(1024, "\36\0\0\0", 16384)    = 4
>           18477 select(1027, [255 1024], [255], NULL, NULL) = 1 (out [255])
>           18477 write(255, "]\306\304\2315\r\346\314\26]\2\275\350|X\305X\216\361\"\301}\t\34\213\357GPS\360\214\370"..., 48) = 48
>           18477 select(1027, [255 1024], [], NULL, NULL) = 1 (in [255])
>           18477 read(255, "l\210\377V\20\270\0270\276@\363N\366\n!\311\211\312\206\216\25\3\1\323\375\370\24\0174lM\312"..., 8192) = 48
>           18477 select(1027, [255 1024], [1025], NULL, NULL) = 1 (out [1025])
>           18477 write(1025, "\35\0\0\0\335\226\333L", 8) = 8
>           18477 select(1027, [255 1024], [], NULL, NULL <unfinished ...>
>           18476 <... select resumed> )            = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>           18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
>            ......
>            I read the Linux manual page for select() and understand that it
> waits for something to become ready, but I don't know what exactly it is
> waiting for, and whatever it is never becomes ready.
>            What confuses me even more is that when I run the script from a
> terminal, it still works!
>
>            I have been stuck on this for a few days and have searched Google
> for the cause without success.
>
>
> thanks
> James li
>

