rsync proxy

Sun Aug 30 22:25:07 MDT 2009

On 31/08/2009, at 1:24 PM, Matt McCutchen wrote:

> On Wed, 2009-08-26 at 22:12 +1200, Nathan Ward wrote:
>> I'm trying to write an rsync 'proxy' of sorts. The plan is that my
>> code runs on two machines (one 'client' and one 'server') and each
>> piece of code executes a copy of rsync, and copies move in one
>> direction (server -> client).
>>
>> I have been able to run rsync on the 'server' end by calling it with
>> --server --sender and so on. On the client end I have rsync call my
>> code with -e "my_code", however I am trying to make it so that on the
>> 'client' end, I can have my code call rsync, instead of the other way
>> around.
>>
>> When I call --server on the 'client' end, rsync seems to handshake OK,
>> but I get buffer overflow errors:
>> <snip>
>> ERROR: buffer overflow in recv_rules [sender]
>> rsync error: error allocating core memory buffers (code 22) at /SourceCache/rsync/rsync-35.2/rsync/util.c(121) [sender=2.6.9]
>> </snip>
>>
>> The above is sent from the 'server' to the 'client'.
>>
>> Before I go delving in to the code, is --server supposed to be used in
>> this way? I am basically attempting to join two rsync processes both
>> running --server, but only one running --sender.
>
> No, that will not work.  The rsync protocol requires one client and one
> server.

Ok, I wasn't sure whether client vs. server was inferred from the
inclusion/exclusion of the --sender parameter or not. It makes sense
that it is not.

> See https://bugzilla.samba.org/show_bug.cgi?id=5220 for some ideas on
> how to call an rsync client from your code and get it to use your
> existing connection.

Ok, interesting.

I'm currently more or less doing what you describe in comment #2 on
that bug, as a stopgap. Ideally I could use a stock rsync, I think,
though maybe I can include a patched one with my tool. Then again,
it's not that important: it would make performance a little better, but
the bottleneck here is the network. Something to ponder, anyway.

>> The background here is I'm writing a backup tool and need to do a few
>> more things than rsync can do alone, but there's no point replicating
>> the stuff that rsync *can* do. I also don't want to use the rsync
>> daemon, nor do I want to have a user account that is remotely
>> accessible in order to get rsync over ssh going. Yes I know there are
>> solutions for parts of this, but I want to write this tool all the same.
>
> Indeed, there may be better solutions for the whole thing if you explain
> your use case further.

Like I say, I'm writing a backup tool. The tool contains a server and
a client, where one connects to the other and TLS is used to encrypt
and authenticate the session. Then certain 'pre/post-backup' commands
can be passed across, for example taking and mounting an LVM snapshot,
flushing logs, whatever. This ability to pass some (perhaps
pre-defined) commands across is a common feature of backup tools, and is
obviously really useful.
The intricacies of this are still being figured out. I'm trying to get
the basics working first.
Using ssh+sudo for the transport+commands+etc. is a bit of a kludge,
from my POV anyway.
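
To illustrate the "pre-defined commands" idea, here is a minimal sketch of
how the receiving side might map a command name sent over the TLS session
to a fixed argument list. Everything here is an assumption made for the
example: the table name, the command names, the volume group vg0, the
snapshot name and the mount point are all hypothetical, and the tool
described above may work quite differently.

```python
# Hypothetical table of pre-defined commands. The names, the volume
# group (vg0), the snapshot name and the mount point are illustrative.
PREDEFINED_COMMANDS = {
    "snapshot-create": ["lvcreate", "--snapshot", "--size", "1G",
                        "--name", "backup_snap", "/dev/vg0/data"],
    "snapshot-mount": ["mount", "-o", "ro",
                       "/dev/vg0/backup_snap", "/mnt/backup_snap"],
    "snapshot-umount": ["umount", "/mnt/backup_snap"],
    "snapshot-remove": ["lvremove", "--force", "/dev/vg0/backup_snap"],
}


def resolve_command(name):
    """Map a command name received from the peer to a fixed argv.

    Only names present in the table are accepted, so the remote side
    can select a command but never supply its own command line.
    """
    try:
        # Return a copy so callers cannot modify the table by accident.
        return list(PREDEFINED_COMMANDS[name])
    except KeyError:
        raise ValueError("command not permitted: %r" % (name,))
```

The returned list could then be handed to subprocess.call() without a
shell, so nothing the peer sends is ever interpreted as shell syntax.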

I'm running Bacula right now, but am looking to move towards something
using hard-linked trees, i.e. rsync's --link-dest. I'm currently doing
a full backup each month, and various daily/weekly things from that. I
end up burning far too much disk space and bandwidth pulling it down
fresh each month.
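
For reference, a rough sketch of the hard-linked tree approach with
--link-dest. The function name and the one-directory-per-snapshot layout
are assumptions made for the example; only -a and --link-dest are actual
rsync options. Files unchanged since the previous snapshot are
hard-linked to it instead of being transferred and stored again.

```python
import os


def build_snapshot_command(source, snapshot_root, new_name, previous_name=None):
    """Return an rsync argv that writes a new snapshot under snapshot_root.

    Hypothetical layout: snapshot_root/<name>/ holds one full tree per run.
    """
    command = ["rsync", "-a"]
    if previous_name is not None:
        # An absolute path avoids rsync resolving a relative --link-dest
        # against the destination directory.
        previous = os.path.abspath(os.path.join(snapshot_root, previous_name))
        command.append("--link-dest=" + previous)
    # A trailing slash on the source copies its contents, not the
    # directory itself.
    command.append(source.rstrip("/") + "/")
    command.append(os.path.join(snapshot_root, new_name) + "/")
    return command
```

A first run would omit previous_name and produce a full copy; each later
run would name the most recent snapshot, so only changed files take up
new disk space and bandwidth.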

--
Nathan Ward