Performance and simultaneous connections over SSH

Francois Begin Francois.Begin at telus.com
Sun Aug 15 10:50:36 MDT 2010


Hi all,

I have about 15 LDAP servers that will use rsync over SSH to sync their access logs to a centralized server for further processing. The estimated log volume is around 12 GB per day in total. On the centralized server, the directory the logs are synced into is an NFS mount backed by an enterprise-grade NAS. So it looks like this:

LDAP servers --> rsync over SSH -->  NFS mount on log repo server --> NAS device

I have 4 servers configured so far, and everything appeared to work fine at first. I have no visibility into the LDAP servers, but I assume their clocks are synced. The cron job runs at the same times on each: HH:00, HH:10, HH:20, HH:30, HH:40 and HH:50.

I have just started to notice some 'sync sputtering'. Sometimes all 4 servers' latest access logs have the same timestamp, e.g. 09:30. At other times I see something like this: it is 09:35 and I have 2 servers at 09:20 and 2 at 09:30, i.e. they did not all manage to sync during the last round.
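To put a number on the sputtering, something like the following could flag a server whose latest log has fallen behind. This is only a rough sketch: the function names, the 600-second interval (matching the 10-minute cron) and the 60-second grace period are my own assumptions, not anything rsync provides.

```shell
#!/bin/sh
# Assumed values: cron runs every 10 minutes, plus some slack for the
# transfer itself. Both numbers are illustrative.
INTERVAL=600
GRACE=60

# Age in seconds of a log, given its mtime and the current time
# (both as seconds since the epoch).
log_age() {
    echo $(( $2 - $1 ))
}

# Succeeds (exit status 0) if the log is older than one full sync
# interval plus the grace period, i.e. it missed the last round.
is_lagging() {
    [ "$(log_age "$1" "$2")" -gt $(( INTERVAL + GRACE )) ]
}
```

In the 09:35 example above, a log last updated at 09:20 is 900 seconds old and would be flagged, while one updated at 09:30 is 300 seconds old and would not. On a system with GNU coreutils the two arguments could come from `stat -c %Y <file>` and `date +%s`.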

How does rsync handle multiple simultaneous connections over SSH? I am guessing that the log repo server sets up 4 separate SSH sessions and that rsync runs independently within each of them to sync the logs. Is that correct? If so, then the only issue I can see would be the log repo server requesting a lot of information from the NAS device at the same time, possibly causing the operation to fail. The actual rsync command looks like this:

/bin/rsync --timeout=300 --rsync-path=/usr/local/bin/rsync -avz -e "/usr/local/bin/ssh -i <key>" `find $LDAP_LOGS_LOCAL/access* -mtime -5 -type f` flmuser@$PREPARSER_SERVER:$LDAP_LOGS_REMOTE

The timeout is set to 300 seconds, and the logs being checked and synced are at most 100 MB in size (they rotate once they hit 100 MB).
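If the simultaneous starts do turn out to be the problem, one thing I could try is spreading the jobs out with a per-host delay before the rsync call. A minimal sketch, assuming cksum is available on the LDAP servers; stagger_delay and the 120-second window are hypothetical, and I have not tested this on the real setup.

```shell
#!/bin/sh
# Hypothetical helper: map a hostname to a stable delay in [0, max) seconds.
# cksum prints "<crc> <byte count>" for data read from stdin; only the
# CRC field is used here.
stagger_delay() {
    host=$1
    max=$2
    crc=$(printf '%s' "$host" | cksum | cut -d ' ' -f 1)
    echo $(( crc % max ))
}

# Possible use at the top of the cron script (commented out here):
# sleep "$(stagger_delay "$(hostname)" 120)"
```

The delay is the same on every run for a given host, so each server keeps a fixed offset within the 10-minute slot. Two hosts can still land on the same offset, so this only reduces the overlap rather than removing it.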

Cheers,

François




