rsyncd.conf: "timeout=<minimal>" craziness

Eberhard Moenkeberg emoenke at gwdg.de
Thu Jan 13 02:47:52 GMT 2005


Hi,

from time to time, in times like today where the whole world is grabbing 
SUSE-9.2 and/or debian-30r4, I really like to condemn those other anon
rsync server admins (you know, the successors of the traditional unix ftp 
server admins).

They usually have within their /etc/rsyncd.conf a line like

timeout = <very low>

because they are thinking "less" there is "better for my server health", 
fully neglecting (better: not aware of) the situation of their partner 
servers during those phases of general download hysteria.

At ftp.gwdg.de, I currently have a load ("acceptance!", "trust!") like never 
ever before, caused by the nearly simultaneous releases of 
SUSE-Linux-9.2 for both platforms i386 and x86_64, including a DVD ISO 
for the first time (15 GB + 3.3 GB), and debian-30r4 (lots of GBs too), 
the latter with a huge amount of pure "ethical" acceptance, regardless of 
functionality, the former with additional aspects of affinity...

I guess there is no real machine in this world which is capable of holding 
all this extremely hot stuff in buffer cache at once, so all the servers 
serving SUSE plus DEBIAN (at least those) are stressed with disk I/O like 
never before, like me.
You can NOT understand what I say if you only have a 100 MBit/sec 
connection to the internet, and I beg you to imagine what would happen to 
you if you had "unlimited" bandwidth if you like to understand what I am 
trying to point out.
Unlimited bandwidth is not THE freedom - it is simply a joke to make you 
see your REAL bottlenecks...

This "acceptance" is OK, this is wonderful, this is showing that a server 
is worth the name, but actually almost every server depends on lots of 
other servers to fulfill its role, and so let me show the dark side of 
those very very "social" happenings:

ftp.gwdg.de is running into timeouts of this kind:

rsync: connection unexpectedly closed (1345637 bytes read so far)
rsync error: error in rsync protocol data stream (code 12) at io.c(342)

while contacting the masters of almost all important primary sources 
which get mirrored using rsync at ftp.gwdg.de.
It happens after building the "remote data base" and the "local database", 
just during the start of the real "fetch" actions.

Why?

Regardless of how busy the server is, the client completes the "remote data 
base" phase first, and then it starts its "local data base" phase 
second.
With a high level of "servility" (e.g. 5000 FTP sessions, 900 HTTP 
sessions and 200 RSYNC sessions in parallel and 4 TB of output a day, 
as seen on ftp.gwdg.de all these days since Saturday), this "local" 
rsync phase may need a long long (currently very long long here) time 
because the local filesystem at ftp.gwdg.de is very busy and the Linux 
buffer cache performance is relatively bad in kernel 2.4.
After completing it, we reach the state where we would like to fetch the 
first piece of new data from the server.
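
To make the timing argument concrete, here is a minimal sketch of the race 
described above. It is a simplified model, not rsync's real protocol logic: 
it assumes the client is silent for the whole "local data base" phase and 
that the server closes the connection after its configured timeout. The 
function name and all numbers are hypothetical, not measurements from 
ftp.gwdg.de.

```python
def connection_survives(local_scan_seconds, server_timeout_seconds):
    """Return True if the server is still waiting when the local scan ends.

    Simplified model (an assumption): the client sends nothing while it
    builds its local file list, and the server drops the connection after
    server_timeout_seconds of silence. A timeout of 0 means "no timeout".
    """
    if server_timeout_seconds == 0:
        return True
    return local_scan_seconds < server_timeout_seconds


# Hypothetical figures: a 30 minute local scan on a very busy mirror.
for timeout in (60, 600, 15000, 0):
    ok = connection_survives(local_scan_seconds=1800,
                             server_timeout_seconds=timeout)
    state = "fetch can start" if ok else "connection closed (code 12)"
    print(f"timeout={timeout:>5}: {state}")
```

With these made-up numbers, only the two low timeouts lose the race; the 
busier the client, the longer the scan, and the more servers fall on the 
wrong side of it.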

What happens?

See above.

rsync: connection unexpectedly closed (1345637 bytes read so far)
rsync error: error in rsync protocol data stream (code 12) at io.c(342)

Again, why?

Because server admins like to sleep with sweet dreams and are willing to 
help themselves to achieve that.

So they put into their own /etc/rsyncd.conf a line like

timeout=<low because I need my sweet dreams>

Natural, true "human" behaviour, fully understandable.
Yes, but not very much "socially aware".

I see two solutions:

 1. the rsync maintainer delivers the product with a default of

timeout=0

for /etc/rsyncd.conf and tries to awaken social responsibility by 
presenting some special words with "man rsyncd.conf" mentioning the 
situation of very busy clients,

 2. all the maintainers of "socially relevant" Linux resources end their 
small dreams and enter the bigger one:

timeout=<almost unlimited>

to help the society which is nothing more than a thousand times JUST YOU, 
THE INDIVIDUAL.

The third solution could be an rsync option which reverses the order of the 
database phases (build "local" first, then contact the 
server), but this would not be the best solution.

So please, rsync admins, put

  timeout=<almost unlimited>

into your /etc/rsyncd.conf.

ftp:3 03:47:14 ~ # grep timeout /etc/rsyncd.conf
timeout = 15000
ftp:3 03:47:16 ~ # 
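
For admins who want more than a grep, here is a rough sketch of a check for 
low timeout values. It is only an illustration: the parser handles plain 
"name = value" lines, "#" comments and [module] headers, nothing more, and 
the 3600 second threshold, the sample file and the function names are my own 
assumptions, not anything rsync ships.

```python
def find_timeouts(conf_text):
    """Return a list of (section, seconds) for every 'timeout' parameter.

    Parameters before the first [module] header are reported as 'global'.
    Simplified on purpose: no line continuations, no include handling.
    """
    section = "global"
    found = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip().lower() == "timeout":
            found.append((section, int(value.strip())))
    return found


def too_low(timeouts, minimum_seconds):
    """Keep entries with a non-zero timeout below minimum_seconds.

    A value of 0 means "no timeout", so it is never reported.
    """
    return [(s, t) for s, t in timeouts if 0 < t < minimum_seconds]


# Hypothetical rsyncd.conf content, for illustration only.
sample = """\
timeout = 15000

[pub]
path = /srv/ftp/pub
timeout = 120
"""

timeouts = find_timeouts(sample)
print(timeouts)
print(too_low(timeouts, minimum_seconds=3600))  # 3600 is an arbitrary threshold
```

To check a real file, read /etc/rsyncd.conf into a string and pass it to 
find_timeouts() instead of the sample.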


Cheers -e
-- 
Eberhard Moenkeberg (emoenke at gwdg.de, em at kki.org)

