Andrew Tridgell tridge at
Thu May 7 11:00:08 GMT 1998

I've noticed that after my appeal for mirrors last week several people
started mirroring the site without asking first. Don't do this! It
drives our network costs up enormously.

For example, today we had about 150k hits on the server. About 60% of
these were from web robots mirroring the site! The robots only do a
"HEAD" on each page, but that costs us almost as much as pulling in
the whole page, because it is the ACKs and TCP setup, not the data,
that cost us at our end.

I've been forced to take action on this, because at this rate it would
cost us about $400 this month just to service these robots. I've
changed the Apache config to deny access to 6 of these robots. If your
robot suddenly can't get through then this is why.

Also, if anyone does want to set up a web robot then please follow
these rules:

1) ask first. I can probably provide a MUCH more efficient way of
   mirroring than a web robot.
2) use a robot that obeys the rules in /robots.txt. Some of these
   robots were trying to take a complete copy of our cgi-bin area
   which is infinite in size!
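
To illustrate rule 2, here is a minimal sketch of a robot checking
/robots.txt before it fetches anything, using Python's standard
urllib.robotparser module. The robots.txt contents, the host
www.example.org and the agent name "examplebot" are all made up for
the example; they are assumptions, not the real configuration of this
server.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (an assumption for this sketch).
# A real robot would download /robots.txt from the server it is
# about to crawl instead of using a hard-coded string.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def make_checker(robots_txt, agent):
    """Return a function that says whether `agent` may fetch a URL."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines.
    parser.parse(robots_txt.splitlines())
    return lambda url: parser.can_fetch(agent, url)


# "examplebot" is a hypothetical user-agent name.
may_fetch = make_checker(ROBOTS_TXT, "examplebot")

for url in ("http://www.example.org/index.html",
            "http://www.example.org/cgi-bin/search?page=1"):
    print(url, "->", "fetch" if may_fetch(url) else "skip")
```

A robot that runs every URL through a check like this never wanders
into the cgi-bin area, so it cannot get stuck in pages that are
generated without end.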

We will have some official mirrors online soon. Please do not take
matters into your own hands. I know you mean well, but the solution is
worse than the problem.


