[clug] googlebot doing funny things in logs
Edward C. Lang
edlang at edlang.org
Wed Jun 15 21:58:08 MDT 2011
On Thu, Jun 16, 2011 at 01:08:47PM +1000, Hal Ashburner wrote:
> But back to the original point how does google even know that
> /mythweb exists, given nothing links to it, it's not my usual
> location for it, it is, and I believe always has been behind a
> password, and until I forgot it on the machine changeover on the
> weekend there was a robots.txt disallowing everything from anyone if
> they're remotely polite - which googlebot claims to be and usually
> seems to be.
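A "disallow everything" robots.txt of the kind described above can be checked locally. The following is a minimal sketch using Python's standard urllib.robotparser; the rule text, the example.org URL and the helper name is_allowed are illustrative assumptions, not taken from the actual server setup. It only shows what a polite crawler should conclude from the file; it says nothing about what Googlebot actually did.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: disallow every path for every crawler.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URL standing in for the real /mythweb location.
    print(is_allowed(DISALLOW_ALL, "Googlebot", "http://example.org/mythweb/"))
```

Note that robots.txt is purely advisory: it stops well-behaved crawlers from fetching pages, but it is not access control, which is what the password protection was for.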
> 1) I must have originally had it placed at /mythweb and linked from
> my front page and have forgotten I did this over 2 years ago while
> exposing it via the firewall.
> 2) I must have not had a robots.txt at that time as well as now.
> 3) I must have let the password protection down at that time.
> 4) Googlebot must have proceeded to do "its normal practice"
> following links and indexing pages with all of these things
> simultaneously in effect and also remembered all this about my
> domain for over 2 years, then followed the memory of links rather
> than actual links, while keeping the memory of links in its index
> when refused access.
> And there's basically no enquiry I can make with google about it.
For the cost of a Google account you could request that the pages be
removed from their cache and excluded from future crawl attempts:
> Any one of those I'd say, absolutely fair enough, I'm a goose and I
> make mistakes and the mythweb setup was an experimental diversion,
> much as the whole mythtv thing was and is.
> Two of them, yeah why not, coins land heads 4 times in a row, sure.
> Three, sure, slap my head about being a bigger goose than I thought
> but we all have bad days, right? And I probably was having at least
> one or two of those 2 years ago now I think about it.
> 3 + google having an index of the structure of links from over 2 years ago?
> Well okay. It *is* possible. Interesting if that's what it is,
> though, huh? I'd also have thought all that occurring simultaneously
> unlikely. If there's no alternative explanation I guess I'd have
> been doing some wrong thinking about that too.
> As it is, there's no damage done, presumably the Goog will "forget"
> the no longer relevant links that are on its page for my domain one
> day given they're not even indexed, but it might take more than 2
> years. It's just a bit weird. So yeah, the "best" explanation I have
> is "iSuck" and google odd.
> And if nobody else sees anything like it in the world I guess we can
> safely say that Googlebot is not conducting "research" or I can have
> paranoid fantasies about being specifically targeted by googlebot
> which, I'd have to say, is very unlikely. A lot less likely than
> them acknowledging and apologising for their repeated telephone
> script "our engineers were brainstorming and suggested you'd be
> ideal for google and google wants you - but please read this ad and
> then passionately make your case why google should deign to consider
> you." Then going on to refusing to put me in touch with one of these
> engineers who they claim are friends of mine and who personally
> recommended me to discuss why they would think such a thing and
> /whether/ I'm any kind of fit for goog - which I might well not be
> at all. Common sense tells you it's just a bit of webstalking by
> paid placement consultants who are not google employees priming
> applications - paid on commission for applicants who get through
> weeks of interviews, still want the gig, and get hired - to the tune
> of 25% of salary. A good probability high paying raffle ticket for a
> couple of phonecalls and some web searching. If you toy with them
> long enough they'll admit to the initial scripted dishonesty. The
> placement consultant industry sucks very hard as most of us are
> aware - Google are no different to the norm for large, poisonous
> corporations on that front even if they do make claims to be
> different on other fronts. I'd love it if they took to that wart on
> their face, head on, as it were and disrupted the odiousness of the
> industry, wouldn't we all? Some also want ponies, ponies that
> defecate world peace... ;-)
Truth be told, they were trying to find out if you were watching the
My Little Pony Friendship is Magic TV series.