[clug] Out of Memory: Kill process 2689 (mysqld) score 33827 and children.

David Schoen dave at lyte.id.au
Wed Aug 18 15:47:38 MDT 2010


On 18 August 2010 23:38, Daniel Pittman <daniel at rimspace.net> wrote:
> David Schoen <dave at lyte.id.au> writes:
>> I'm much more familiar with one specific application - MySource Matrix. It
>> runs on PostgreSQL and Oracle, not MySQL. Specific problems vary drastically
>> but one great way to deadlock PHP is to make use of the native session
>> handler and find any old URL that crashes that thread, then just keep
>> sending in requests until Apache runs out of slots (or in this case gets
>> killed off by OOM).
>
> Ouch.  Forgive me for having to say it, but this is a serious question: is
> this something you found across multiple PHP versions, or was it limited to a
> particular buggy release or something?

I've seen PHP 4.0.* -> 5.2.* crash due to various bad configurations
(the 5.2 strains have all been on Solaris though, where PHP only
*just* works). Once you can get any sort of crash that doesn't
correctly release a lock or a resource, you're open to this kind of
pile-up.
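To make that concrete with the native handler: session_start() takes
an exclusive lock on the session file and normally releases it at
request shutdown, so a thread that wedges inside a session never lets
the next request for that session proceed. Something like this
(hypothetical filenames, default "files" handler assumed):

    <?php
    // hang.php - stands in for a request that wedges mid-session
    session_start();  // takes an exclusive lock on the session file
    sleep(600);       // lock held throughout; a crashed thread may
                      // never reach shutdown, so it's never released

    <?php
    // victim.php - any later request carrying the same PHPSESSID
    session_start();  // blocks until the lock above is released,
                      // tying up another Apache slot in the meantime
    echo "finally got the session lock\n";

Queue up enough blocked victims and MaxClients is exhausted, which is
exactly the pile-up above.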

> I don't manage enough PHP (increasingly thankfully) that I have run into this
> much, so would have imagined that Apache would relatively gracefully handle
> this, by eventually terminating the worker at the very least...

You can lower the TimeOut parameter in Apache, which helps a bit, but
that just changes how quickly bad requests have to arrive to keep your
procs full; it doesn't actually solve the problem.
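Something like this, for instance (values illustrative, not a
recommendation):

    # httpd.conf (prefork) - reclaim wedged slots sooner
    Timeout 30           # down from the 300 default
    KeepAliveTimeout 5
    MaxClients 50        # low enough that a full pool can't OOM the box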

We partly solve the problem by using Squid, as it pools connections to
the backend more sensibly (especially with collapsed forwarding on). I
would like to spend some serious time learning Varnish as I think we
could do an even better job with it, but there's always a bigger priority.
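The relevant squid.conf pieces are roughly these (a Squid 2.6/2.7
accelerator sketch; names and ports made up):

    # squid.conf - Squid reverse-proxying for Apache
    http_port 80 accel defaultsite=www.example.com
    cache_peer 127.0.0.1 parent 8080 0 no-query originserver

    # Fold concurrent requests for the same cacheable URL into one
    # backend fetch rather than N parallel ones.
    collapsed_forwarding on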

I also wrote a script that watches Apache's server-status and knows
enough about our app to figure out when threads are obviously bad and
kill them off (it also reports what it killed via cron, so that a
sysadmin somewhere finds out the server is still failing). That helps
a bit: it gives us monitoring that highlights which pages are failing
and when, without anyone having to sit there watching, and it keeps
the server alive until someone can fix it properly. I'm trying to work
out if I can make that GPL... if I can stick a GPL header in it I'll
publish it somewhere :)
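Minus all the app-specific rules, the rough shape of it is something
like the sketch below. To be clear, this is written from scratch here,
not the actual script; it assumes prefork, "ExtendedStatus On", the
mod_status table layout from Apache 2.2, and that SIGKILLing a wedged
worker is acceptable on your setup.

    <?php
    // watch-status.php - poll mod_status, kill obviously-wedged
    // workers, and print what was killed (cron mails stdout to an
    // admin). The "obviously bad" test here is just time-in-'W'.
    $threshold = 120; // seconds in 'W' (sending reply) before we act

    $html = @file_get_contents('http://localhost/server-status');
    if ($html === false) {
        fwrite(STDERR, "server-status unreachable\n");
        exit(1);
    }

    // With ExtendedStatus On each worker row is:
    // Srv | PID | Acc | M | CPU | SS | ... | Request
    preg_match_all('#<tr><td><b>[^<]*</b></td>(.*?)</tr>#s', $html, $m);
    foreach ($m[1] as $row) {
        preg_match_all('#<td[^>]*>(.*?)</td>#s', $row, $c);
        $cols = array();
        foreach ($c[1] as $cell) {
            $cols[] = trim(strip_tags($cell));
        }
        if (count($cols) < 5) {
            continue;
        }
        list($pid, $acc, $mode, $cpu, $ss) = $cols;
        $request = end($cols);
        if ($mode === 'W' && (int) $ss > $threshold && ctype_digit($pid)) {
            echo "killing $pid: {$ss}s on $request\n";
            posix_kill((int) $pid, 9); // SIGKILL; needs ext/posix
        }
    }

Ours additionally matches the Request column against known-bad URLs,
which is where all the app knowledge lives.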

[...]
>
>> The point is still to strace/lsof the procs and see what's going on; to do
>> that the kernel needs to remain responsive while Apache crashes, which
>> usually means lowering MaxClients.
>
> *nod* I confess, one reason I prefer FastCGI (style) operation for PHP (and
> most other languages) is that it isolates the web server and application
> server life-cycle.
>
> Not always perfectly, but better than the in-process handlers.  That usually
> makes it easier to tune, and often easier to support more client concurrency,
> than doing it all in one big mess.

I agree; not having to set up and tear down the PHP process on every
request could save a lot of the execution time on some of the faster
page designs I play with. On the slower ones it's probably more like
2%, so there are easier gains elsewhere :)
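For what it's worth, the FastCGI route looks reachable now that PHP
5.3.3 bundles php-fpm; a pool along these lines keeps workers alive
across requests and recycles stuck ones independently of Apache (a
sketch, numbers illustrative only):

    ; php-fpm pool - persistent PHP workers behind a FastCGI server
    [www]
    listen = 127.0.0.1:9000
    pm = dynamic
    pm.max_children = 20        ; hard cap, analogous to MaxClients
    pm.start_servers = 5
    pm.min_spare_servers = 3
    pm.max_spare_servers = 8
    ; a single stuck script gets its worker killed and replaced,
    ; instead of wedging an Apache slot until OOM
    request_terminate_timeout = 60s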

Squiz are also introducing an object cache in the next major release
that will persist more of the core state in memory across threads.
There's still PHP's per-request setup and teardown time to cope with,
though, and the forced serialisation of PHP sessions.

On the PHP sessions front, we currently use memcache-backed sessions
to give individual users parallel threads on some installs. But that's
a bug (not a feature) of the pecl memcache extension: 3.0.4 "Added
session locking to avoid concurrency problems with AJAX apps", see:
http://pecl.php.net/package-changelog.php?package=memcache

So if at some point we feel the need to upgrade to 3.0.4 we're going
to have to fix our sessions some other way (and yes, I know there are
other inherent problems with relying on a bug in memcache to provide
concurrent sessions).
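For reference, the setup in question is just the pecl memcache session
handler, along these lines (host/port made up):

    ; php.ini - sessions stored in memcached via pecl memcache.
    ; Before 3.0.4 there is no session locking, which is the "bug"
    ; that lets one user's requests run in parallel.
    session.save_handler = memcache
    session.save_path = "tcp://127.0.0.1:11211"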

[...]


Cheers,
Dave

