[clug] [tech] Postgres Replication

Arjen Lentz arjen at lentz.com.au
Tue Sep 22 01:01:06 MDT 2009


Hi Steve

----- "Steve McInerney" <steve at stedee.id.au> wrote:
> The others have mentioned slony, adding an additional "me too" there;
> Just some gotchas to be aware of:
> 
> * watch out for replication lag. We can and do get very large delays
> when massive transaction updates go thru.

This sounds similar to the first implementation of replication in MySQL 3.23, around 1999.
It had a single thread that both retrieved events from the master and executed them.
We found that with long-running statements the lag would add up to the point where the slave could no longer catch up.

Since MySQL 4.0 (2000-2001 onwards) there are two threads: an IO thread that receives events from the master and writes them to the relay log, and a separate SQL thread that executes them. This means that new events are still being received and stored while earlier ones are executing. Usually the SQL thread just reads from the disk cache, as it's close enough behind the IO thread. The result is much more stable over time.
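
To make the difference concrete, here is a minimal Python sketch. It is a toy model, not MySQL code: the function names and the (fetch_cost, exec_cost) units are made up for illustration. It only tracks when each event has been received on the slave and when it has been applied.

```python
def single_thread(events):
    """One thread fetches an event, executes it, then fetches the next.

    events: list of (fetch_cost, exec_cost) tuples in arbitrary time units.
    Returns (received_times, applied_times).
    """
    clock = 0
    received, applied = [], []
    for fetch_cost, exec_cost in events:
        clock += fetch_cost          # nothing is fetched while executing
        received.append(clock)
        clock += exec_cost
        applied.append(clock)
    return received, applied


def two_threads(events):
    """An IO thread fills the relay log while a SQL thread executes from it."""
    io_clock = 0                     # when the IO thread finished receiving
    sql_clock = 0                    # when the SQL thread finished executing
    received, applied = [], []
    for fetch_cost, exec_cost in events:
        io_clock += fetch_cost       # receipt continues regardless of execution
        received.append(io_clock)
        # The SQL thread can start once the event is in the relay log
        # and the previous event has finished executing.
        sql_clock = max(sql_clock, io_clock) + exec_cost
        applied.append(sql_clock)
    return received, applied


if __name__ == "__main__":
    events = [(1, 5), (1, 1), (1, 1)]    # one slow statement, then two quick ones
    print(single_thread(events))
    print(two_threads(events))
```

With that sample input, the single-thread model does not even receive the last event until time 9, while the two-thread model has all three events in the relay log by time 3 and applied by time 8.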

One remaining issue is that a single SQL thread executes events serially, whereas on the master they were executed in parallel. So a slave essentially has to do more work to achieve the same result, and that too can cause lag.
YouTube came up with a pretty neat trick: another thread/process reads the relay log just ahead of the execution thread, transforms each query into an equivalent SELECT, and runs it. This primes the caches (data page/index buffering in memory), so the pages are already in memory when the SQL thread gets there. Another implementation of the same trick is now part of the Maatkit tools (maatkit.org).
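
As a rough illustration of the query transformation, here is a small Python sketch. It is not the YouTube or Maatkit code; the function name to_prefetch_select and the regexes are my own simplification. It handles only plain single-table UPDATE and DELETE statements with a WHERE clause and skips everything else.

```python
import re

_FLAGS = re.IGNORECASE | re.DOTALL
_UPDATE = re.compile(r"UPDATE\s+(\S+)\s+SET\s+.*?(?:\s+WHERE\s+(.*))?$", _FLAGS)
_DELETE = re.compile(r"DELETE\s+FROM\s+(\S+)(?:\s+WHERE\s+(.*))?$", _FLAGS)


def to_prefetch_select(statement):
    """Return a SELECT touching the rows a write would touch, or None.

    Simplified on purpose: no multi-table statements, no ORDER BY/LIMIT,
    and no protection against the word WHERE inside a string literal.
    """
    sql = statement.strip().rstrip(";").strip()
    for pattern in (_UPDATE, _DELETE):
        match = pattern.match(sql)
        if match:
            table, where = match.group(1), match.group(2)
            if where is None:
                return None          # whole-table write: not worth prefetching
            return "SELECT * FROM {} WHERE {}".format(table, where)
    return None                      # INSERT etc.: nothing to prime


if __name__ == "__main__":
    print(to_prefetch_select(
        "UPDATE accounts SET balance = balance - 10 WHERE id = 42;"))
```

A prefetcher would run the resulting SELECT on the slave and throw the result away; the only point is to pull the relevant data and index pages into memory before the SQL thread needs them.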


Replication is used for many reasons, not just failover.
It allows for read scaling, admin on live systems (with dual masters), deliberately time-lagged slaves to ease recovery from user errors, and so on.
DRBD (mentioned earlier) has its uses, but does not have this wide scope. Also, if a system stuffs up in, say, the filesystem, DRBD will naturally give you a perfect copy of the stuff-up.
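
The time-lagged slave idea can be sketched in a few lines of Python. This is only a model of the concept, with a made-up function name and a relay log represented as (master_timestamp, statement) pairs: the slave holds back any event younger than the chosen delay, which leaves a window to stop replication before a mistaken statement is applied.

```python
def events_ready_to_apply(relay_log, now, delay_seconds):
    """Return the statements that are old enough to apply on a delayed slave.

    relay_log: list of (master_timestamp, statement), in master order.
    """
    ready = []
    for master_timestamp, statement in relay_log:
        if now - master_timestamp < delay_seconds:
            break                    # this and all later events must wait
        ready.append(statement)
    return ready


if __name__ == "__main__":
    relay_log = [
        (100, "INSERT ..."),
        (160, "UPDATE ..."),
        (3500, "DROP TABLE important"),   # the user error
    ]
    # One hour delay: at time 4000 the DROP is still held back.
    print(events_ready_to_apply(relay_log, now=4000, delay_seconds=3600))
```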


Cheers,
Arjen.
-- 
Arjen Lentz, Exec.Director @ Open Query (http://openquery.com)
Exceptional Services for MySQL at a fixed budget.

Follow our blog at http://openquery.com/blog/
OurDelta: enhanced builds for MySQL @ http://ourdelta.org

