This hasn't been tested in production yet. Use at your own risk
The read cluster contains 7 servers which can accept READ requests, but not all at once (see Failure Scenarios).
The write cluster contains 4 servers which can accept WRITE requests (only 1 at a time though, don't be stupid).
Port 9201 is an HTTP service which checks if replication is running fine (SlaveIO yes, SlaveSQL yes, SecondsBehindMaster 0).
Port 9200 is an HTTP service which checks if mysql is running fine (results appear after performing 'show databases').
In both cases (ports 9200 and 9201), a simple shell script is installed on each server which then runs as a STREAM daemon through xinetd) and outputs a "200 OK" or "503 Service Unavailable".
# Load-balanced IPs for DB writes and reads # frontend db_write bind 172.16.0.51:3306 default_backend cluster_db_write frontend db_read bind 172.16.0.52:3306 default_backend cluster_db_read # Monitor DB server availability # frontend monitor_db01 # # set db01_backup to 'up' or 'down' # bind 127.0.0.1:9301 mode http option nolinger acl no_repl_db01 nbsrv(db01_replication) eq 0 acl no_repl_db02 nbsrv(db02_replication) eq 0 acl no_db01 nbsrv(db01_status) eq 0 acl no_db02 nbsrv(db02_status) eq 0 monitor-uri /dbs monitor fail unless no_repl_db01 no_repl_db02 no_db02 monitor fail if no_db01 no_db02 frontend monitor_db02 # # set db02_backup to 'up' or 'down' # bind 127.0.0.1:9302 mode http option nolinger acl no_repl_db01 nbsrv(db01_replication) eq 0 acl no_repl_db02 nbsrv(db02_replication) eq 0 acl no_db01 nbsrv(db01_status) eq 0 acl no_db02 nbsrv(db02_status) eq 0 monitor-uri /dbs monitor fail unless no_repl_db01 no_repl_db02 no_db01 monitor fail if no_db01 no_db02 frontend monitor_db03 # # set db03 read-only slave to 'down' # bind 127.0.0.1:9303 mode http option nolinger acl no_repl_db03 nbsrv(db03_replication) eq 0 acl no_repl_db01 nbsrv(db01_replication) eq 0 acl db02 nbsrv(db02_status) eq 1 monitor-uri /dbs monitor fail if no_repl_db03 monitor fail if no_repl_db01 db02 frontend monitor_db04 # # set db04 read-only slave to 'down' # bind 127.0.0.1:9304 mode http option nolinger acl no_repl_db04 nbsrv(db04_replication) eq 0 acl no_repl_db01 nbsrv(db01_replication) eq 0 acl db02 nbsrv(db02_status) eq 1 monitor-uri /dbs monitor fail if no_repl_db04 monitor fail if no_repl_db01 db02 frontend monitor_db05 # # set db05 read-only slave to 'down' # bind 127.0.0.1:9305 mode http option nolinger acl no_repl_db05 nbsrv(db05_replication) eq 0 acl no_repl_db02 nbsrv(db02_replication) eq 0 acl db01 nbsrv(db01_status) eq 1 monitor-uri /dbs monitor fail if no_repl_db05 monitor fail if no_repl_db02 db01 # Monitor for split-brain syndrome # frontend monitor_splitbrain # # set db01_splitbrain and db02_splitbrain to 'up' # bind 127.0.0.1:9300 mode http option nolinger acl no_repl01 nbsrv(db01_replication) eq 0 acl no_repl02 nbsrv(db02_replication) eq 0 acl db01 nbsrv(db01_status) eq 1 acl db02 nbsrv(db02_status) eq 1 monitor-uri /dbs monitor fail unless no_repl01 no_repl02 db01 db02 ## ## BACKEND ## ## # Check every DB server replication status # - perform an http check on port 9201 (replication status) # - set to 'down' if response is '503 Service Unavailable' # - set to 'up' if response is '200 OK' # backend db01_replication mode tcp balance roundrobin option tcpka option httpchk server db01 172.16.0.61:3306 check port 9201 inter 1s rise 1 fall 1 backend db02_replication mode tcp balance roundrobin option tcpka option httpchk server db02 172.16.0.62:3306 check port 9201 inter 1s rise 1 fall 1 backend db03_replication mode tcp balance roundrobin option tcpka option httpchk server db03 172.16.0.63:3306 check port 9201 inter 1s rise 1 fall 1 backend db04_replication mode tcp balance roundrobin option tcpka option httpchk server db04 172.16.0.64:3306 check port 9201 inter 1s rise 1 fall 1 backend db05_replication mode tcp balance roundrobin option tcpka option httpchk server db05 172.16.0.65:3306 check port 9201 inter 1s rise 1 fall 1 # Check Master DB server mysql status # - perform an http check on port 9200 (mysql status) # - set to 'down' if response is '503 Service Unavailable' # - set to 'up' if response is '200 OK' # backend db01_status mode tcp balance roundrobin option tcpka option httpchk server db01 172.16.0.61:3306 check port 9200 inter 1s rise 2 fall 2 backend db02_status mode tcp balance roundrobin option tcpka option httpchk server db02 172.16.0.62:3306 check port 9200 inter 1s rise 2 fall 2 # DB write cluster # Failover scenarios: # - replication 'up' on db01 & db02 = writes to db01 # - replication 'down' on db02 = writes to db01 # - replication 'down' on db01 = writes to db02 # - replication 'down' on db01 & db02 = go nowhere, split-brain, cluster FAIL! # - mysql 'down' on db02 = writes to db01_backup # - mysql 'down' on db01 = writes to db02_backup # - mysql 'down' on db01 & db02 = go nowhere, cluster FAIL! # backend cluster_db_write # # - max 1 db server available at all times # - db01 is preferred (top of list) # - db_backups set their 'up' or 'down' based on results from monitor_dbs # mode tcp option tcpka balance roundrobin option httpchk GET /dbs server db01 172.16.0.61:3306 weight 1 check port 9201 inter 1s rise 2 fall 1 server db02 172.16.0.62:3306 weight 1 check port 9201 inter 1s rise 2 fall 1 backup server db01_backup 172.16.0.61:3306 weight 1 check port 9301 inter 1s rise 2 fall 2 addr 127.0.0.1 backup server db02_backup 172.16.0.62:3306 weight 1 check port 9302 inter 1s rise 2 fall 2 addr 127.0.0.1 backup # DB read cluster # Failover scenarios # - replication 'up' on db01 & db02 = reads on db01, db02, all db_slaves # - replication 'down' on db02 = reads on db01, slaves of db01 # - replication 'down' on db01 = reads on db02, slaves of db02 # - replication 'down' on db01 & db02 = reads on db01_splitbrain and db01_splitbrain only # - mysql 'down' on db02 = reads on db01_backup, slaves of db01 # - mysql 'down' on db01 = reads on db02_backup, slaves of db02 # - mysql 'down' on db01 & db02 = go nowhere, cluster FAIL! # backend cluster_db_read # # - max 2 master db servers available at all times # - max N slave db servers available at all times except during split-brain # - dbs track 'up' and 'down' of dbs in the cluster_db_write # - db_backups track 'up' and 'down' of db_backups in the cluster_db_write # - db_splitbrains set their 'up' or 'down' based on results from monitor_splitbrain # mode tcp option tcpka balance roundrobin option httpchk GET /dbs server db01 172.16.0.61:3306 weight 1 track cluster_db_write/db01 server db02 172.16.0.62:3306 weight 1 track cluster_db_write/db02 server db01_backup 172.16.0.61:3306 weight 1 track cluster_db_write/db01_backup server db02_backup 172.16.0.62:3306 weight 1 track cluster_db_write/db02_backup server db01_splitbrain 172.16.0.61:3306 weight 1 check port 9300 inter 1s rise 1 fall 2 addr 127.0.0.1 server db02_splitbrain 172.16.0.62:3306 weight 1 check port 9300 inter 1s rise 1 fall 2 addr 127.0.0.1 # # Scaling & redundancy options # - db_slaves set their 'up' or 'down' based on results from monitor_dbs # - db_slaves should take longer to rise # server db03_slave 172.16.0.63:3306 weight 1 check port 9303 inter 1s rise 5 fall 1 addr 127.0.0.1 server db04_slave 172.16.0.64:3306 weight 1 check port 9304 inter 1s rise 5 fall 1 addr 127.0.0.1 server db05_slave 172.16.0.65:3306 weight 1 check port 9305 inter 1s rise 5 fall 1 addr 127.0.0.1
Comments
Subscribe to comments
You need to login to post a comment.

I will provide more explanations of this configuration on my blog within the next few days. Subscribe to my RSS feed to stay informed.
Blog post has been written here: http://alexwilliams.ca/blog/2009/08/10/using-haproxy-for-mysql-failover-and-redundancy/
err.... here's the correct link
one last try