Posted By

alexwilliams on 08/07/09


Tagged

mysql database master failover backend pool haproxy slave redundancy


Versions (?)

Who likes this?

2 people have marked this snippet as a favorite

alexwilliams
umang_nine


HAProxy database load-balancing and redundancy


 / Published in: Apache
 

This hasn't been tested in production yet. Use at your own risk

The read cluster contains 7 servers which can accept READ requests, but not all at once (see Failure Scenarios).

The write cluster contains 4 servers which can accept WRITE requests (only 1 at a time though, don't be stupid).

Port 9201 is an HTTP service which checks if replication is running fine (SlaveIO yes, SlaveSQL yes, SecondsBehindMaster 0).

Port 9200 is an HTTP service which checks if mysql is running fine (results appear after performing 'show databases').

In both cases (ports 9200 and 9201), a simple shell script is installed on each server which then runs as a STREAM daemon through xinetd) and outputs a "200 OK" or "503 Service Unavailable".

  1. # Load-balanced IPs for DB writes and reads
  2. #
  3. frontend db_write
  4. bind 172.16.0.51:3306
  5. default_backend cluster_db_write
  6.  
  7. frontend db_read
  8. bind 172.16.0.52:3306
  9. default_backend cluster_db_read
  10.  
  11. # Monitor DB server availability
  12. #
  13. frontend monitor_db01
  14. #
  15. # set db01_backup to 'up' or 'down'
  16. #
  17. bind 127.0.0.1:9301
  18. mode http
  19. option nolinger
  20.  
  21. acl no_repl_db01 nbsrv(db01_replication) eq 0
  22. acl no_repl_db02 nbsrv(db02_replication) eq 0
  23. acl no_db01 nbsrv(db01_status) eq 0
  24. acl no_db02 nbsrv(db02_status) eq 0
  25.  
  26. monitor-uri /dbs
  27. monitor fail unless no_repl_db01 no_repl_db02 no_db02
  28. monitor fail if no_db01 no_db02
  29.  
  30. frontend monitor_db02
  31. #
  32. # set db02_backup to 'up' or 'down'
  33. #
  34. bind 127.0.0.1:9302
  35. mode http
  36. option nolinger
  37.  
  38. acl no_repl_db01 nbsrv(db01_replication) eq 0
  39. acl no_repl_db02 nbsrv(db02_replication) eq 0
  40. acl no_db01 nbsrv(db01_status) eq 0
  41. acl no_db02 nbsrv(db02_status) eq 0
  42.  
  43. monitor-uri /dbs
  44. monitor fail unless no_repl_db01 no_repl_db02 no_db01
  45. monitor fail if no_db01 no_db02
  46.  
  47. frontend monitor_db03
  48. #
  49. # set db03 read-only slave to 'down'
  50. #
  51. bind 127.0.0.1:9303
  52. mode http
  53. option nolinger
  54.  
  55. acl no_repl_db03 nbsrv(db03_replication) eq 0
  56. acl no_repl_db01 nbsrv(db01_replication) eq 0
  57. acl db02 nbsrv(db02_status) eq 1
  58.  
  59. monitor-uri /dbs
  60. monitor fail if no_repl_db03
  61. monitor fail if no_repl_db01 db02
  62.  
  63. frontend monitor_db04
  64. #
  65. # set db04 read-only slave to 'down'
  66. #
  67. bind 127.0.0.1:9304
  68. mode http
  69. option nolinger
  70.  
  71. acl no_repl_db04 nbsrv(db04_replication) eq 0
  72. acl no_repl_db01 nbsrv(db01_replication) eq 0
  73. acl db02 nbsrv(db02_status) eq 1
  74.  
  75. monitor-uri /dbs
  76. monitor fail if no_repl_db04
  77. monitor fail if no_repl_db01 db02
  78.  
  79. frontend monitor_db05
  80. #
  81. # set db05 read-only slave to 'down'
  82. #
  83. bind 127.0.0.1:9305
  84. mode http
  85. option nolinger
  86.  
  87. acl no_repl_db05 nbsrv(db05_replication) eq 0
  88. acl no_repl_db02 nbsrv(db02_replication) eq 0
  89. acl db01 nbsrv(db01_status) eq 1
  90.  
  91. monitor-uri /dbs
  92. monitor fail if no_repl_db05
  93. monitor fail if no_repl_db02 db01
  94.  
  95. # Monitor for split-brain syndrome
  96. #
  97. frontend monitor_splitbrain
  98. #
  99. # set db01_splitbrain and db02_splitbrain to 'up'
  100. #
  101. bind 127.0.0.1:9300
  102. mode http
  103. option nolinger
  104.  
  105. acl no_repl01 nbsrv(db01_replication) eq 0
  106. acl no_repl02 nbsrv(db02_replication) eq 0
  107. acl db01 nbsrv(db01_status) eq 1
  108. acl db02 nbsrv(db02_status) eq 1
  109.  
  110. monitor-uri /dbs
  111. monitor fail unless no_repl01 no_repl02 db01 db02
  112.  
  113. ##
  114. ## BACKEND ##
  115. ##
  116.  
  117. # Check every DB server replication status
  118. # - perform an http check on port 9201 (replication status)
  119. # - set to 'down' if response is '503 Service Unavailable'
  120. # - set to 'up' if response is '200 OK'
  121. #
  122. backend db01_replication
  123. mode tcp
  124. balance roundrobin
  125. option tcpka
  126. option httpchk
  127. server db01 172.16.0.61:3306 check port 9201 inter 1s rise 1 fall 1
  128.  
  129. backend db02_replication
  130. mode tcp
  131. balance roundrobin
  132. option tcpka
  133. option httpchk
  134. server db02 172.16.0.62:3306 check port 9201 inter 1s rise 1 fall 1
  135.  
  136. backend db03_replication
  137. mode tcp
  138. balance roundrobin
  139. option tcpka
  140. option httpchk
  141. server db03 172.16.0.63:3306 check port 9201 inter 1s rise 1 fall 1
  142.  
  143. backend db04_replication
  144. mode tcp
  145. balance roundrobin
  146. option tcpka
  147. option httpchk
  148. server db04 172.16.0.64:3306 check port 9201 inter 1s rise 1 fall 1
  149.  
  150. backend db05_replication
  151. mode tcp
  152. balance roundrobin
  153. option tcpka
  154. option httpchk
  155. server db05 172.16.0.65:3306 check port 9201 inter 1s rise 1 fall 1
  156.  
  157. # Check Master DB server mysql status
  158. # - perform an http check on port 9200 (mysql status)
  159. # - set to 'down' if response is '503 Service Unavailable'
  160. # - set to 'up' if response is '200 OK'
  161. #
  162. backend db01_status
  163. mode tcp
  164. balance roundrobin
  165. option tcpka
  166. option httpchk
  167. server db01 172.16.0.61:3306 check port 9200 inter 1s rise 2 fall 2
  168.  
  169. backend db02_status
  170. mode tcp
  171. balance roundrobin
  172. option tcpka
  173. option httpchk
  174. server db02 172.16.0.62:3306 check port 9200 inter 1s rise 2 fall 2
  175.  
  176. # DB write cluster
  177. # Failover scenarios:
  178. # - replication 'up' on db01 & db02 = writes to db01
  179. # - replication 'down' on db02 = writes to db01
  180. # - replication 'down' on db01 = writes to db02
  181. # - replication 'down' on db01 & db02 = go nowhere, split-brain, cluster FAIL!
  182. # - mysql 'down' on db02 = writes to db01_backup
  183. # - mysql 'down' on db01 = writes to db02_backup
  184. # - mysql 'down' on db01 & db02 = go nowhere, cluster FAIL!
  185. #
  186. backend cluster_db_write
  187. #
  188. # - max 1 db server available at all times
  189. # - db01 is preferred (top of list)
  190. # - db_backups set their 'up' or 'down' based on results from monitor_dbs
  191. #
  192. mode tcp
  193. option tcpka
  194. balance roundrobin
  195. option httpchk GET /dbs
  196. server db01 172.16.0.61:3306 weight 1 check port 9201 inter 1s rise 2 fall 1
  197. server db02 172.16.0.62:3306 weight 1 check port 9201 inter 1s rise 2 fall 1 backup
  198. server db01_backup 172.16.0.61:3306 weight 1 check port 9301 inter 1s rise 2 fall 2 addr 127.0.0.1 backup
  199. server db02_backup 172.16.0.62:3306 weight 1 check port 9302 inter 1s rise 2 fall 2 addr 127.0.0.1 backup
  200.  
  201. # DB read cluster
  202. # Failover scenarios
  203. # - replication 'up' on db01 & db02 = reads on db01, db02, all db_slaves
  204. # - replication 'down' on db02 = reads on db01, slaves of db01
  205. # - replication 'down' on db01 = reads on db02, slaves of db02
  206. # - replication 'down' on db01 & db02 = reads on db01_splitbrain and db01_splitbrain only
  207. # - mysql 'down' on db02 = reads on db01_backup, slaves of db01
  208. # - mysql 'down' on db01 = reads on db02_backup, slaves of db02
  209. # - mysql 'down' on db01 & db02 = go nowhere, cluster FAIL!
  210. #
  211. backend cluster_db_read
  212. #
  213. # - max 2 master db servers available at all times
  214. # - max N slave db servers available at all times except during split-brain
  215. # - dbs track 'up' and 'down' of dbs in the cluster_db_write
  216. # - db_backups track 'up' and 'down' of db_backups in the cluster_db_write
  217. # - db_splitbrains set their 'up' or 'down' based on results from monitor_splitbrain
  218. #
  219. mode tcp
  220. option tcpka
  221. balance roundrobin
  222. option httpchk GET /dbs
  223. server db01 172.16.0.61:3306 weight 1 track cluster_db_write/db01
  224. server db02 172.16.0.62:3306 weight 1 track cluster_db_write/db02
  225. server db01_backup 172.16.0.61:3306 weight 1 track cluster_db_write/db01_backup
  226. server db02_backup 172.16.0.62:3306 weight 1 track cluster_db_write/db02_backup
  227. server db01_splitbrain 172.16.0.61:3306 weight 1 check port 9300 inter 1s rise 1 fall 2 addr 127.0.0.1
  228. server db02_splitbrain 172.16.0.62:3306 weight 1 check port 9300 inter 1s rise 1 fall 2 addr 127.0.0.1
  229. #
  230. # Scaling & redundancy options
  231. # - db_slaves set their 'up' or 'down' based on results from monitor_dbs
  232. # - db_slaves should take longer to rise
  233. #
  234. server db03_slave 172.16.0.63:3306 weight 1 check port 9303 inter 1s rise 5 fall 1 addr 127.0.0.1
  235. server db04_slave 172.16.0.64:3306 weight 1 check port 9304 inter 1s rise 5 fall 1 addr 127.0.0.1
  236. server db05_slave 172.16.0.65:3306 weight 1 check port 9305 inter 1s rise 5 fall 1 addr 127.0.0.1

Report this snippet  

Comments

RSS Icon Subscribe to comments
Posted By: alexwilliams on August 7, 2009

I will provide more explanations of this configuration on my blog within the next few days. Subscribe to my RSS feed to stay informed.

Posted By: alexwilliams on August 12, 2009

Blog post has been written here: http://alexwilliams.ca/blog/2009/08/10/using-haproxy-for-mysql-failover-and-redundancy/

Posted By: alexwilliams on August 12, 2009

err.... here's the correct link

Posted By: alexwilliams on August 12, 2009

one last try

You need to login to post a comment.