MySQL/Galera is synchronous multi-master cluster for MySQL/InnoDB database. The application which commonly had access to your cluster can commit on each server, then Galera will replicate all the data accross the whole cluste. A commit is called an RBS event. For the high-availability purpose Galera needs a minimum of 3 machines even if it is able to work with 2 nodes. There are several replication methods like mysqldump (default) and Rsync. In order to manage the queue and the replication system, Galera uses a Certification-based replication feature. This functionnality provides an alternative againts the classic replication methods.
How does it work?
clients
| | |
V V V
,----------------.
| |
| application | <-- e.g. MySQL server
| |
================== <-- wsrep API
| wsrep provider | <-- e.g. Galera
`----------------'
|
V
replication to other nodes
Application writes on any node
The commit is replicated through the other nodes. Each transaction had an ID
True parallel replication, on row level and ID check
Most relevant features:
Synchronous replication
Easy to scale
Automatic node joining
Automatic membership control, failed nodes drop from the cluster
Active-active multi-master topology
No SPOF
Performance oriented
If you want to know more about the topologies you can built, see this page et ici
127.0.0.1 localhost
127.0.0.1 galera-node01
127.0.1.1 galera-node01
10.0.0.1 galera-node01
10.0.0.2 galera-node02
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
NIC parameters:
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet dhcp
auto eth1
iface eth1 inet static
address 10.0.0.1
netmask 255.0.0.0
NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MySQL SERVERS IN PRODUCTION USE! PLEASE READ EACH STEP CAREFULLY!
In order to log into MySQL to secure it, we'll need the current password for the root user. If you've just installed MySQL, and you haven't set the root password yet, the password will be blank, so you should just press enter here.
Enter current password for root (enter for none): OK, successfully used password, moving on...
Setting the root password ensures that nobody can log into the MySQL root user without the proper authorisation.
Set root password? [Y/n] Y New password: Re-enter new password: Password updated successfully! Reloading privilege tables.. ... Success!
By default, a MySQL installation has an anonymous user, allowing anyone to log into MySQL without having to have a user account created for them. This is intended only for testing, and to make the installation go a bit smoother. You should remove them before moving into a production environment.
Remove anonymous users? [Y/n] Y ... Success!
Normally, root should only be allowed to connect from 'localhost'. This ensures that someone cannot guess at the root password from the network.
Disallow root login remotely? [Y/n] Y ... Success!
By default, MySQL comes with a database named 'test' that anyone can access. This is also intended only for testing, and should be removed before moving into a production environment.
Remove test database and access to it? [Y/n] Y - Dropping test database... ... Success! - Removing privileges on test database... ... Success!
Reloading the privilege tables will ensure that all changes made so far will take effect immediately.
Reload privilege tables now? [Y/n] Y ... Success!
Cleaning up...
All done! If you've completed all of the above steps, your MySQL installation should now be secure.
Thanks for using MySQL!
Modify the bind-address parameter with the host-only NIC (eth1):
galera-node01:~$ sudo sed -i s/'127.0.0.1'/'10.0.0.1'/ /etc/mysql/my.cnf
The sed command simply does this on the node 01:
bind-address = 10.0.0.1
Don’t forget the second node.
##II.2.2. Galera configuration for a 2 nodes cluster
We configure the Galera replicator in /etc/mysq/conf.d/wsrep.cnf with:
We kill the temporary process and relaunch MySQL from the INIT script:
galera-node01:~$ sudo killall mysqld_safe galera-node01:~$ sudo service mysql restart * Stopping MySQL database server mysqld ...done. * Starting MySQL database server mysqld ...done.
Enter the MySQL shell, check the MySQL and wsrep version:
galera-node01:~$ mysql -u root -p Enter password: Welcome TO the MySQL monitor. Commands END WITH ; OR \g. Your MySQL connection id IS 6 Server version: 5.5.20 SOURCE distribution, wsrep_23.4.r3713
Copyright (c) 2000, 2010, Oracle AND/OR its affiliates. ALL rights reserved. This software comes WITH ABSOLUTELY NO WARRANTY. This IS free software, AND you are welcome TO MODIFY AND redistribute it UNDER the GPL v2 license
TYPE 'help;' OR '\h' FOR help. TYPE '\c' TO clear the CURRENT INPUT statement.
If wsrep_ready is ON, it works! You also have to check the wsrep_cluster_size parameter. If wsrep_ready is ON and wsrep_cluster_size is egal to number of the nodes, then your cluster is properly working. Sometimes wsrep_ready is ON and your wsrep_cluster_size is egal to 1. This situation is pretty bad because you obviously are in a split brain situation. To solve this verify your gcomm:// urls and restart MySQL.
I wrote a simple bash script to check this:
The simpliest way to avoid any split-brain situation is to setup the garbd daemon like so. Ideally the daemon will be running on an extra server, a load-balancer for example:
$ garbd -a gcomm://10.0.0.1:4567 -g my_wsrep_cluster -l /tmp/1.out -d
Simply choose one of your 2 nodes and garbd will do the rest. It will automatically connect to the second node and will be a member of the cluster. Thus your wsrep_cluster_size variable should be 3. If the dedicated link between the 2 nodes goes down, garbd will act as a replication relay, this can be really useful.
mysql> SHOW STATUS LIKE 'wsrep_cluster_size' ;" +--------------------+-------+ | Variable_name | Value | +--------------------+-------+ | wsrep_cluster_size | 3 | +--------------------+-------+
This script will perform a simple check of your galera state:
#!/bin/bash
echo"Enter your MySQL user" read MYSQL_USER
echo"Enter your MySQL password" stty -echo read MYSQL_PASSWD stty echo
STATUS=$(mysql -u$MYSQL_USER -p$MYSQL_PASSWD -N -s-e"show status like 'wsrep_ready';" | awk '{print $2}') SIZE=$(mysql -u$MYSQL_USER -p$MYSQL_PASSWD -N -s-e"show status like 'wsrep_cluster_size' ;" | awk '{print $2}' | sed -n '2p')
if [[ ${STATUS} = "ON" ]] ; then if [[ ${SIZE}-lt 2 ]] ; then echo"Split-brain!" else echo"Galera is perfectly working" fi else echo"The replication is NOT working" fi
The connection is well established between the nodes:
galera-node01:~$ sudo netstat -plantu | grep mysqld | grep ESTABLISHED tcp 0 0 10.0.0.1:4567 10.0.0.2:43370 ESTABLISHED 15082/mysqld tcp 0 0 10.0.0.1:39691 10.0.0.10:4567 ESTABLISHED 13929/mysqld
Here the 10.0.0.10 address is the address of the third machine, the one which host the garbd daemon.
Check the replication by creating databases, tables and fields :).
Important note: when your 2 node replication is setup, change the address of the node with this gcomm:// address by gcomm://other_node_address. You only have to use this address during the installation process. In our setup, it’s the node 01.
At the end you should have this:
node 01: gcomm://10.0.0.2
node 02: gcomm://10.0.0.1
If the node 01 goes down, the second node will continue to perform commits, when the node 01 will get back to life it will automatically synchronise with the node 02 and vice et versa. A kind of ‘cross replication’.
###II.2.3. Add a new node
When you add a new node, the new one is called joiner and the node which replicate is called donnor. To add a new node you have to:
Install the codership MySQL version
Install Galera
Configure the gcomm:// address with the donnor IP address
After that:
The new node will request the donnor using a SST request.
The state of the requested node will change from JOINED to DONOR
The donnor, will perform a mysqldump or a rsync sync. It depends on the SST methods in use. The SST method also implies several things. For instance both methods mysqldump and rsync are blocking option. It means that during the operation the tables will be lock on the DONOR node and only on this node, the other nodes of the cluster will be available on write. The donor will have a READ-ONLY state until the end of the process.
The donnor will load this dump on the new node.
###II.2.4. Using replication through SSL
Since the 0.8 version Galera provides the SSL replication, it prevents man-in-the-middle attacks and also a powerful authentication system based on certificates. It’s a really convenient way to allow specific nodes to connect the cluster.
For this purpose we have to active those options:
Generate your certs and copy them to the others nodes:
galera-node01:~/cert$ openssl req -new -x509 -days 365000 -nodes -keyout key.pem -out cert.pem Generating a 1024 bit RSA privatekey .......................++++++ .................++++++ writing newprivatekeyto'key.pem' ----- You are about to be asked to enter information that will be incorporated into your certificate request. What you are about to enter is what is called a Distinguished Name or a DN. There are quite a few fields but you can leave some blank For some fields there will be a default value, If you enter '.', the field will be left blank. ----- Country Name (2 letter code) [AU]: State or Province Name (full name) [Some-State]: Locality Name (eg, city) []: Organization Name (eg, company) [Internet Widgits Pty Ltd]: Organizational Unit Name (eg, section) []: Common Name (eg, YOUR name) []: Email Address []:
Be careful, here we have to restart from the beginning. Why? If we configure SSL, it means we have to do it on each server. Thereby we will modify each configuration file for each server. What will happened?
After my Galera Cluster was up and running, I had plenty of questions, some of them:
III.1. How galera prevents split-brain situation?
Split-brain is a state in which all the nodes of a cluster cannot determinate there membership. If a node is down for a couple of second due to a network issue, an other may think that this one is down and want to take the lead and launch the service. Each node in the cluster may mistakenly decide to run the service but the others nodes are still running. This will cause data corruption and duplicate instance. The solution offered by Galera is to use a little daemon called garbd, a stateless daemon create to avoid split-brain. This daemon is a part of the Galera replicator. As I previously said, the garbd daemon will be a member of the cluster and can act as a replication relay if the dedicated link between the nodes goes down.
No matter your topology (2 or more node). For example, if the node 02 goes down, the one with the gcomm://another_node_ipaddress, the data will keep be replicate to the other node. When the node 02 will go back, the data will be commited.
III.3. Change your gcomm:// address without downtime
For this purpose you have to change a variable in MySQL:
mysql> SHOW VARIABLES LIKE 'wsrep_cluster_address'; +-----------------------+------------------+ | Variable_name | VALUE | +-----------------------+------------------+ | wsrep_cluster_address | gcomm://10.0.0.3 | +-----------------------+------------------+ 1 ROW IN SET (0.00 sec)
mysql> SET GLOBAL wsrep_cluster_address='gcomm://10.0.0.2'; Query OK, 0 ROWS affected (3.51 sec)
mysql> SHOW VARIABLES LIKE 'wsrep_cluster_address'; +-----------------------+------------------+ | Variable_name | VALUE | +-----------------------+------------------+ | wsrep_cluster_address | gcomm://10.0.0.2 | +-----------------------+------------------+ 1 ROW IN SET (0.00 sec)
III.4. Check your version
Check your MySQL version:
mysql> SHOW GLOBAL VARIABLES LIKE 'version'; +---------------+--------+ | Variable_name | VALUE | +---------------+--------+ | version | 5.5.20 | +---------------+--------+ 1 ROW IN SET (0.00 sec)
Check your wsrep version:
mysql> SHOW GLOBAL STATUS LIKE 'wsrep_provider_version'; +------------------------+--------------+ | Variable_name | VALUE | +------------------------+--------------+ | wsrep_provider_version | 23.2.0(r120) | +------------------------+--------------+ 1 ROW IN SET (0.00 sec)
Comments