To make MySQL a totally reliable and available in cases of high load, database replication and clustering are a possible solution to achieve this
For more interesting posts on MySQL, please click on INDEX
As MySQL is one of the first open source databases, its replication capabilities have evolved in time.
What is Replication?
Replication enables data from one MySQL database server (master) to be replicated to one or more MySQL database servers (slave)
You can use replication to solve a number of different problems, including problems with performance, supporting the backup of different databases, and as part of a larger solution to possibly remedy system failures
The master server writes all database changes to the binary log – or binlog. The slave checks the binlog for these changes and writes them into a relay log. The relay log then writes these changes to the database.
Binary Logs
Replication is possible because of the binary log – or binlog.
We will need to understand how the binlog works in order to have control over the replication process and in order to be able to fix any problems that occur. The purpose of the binlog is to record changes made to the tables in the database. So the binlog does not record any queries that do not change data. The binary log contains “events” that describe these database changes such as table creation operations or changes to table data. It also contains events for statements that potentially could have made changes (for example, a DELETE statement which matched zero rows) – that is, unless row based logging is used. And the binary log also contains information about the execution time for each statement that updated data. The binlog is not just a single file, but a set of files that allows for easier database management – so you can remove old logs without disturbing newer ones. There is also a binlog index file, which keeps track of which binlog files exist. Only one binlog file is the active file – and this active file is the one that is currently being used for data writes. The binlog can then be used for replication, for point in time recovery in backups, and in some limited cases for auditing of data.
From the below command we can see the binlog events :
mysql> show binlog events\G
*************************** 1. row ***************************
Log_name: RESTORE-mysql-bin.000001
Pos: 4
Event_type: Format_desc
Server_id: 1
End_log_pos: 123
Info: Server ver: 5.7.35-log, Binlog ver: 4
The columns for each row are:
Log Name – the binlog file that is being referenced or was used for this statement.
Pos – this is the position in the file where the event starts – the first byte of the event The position is key in using the binlog to replicate data and when promo0ng slaves to masters.
Event Type – this is the type of event – there are about 27 different event types – such as Format_desc, Stop, Query, Xid, User var, Table_map, Update_rows, Rotate, Intvar
Server ID – the id of the server that created the event
End log position – the ending byte of the event – where this event ends and where the next one begins Info – information about the event – different information is printed for different events, but you can at least count on the query event to print the statement that it contains – unless you are using row based replication.
Replication Types
This is the list of the most used replication types for MySQL:
- Master-slave replication
- Master-master replication
- Group replication
- Multi-master cluster (available for MariaDB, which is the fork of MySQL)
Replication Architecture Basics :
This is how the events flow through a replication system from the master to the slaves:
1 – A session on the master accepts a statement from the client, executes the statement, and synchronizes with other sessions to ensure that each transaction is executed without conflicting with other changes made by other sessions.
2 – Just before the statement finishes execution, an entry consisting of one or more events is written to the binary log on the master.
3 – After the events have been written, a dump thread is created on the master when a slave I/O thread connects, the dump thread reads the events from the binary log, and sends them over to the slaves I/O thread. There is one dump thread per connected slave.
4 – When the slave I/O thread receives the event, it writes it to the end of the slave’s relay log.
5 – Once in the relay log, a slave SQL thread reads the event from the relay log and executes the event to apply the changes to the database on the slave. It is also responsible for coordinating with other MySQL threads to ensure changes do not interfere with other activities going on in the MySQL server.
MySQL Master-Slave Replication
Master-slave replication was the very first replication option for MySQL database. It assumes that you have a single Master server that accepts both reads and writes and one or more read-only Slave servers. Data from the master server are asynchronously replicated to Slave servers.
Pros
- It is very fast as doesn’t impose any restrictions on performance.
- You can split read and write requests to different servers. For example, all analytics queries can be made on Slave nodes.
Cons
- It is not very reliable because of asynchronous replication. It means that some committed on master transactions may be not available on slave if the master fails.
- Write requests can hardly be scaled. The only option to scale write requests is to increase compute capacity (RAM and CPU) of the Master node.
- Failover process is manual in a general case. You should take care of promotion replica node to master one.
MySQL Master-Master Replication
Master-master replication has been evolved from master-slave replication and solves its major issues. This type of replication assumes that you have two or more master nodes that can accept both read and write requests. In addition, you can have multiple slave nodes for each of your masters. The replication between master nodes is asynchronous.
Pros
- You have an option to scale write requests not only by increasing the computing capacity of a single master node but via adding additional master nodes.
- Failover semi-automatic because you have multiple master nodes. The chance that all master nodes fail simultaneously is very low. If any of master nodes fail, there is at least one more master node that can handle its requests.
Cons
- Due to asynchronous replication between all master nodes, you can lose some transaction in case one of the master nodes fail.
- Due to asynchronous replication, you can’t be sure that backups made on each master node contain the same data.
- Failover is still not fully automated in case you need to promote a Slave node to Master one.
MySQL MGR Replication
MySQL Group Replication feature is implemented by a MySQL Server plugin and provides replication in a completely new way based on distributed state machine architecture.
Group Replication allows creating fault-tolerant systems with redundancy that guarantees that even if some of the servers fail (as long as it is not a majority), the cluster still will be available. The unique feature of MGR replication is that it provides you with built-in automatic recovery and conflict resolution.
How MySQL Group Replication can help you:
- Removes the need for manual switching in case of failure of one of the servers.
- Provides fault tolerance.
- Allows you to build a system with the ability to change data on any server.
- Provides automatic reconfiguration.
Pros
- Automatic failover in case the Master node fails. Servers that belong to the same Group will elect new Master automatically. Therefore, MySQL MGR replication can be considered as a Clustered solution.
- It is synchronous replication in general, so you can be confident that you will not lose committed data in case of Master node fails.
- You can scale both reads and writes by adding new Master and Slave nodes.
- It is doesn’t impose big performance restrictions because it is enough to have only 2 Master nodes for a full-fledged MySQL cluster.
Cons
- It is available only for MySQL, but not for its forks: MariaDB and Percona.
- One Group is limited to 9 nodes.