Replicating from AWS RDS MySQL to an external slave (without downtime)

We’ve recently needed to create an external copy of a large database running on Amazon RDS, with minimal or no downtime. The database is a backend to a busy site, and our goal was to create a replica in our data centre without causing any disruption to our users. With Amazon having added support for MySQL 5.6, we're now able to access the binary logs from an external location, which wasn’t possible before.

As MySQL replication only works from a lower version to an equal or higher version, we had to ensure that both our databases were on MySQL 5.6. This was simple for the external slave but not as easy for the RDS instance, which was on MySQL 5.1. Upgrading the RDS instance in place would require a reboot for each version step, i.e. 5.1 -> 5.5 -> 5.6. As per the recommendation in the Amazon upgrade guide, we created a read replica and upgraded it to 5.6. With the replica synced up, we also needed to enable automated backups before it was in a state where it could be used as a replication source.
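For reference, the replica creation, version upgrades and backup setting can all be driven from the AWS CLI. The sketch below is just that, a sketch: the instance identifier mydb, the replica name, the engine versions and the retention period are placeholders to swap for your own values.

# create a read replica of the production instance
aws rds create-db-instance-read-replica \
    --db-instance-identifier mydb-replica \
    --source-db-instance-identifier mydb

# upgrade the replica one major version at a time (each step reboots the instance)
aws rds modify-db-instance --db-instance-identifier mydb-replica \
    --engine-version 5.5.46 --allow-major-version-upgrade --apply-immediately
aws rds modify-db-instance --db-instance-identifier mydb-replica \
    --engine-version 5.6.27 --allow-major-version-upgrade --apply-immediately

# enable automated backups so the replica can act as a replication source
aws rds modify-db-instance --db-instance-identifier mydb-replica \
    --backup-retention-period 7 --apply-immediately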

Creating an initial database dump proved tricky, as the actual time to create the backup was around 40-50 minutes. The import time into the external slave was around 3-4 hours, and with the site being as active as it is, the binary log file and position change pretty quickly. The best option would be to stop the RDS slave while the backup is happening. Due to the permissions given to the ‘master’ user by Amazon, running a STOP SLAVE command returns an error:

ERROR 1045 (28000): Access denied for user 'admin'@'%' (using password: YES)

Luckily there’s a stored procedure which can be used to stop replication: mysql.rds_stop_replication

mysql> CALL mysql.rds_stop_replication;
+---------------------------+
| Message                   |
+---------------------------+
| Slave is down or disabled |
+---------------------------+
1 row in set (1.08 sec)

Query OK, 0 rows affected (1.08 sec)

With replication on the RDS slave stopped, we can start creating the backup, assured that no changes will be made during the process and that any locking of tables won’t affect users browsing the website.
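A sketch of the dump itself, assuming the upgraded replica’s endpoint, the admin user and a single database called myreplicateddb (swap in your own endpoint, credentials and database names). The --single-transaction option keeps the dump consistent for InnoDB tables without holding locks for the whole run:

mysqldump -h mydb-replica.xxxxxxxxxxxx.eu-west-1.rds.amazonaws.com -u admin -p \
    --single-transaction --routines --triggers \
    myreplicateddb > myreplicateddb.sql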

Once the backup completes, we’d want to start up replication again, but before doing this we can grab the binlog file and position:

mysql> show master status;
+----------------------------+----------+--------------+------------------+-------------------+
| File                       | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+----------------------------+----------+--------------+------------------+-------------------+
| mysql-bin-changelog.074036 | 11653042 |              |                  |                   |
+----------------------------+----------+--------------+------------------+-------------------+

This will be required when setting up the external slave later on. Now that we have the relevant information, we can start the replication. Again, we’d need to use the RDS mapping of START SLAVE:

mysql> CALL mysql.rds_start_replication;
+-------------------------+
| Message                 |
+-------------------------+
| Slave running normally. |
+-------------------------+


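Importing the dump into the external slave is just a matter of feeding it to the mysql client; a sketch assuming the dump file and database name from earlier (create the empty database first if it doesn’t already exist):

mysql -u root -p -e 'CREATE DATABASE IF NOT EXISTS myreplicateddb'
mysql -u root -p myreplicateddb < myreplicateddb.sql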
Once the dump has been imported, we can set the new master on the external slave with the values previously recorded:

CHANGE MASTER TO MASTER_HOST='AWS_RDS_SLAVE', MASTER_PASSWORD='SOMEPASS', MASTER_USER='REPL_USER', MASTER_LOG_FILE='mysql-bin-changelog.074036', MASTER_LOG_POS=11653042;
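This assumes a replication user (REPL_USER above) already exists on the RDS instance; if it doesn’t, one can be created via the RDS master user, roughly like so (ideally with the host locked down to your data centre’s address rather than %):

CREATE USER 'REPL_USER'@'%' IDENTIFIED BY 'SOMEPASS';
GRANT REPLICATION SLAVE ON *.* TO 'REPL_USER'@'%';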

Before we start the replication, we need to add a few more settings to the external slave’s my.cnf:

  • a unique server-id, i.e. one that’s not being used by any of the other MySQL servers involved in replication
  • the database(s) you want to replicate, with replicate-do-db. This stops the slave from trying to replicate the mysql schema and other RDS-specific internals. Thanks to Phil for picking that up.

So something like:

server-id = 739472
replicate-do-db=myreplicateddb
replicate-do-db=mysecondreplicateddb  # if more than one db needs to be replicated
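The my.cnf changes only take effect after restarting MySQL on the external slave, e.g. (assuming a sysvinit/upstart-style service):

sudo service mysql restart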

Start up replication on the external slave: START SLAVE;

This should start updating the slave, which you can monitor via

mysql> show slave status\G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
            Master_Log_File: mysql-bin-changelog.075782
        Read_Master_Log_Pos: 113973
             Relay_Log_File: relaylog.014354
              Relay_Log_Pos: 114089
      Relay_Master_Log_File: mysql-bin-changelog.074036
    Slave_SQL_Running_State: updating

The above values are the most important from the sea of information that the command returns. You’ll be waiting for Master_Log_File and Relay_Master_Log_File to become identical, and for Slave_SQL_Running_State to show ‘Slave has read all relay log; waiting for the slave I/O thread to update it’.
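A quick way to keep an eye on this (along with Seconds_Behind_Master) without re-typing the query, assuming the mysql client on the slave can log in from your shell:

watch -n 10 "mysql -e 'SHOW SLAVE STATUS\G' | grep -E 'Master_Log_File|Seconds_Behind_Master|Slave_SQL_Running_State'"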

Once that syncs up, an external replica has been created with zero downtime!
