If you have followed the instructions but your replication setup is not working, the first thing to do is check the error log for messages. Many users have lost time by not doing this soon enough after encountering problems.
If you cannot tell from the error log what the problem was, try the following techniques:
Verify that the master has binary logging enabled by issuing a
SHOW MASTER STATUS
statement. Binary logging is enabled by default. If binary logging is enabled,Position
is nonzero. If binary logging is not enabled, verify that you are not running the master with any settings that disable binary logging, such as the--skip-log-bin
option.Verify that the master and slave both were started with the
--server-id
option and that the ID value is unique on each server.Verify that the slave is running. Use
SHOW SLAVE STATUS
to check whether theSlave_IO_Running
andSlave_SQL_Running
values are bothYes
. If not, verify the options that were used when starting the slave server. For example,--skip-slave-start
prevents the slave threads from starting until you issue aSTART SLAVE
statement.If the slave is running, check whether it established a connection to the master. Use
SHOW PROCESSLIST
, find the I/O and SQL threads and check theirState
column to see what they display. See Section 17.2.2, “Replication Implementation Details”. If the I/O thread state saysConnecting to master
, check the following:Verify the privileges for the user being used for replication on the master.
Check that the host name of the master is correct and that you are using the correct port to connect to the master. The port used for replication is the same as used for client network communication (the default is
3306
). For the host name, ensure that the name resolves to the correct IP address.Check that networking has not been disabled on the master or slave. Look for the
skip-networking
option in the configuration file. If present, comment it out or remove it.If the master has a firewall or IP filtering configuration, ensure that the network port being used for MySQL is not being filtered.
Check that you can reach the master by using
ping
ortraceroute
/tracert
to reach the host.
If the slave was running previously but has stopped, the reason usually is that some statement that succeeded on the master failed on the slave. This should never happen if you have taken a proper snapshot of the master, and never modified the data on the slave outside of the slave thread. If the slave stops unexpectedly, it is a bug or you have encountered one of the known replication limitations described in Section 17.4.1, “Replication Features and Issues”. If it is a bug, see Section 17.4.5, “How to Report Replication Bugs or Problems”, for instructions on how to report it.
If a statement that succeeded on the master refuses to run on the slave, try the following procedure if it is not feasible to do a full database resynchronization by deleting the slave's databases and copying a new snapshot from the master:
Determine whether the affected table on the slave is different from the master table. Try to understand how this happened. Then make the slave's table identical to the master's and run
START SLAVE
.If the preceding step does not work or does not apply, try to understand whether it would be safe to make the update manually (if needed) and then ignore the next statement from the master.
If you decide that the slave can skip the next statement from the master, issue the following statements:
mysql> SET GLOBAL sql_slave_skip_counter = N; mysql> START SLAVE;
The value of
N
should be 1 if the next statement from the master does not useAUTO_INCREMENT
orLAST_INSERT_ID()
. Otherwise, the value should be 2. The reason for using a value of 2 for statements that useAUTO_INCREMENT
orLAST_INSERT_ID()
is that they take two events in the binary log of the master.See also Section 13.4.2.5, “SET GLOBAL sql_slave_skip_counter Syntax”.
If you are sure that the slave started out perfectly synchronized with the master, and that no one has updated the tables involved outside of the slave thread, then presumably the discrepancy is the result of a bug. If you are running the most recent version of MySQL, please report the problem. If you are running an older version, try upgrading to the latest production release to determine whether the problem persists.