Published: 2022-01-18 16:51:28
Database architecture: one master and two slaves (master, slave1, slave2)
/sbin/ifconfig enp0s3:1
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet  netmask  broadcast
        inet6 fe80::5198:593b:cdc5:1f90  prefixlen 64  scopeid 0x20<link>
        ether 08:00:27:c0:45:0d  txqueuelen 1000  (Ethernet)
        RX packets 72386  bytes 9442794 (9.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 24221  bytes 2963104 (2.8 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
enp0s3:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet  netmask  broadcast
        ether 08:00:27:c0:45:0d  txqueuelen 1000  (Ethernet)
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet  netmask
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 84  bytes 9492 (9.2 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 84  bytes 9492 (9.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
virbr0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet  netmask  broadcast
        ether 52:54:00:f4:55:bb  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
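In the output above, the alias interface enp0s3:1 shares the MAC address of enp0s3 and is flagged RUNNING, which is what one would expect on the host that currently holds the virtual IP. As a minimal sketch (the function names and the embedded sample are illustrative, not part of MHA), the check can be automated by reading the interface header lines of `ifconfig` output:

```python
import re

# Two header lines copied from the ifconfig output above, used as sample input.
SAMPLE = """\
enp0s3:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 08:00:27:c0:45:0d  txqueuelen 1000  (Ethernet)
virbr0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 52:54:00:f4:55:bb  txqueuelen 1000  (Ethernet)
"""

def interface_flags(ifconfig_output):
    """Map each interface name to the set of flags on its header line."""
    flags = {}
    for line in ifconfig_output.splitlines():
        m = re.match(r"^(\S+): flags=\d+<([^>]*)>", line)
        if m:
            flags[m.group(1)] = set(m.group(2).split(","))
    return flags

def alias_is_running(ifconfig_output, alias="enp0s3:1"):
    """True if the alias interface exists and carries the RUNNING flag."""
    return "RUNNING" in interface_flags(ifconfig_output).get(alias, set())

if __name__ == "__main__":
    print(alias_is_running(SAMPLE))
```

Running the same check on each node after a failover is one way to see which host holds the alias, assuming the VIP is managed as an interface alias as it is here.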
2. Automatic failover test
mysqladmin -uroot -pmysql shutdown
[root@manager bin]# tail -f /var/log/masterha/app1/manager.log
Thu Oct 25 19:34:53 2018 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Thu Oct 25 19:34:53 2018 - [info] Executing SSH check script: exit 0
Thu Oct 25 19:34:53 2018 - [info] Executing secondary network check script: /usr/local/bin/masterha_secondary_check -s -s  --user=root  --master_host=  --master_ip=  --master_port=3306 --master_user=root --master_password=mysql --ping_type=SELECT
Thu Oct 25 19:34:53 2018 - [info] HealthCheck: SSH to is reachable.
Monitoring server is reachable, Master is not reachable from OK.
Thu Oct 25 19:34:54 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '' (111))
Thu Oct 25 19:34:54 2018 - [warning] Connection failed 2 time(s)..
Monitoring server is reachable, Master is not reachable from OK.
Thu Oct 25 19:34:54 2018 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Thu Oct 25 19:34:55 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '' (111))
Thu Oct 25 19:34:55 2018 - [warning] Connection failed 3 time(s)..
Thu Oct 25 19:34:56 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '' (111))
Thu Oct 25 19:34:56 2018 - [warning] Connection failed 4 time(s)..
Thu Oct 25 19:34:56 2018 - [warning] Master is not reachable from health checker!
Thu Oct 25 19:34:56 2018 - [warning] Master is not reachable!
Thu Oct 25 19:34:56 2018 - [warning] SSH is reachable.
Thu Oct 25 19:34:56 2018 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Thu Oct 25 19:34:56 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Oct 25 19:34:56 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Oct 25 19:34:56 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Oct 25 19:34:57 2018 - [info] GTID failover mode = 1
Thu Oct 25 19:34:57 2018 - [info] Dead Servers:
Thu Oct 25 19:34:57 2018 - [info]
Thu Oct 25 19:34:57 2018 - [info] Alive Servers:
Thu Oct 25 19:34:57 2018 - [info]
Thu Oct 25 19:34:57 2018 - [info]
Thu Oct 25 19:34:57 2018 - [info] Alive Slaves:
Thu Oct 25 19:34:57 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:57 2018 - [info]     GTID ON
Thu Oct 25 19:34:57 2018 - [info]     Replicating from
Thu Oct 25 19:34:57 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 19:34:57 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:57 2018 - [info]     GTID ON
Thu Oct 25 19:34:57 2018 - [info]     Replicating from
Thu Oct 25 19:34:57 2018 - [info] Checking slave configurations..
Thu Oct 25 19:34:57 2018 - [info] Checking replication filtering settings..
Thu Oct 25 19:34:57 2018 - [info]  Replication filtering check ok.
Thu Oct 25 19:34:57 2018 - [info] Master is down!
Thu Oct 25 19:34:57 2018 - [info] Terminating monitoring script.
Thu Oct 25 19:34:57 2018 - [info] Got exit code 20 (Master dead).
Thu Oct 25 19:34:57 2018 - [info] MHA::MasterFailover version 0.58.
Thu Oct 25 19:34:57 2018 - [info] Starting master failover.
Thu Oct 25 19:34:57 2018 - [info]
Thu Oct 25 19:34:57 2018 - [info] * Phase 1: Configuration Check Phase..
Thu Oct 25 19:34:57 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] GTID failover mode = 1
Thu Oct 25 19:34:58 2018 - [info] Dead Servers:
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] Checking master reachability via MySQL(double check)...
Thu Oct 25 19:34:58 2018 - [info]  ok.
Thu Oct 25 19:34:58 2018 - [info] Alive Servers:
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] Alive Slaves:
Thu Oct 25 19:34:58 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info]     GTID ON
Thu Oct 25 19:34:58 2018 - [info]     Replicating from
Thu Oct 25 19:34:58 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 19:34:58 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info]     GTID ON
Thu Oct 25 19:34:58 2018 - [info]     Replicating from
Thu Oct 25 19:34:58 2018 - [info] Starting GTID based failover.
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] Forcing shutdown so that applications never connect to the current master..
Thu Oct 25 19:34:58 2018 - [info] Executing master IP deactivation script:
Thu Oct 25 19:34:58 2018 - [info]   /usr/local/bin/master_ip_failover --orig_master_host= --orig_master_ip= --orig_master_port=3306 --command=stopssh --ssh_user=root  
Thu Oct 25 19:34:58 2018 - [info]  done.
Thu Oct 25 19:34:58 2018 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Oct 25 19:34:58 2018 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 3: Master Recovery Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] The latest binary log file/position on all slaves is mysql-bin.000010:707
Thu Oct 25 19:34:58 2018 - [info] Retrieved Gtid Set: a92f70a4-d5ea-11e8-af28-080027c0450d:7-9
Thu Oct 25 19:34:58 2018 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Oct 25 19:34:58 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info]     GTID ON
Thu Oct 25 19:34:58 2018 - [info]     Replicating from
Thu Oct 25 19:34:58 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 19:34:58 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info]     GTID ON
Thu Oct 25 19:34:58 2018 - [info]     Replicating from
Thu Oct 25 19:34:58 2018 - [info] The oldest binary log file/position on all slaves is mysql-bin.000010:707
Thu Oct 25 19:34:58 2018 - [info] Retrieved Gtid Set: a92f70a4-d5ea-11e8-af28-080027c0450d:7-9
Thu Oct 25 19:34:58 2018 - [info] Oldest slaves:
Thu Oct 25 19:34:58 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info]     GTID ON
Thu Oct 25 19:34:58 2018 - [info]     Replicating from
Thu Oct 25 19:34:58 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 19:34:58 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info]     GTID ON
Thu Oct 25 19:34:58 2018 - [info]     Replicating from
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 3.3: Determining New Master Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] Searching new master from slaves..
Thu Oct 25 19:34:58 2018 - [info]  Candidate masters from the configuration file:
Thu Oct 25 19:34:58 2018 - [info]  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Thu Oct 25 19:34:58 2018 - [info]     GTID ON
Thu Oct 25 19:34:58 2018 - [info]     Replicating from
Thu Oct 25 19:34:58 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Oct 25 19:34:58 2018 - [info]  Non-candidate masters:
Thu Oct 25 19:34:58 2018 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Thu Oct 25 19:34:58 2018 - [info] New master is
Thu Oct 25 19:34:58 2018 - [info] Starting master failover..
Thu Oct 25 19:34:58 2018 - [info]
From: (current master)
To: (new master)
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 3.3: New Master Recovery Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info]  Waiting all logs to be applied..
Thu Oct 25 19:34:58 2018 - [info]   done.
Thu Oct 25 19:34:58 2018 - [info] Getting new master's binlog name and position..
Thu Oct 25 19:34:58 2018 - [info]  mysql-bin.000010:747
Thu Oct 25 19:34:58 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu Oct 25 19:34:58 2018 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000010, 747, a92f70a4-d5ea-11e8-af28-080027c0450d:1-9,
Thu Oct 25 19:34:58 2018 - [info] Executing master IP activate script:
Thu Oct 25 19:34:58 2018 - [info]   /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host= --orig_master_ip= --orig_master_port=3306 --new_master_host= --new_master_ip= --new_master_port=3306 --new_master_user='root'   --new_master_password=xxx
Undefined subroutine &main::FIXME_xxx_create_user called at /usr/local/bin/master_ip_failover line 94.
Set read_only=0 on the new master.
Creating app user on the new master..
Thu Oct 25 19:34:58 2018 - [error][/usr/lib/perl5/vendor_perl/MHA/MasterFailover.pm, ln1612]  Failed to activate master IP address for with return code 10:0
Thu Oct 25 19:34:58 2018 - [warning] Proceeding.
Thu Oct 25 19:34:58 2018 - [info] ** Finished master recovery successfully.
Thu Oct 25 19:34:58 2018 - [info] * Phase 3: Master Recovery Phase completed.
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 4: Slaves Recovery Phase..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] * Phase 4.1: Starting Slaves in parallel..
Thu Oct 25 19:34:58 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info] -- Slave recovery on host started, pid: 20757. Check tmp log /var/log/masterha/app1/ if it takes time..
Thu Oct 25 19:34:59 2018 - [info]
Thu Oct 25 19:34:59 2018 - [info] Log messages from ...
Thu Oct 25 19:34:59 2018 - [info]
Thu Oct 25 19:34:58 2018 - [info]  Resetting slave and starting replication from the new master
Thu Oct 25 19:34:58 2018 - [info]  Executed CHANGE MASTER.
Thu Oct 25 19:34:58 2018 - [info]  Slave started.
Thu Oct 25 19:34:58 2018 - [info]  gtid_wait(a92f70a4-d5ea-11e8-af28-080027c0450d:1-9,
a92f70a4-d5ea-11e8-af28-080027c0450f:1-4) completed on Executed 0 events.
Thu Oct 25 19:34:59 2018 - [info] End of log messages from
Thu Oct 25 19:34:59 2018 - [info] -- Slave on host started.
Thu Oct 25 19:34:59 2018 - [info] All new slave servers recovered successfully.
Thu Oct 25 19:34:59 2018 - [info]
Thu Oct 25 19:34:59 2018 - [info] * Phase 5: New master cleanup phase..
Thu Oct 25 19:34:59 2018 - [info]
Thu Oct 25 19:34:59 2018 - [info] Resetting slave info on the new master..
Thu Oct 25 19:35:00 2018 - [info] Resetting slave info succeeded.
Thu Oct 25 19:35:00 2018 - [info] Master failover to completed successfully.
Thu Oct 25 19:35:00 2018 - [info] Deleted server1 entry from /etc/masterha/app1.cnf .
Thu Oct 25 19:35:00 2018 - [info]
----- Failover Report -----
app1: MySQL Master failover to succeeded
Master is down!
Check MHA Manager logs at manager:/var/log/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on
Selected as a new master.
OK: Applying all logs succeeded.
Failed to activate master IP address for with return code 10:0
OK: Slave started, replicating from
Resetting slave info succeeded.
Master failover to completed successfully.
Thu Oct 25 19:35:00 2018 - [info] Sending mail..
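The log above is long, and its single `[error]` entry is easy to miss: the failover itself completed, but the master IP activation step failed with return code 10. The preceding Perl message (`Undefined subroutine &main::FIXME_xxx_create_user ... line 94`) suggests that `/usr/local/bin/master_ip_failover` still contains a placeholder call left over from the sample script; this reading is an assumption, because the script itself is not shown here. The following sketch lists the phases that were started and the `[error]` lines in a manager log. The regular expressions are derived only from the line layout seen above, and the function name is hypothetical.

```python
import re

# Layout seen above: "<date> - [level][optional source] message"
LINE_RE = re.compile(
    r"^\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4} - \[(\w+)\](?:\[[^\]]*\])?\s*(.*)$"
)
# Phase markers start with one or more asterisks, e.g. "* Phase 2: ... Phase.."
PHASE_RE = re.compile(r"^\*+ (Phase [\d.]+: .+?)\.*$")

# A few lines copied from the log above, used as sample input.
SAMPLE = """\
Thu Oct 25 19:34:57 2018 - [info] * Phase 1: Configuration Check Phase..
Thu Oct 25 19:34:58 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Oct 25 19:34:58 2018 - [info] * Phase 2: Dead Master Shutdown Phase..
Undefined subroutine &main::FIXME_xxx_create_user called at /usr/local/bin/master_ip_failover line 94.
Thu Oct 25 19:34:58 2018 - [error][/usr/lib/perl5/vendor_perl/MHA/MasterFailover.pm, ln1612]  Failed to activate master IP address for with return code 10:0
"""

def summarize(log_text):
    """Return (phases started, error messages) found in a manager log."""
    phases, problems = [], []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # script output and report lines carry no timestamp
        level, message = m.groups()
        if level == "error":
            problems.append(message)
        p = PHASE_RE.match(message)
        if p and not message.endswith("completed."):
            phases.append(p.group(1))
    return phases, problems

if __name__ == "__main__":
    started, errors = summarize(SAMPLE)
    print(started)
    print(errors)
```

Lines without a timestamp, such as the Perl error printed by the script, are skipped by design, so they still need to be read in context next to the `[error]` entry they precede.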
mysql> show slave status \G
Empty set (0.00 sec)
mysql> show master status \G
*************************** 1. row ***************************
             File: mysql-bin.000010
         Position: 747
Executed_Gtid_Set: a92f70a4-d5ea-11e8-af28-080027c0450d:1-9,
1 row in set (0.00 sec)
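The manager log reported `Retrieved Gtid Set: a92f70a4-d5ea-11e8-af28-080027c0450d:7-9` for the slaves, and the new master now shows an executed set of `...:1-9` for the same UUID, so every transaction the slaves had received is already applied on the new master. A minimal sketch of that containment check follows. It is a hypothetical helper that expands intervals into Python sets, which is fine for small examples like this one but not for large production GTID sets. On the server, the built-in `GTID_SUBSET()` function is the more appropriate tool.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-9,uuid2:3:5-7' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma, as in the output above
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid, set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            numbers.update(range(int(start), int(end or start) + 1))
    return result

def is_subset(inner, outer):
    """True if every transaction in `inner` is also present in `outer`."""
    a, b = parse_gtid_set(inner), parse_gtid_set(outer)
    return all(nums <= b.get(uuid, set()) for uuid, nums in a.items())

if __name__ == "__main__":
    retrieved = "a92f70a4-d5ea-11e8-af28-080027c0450d:7-9"
    executed = "a92f70a4-d5ea-11e8-af28-080027c0450d:1-9,"
    print(is_subset(retrieved, executed))
```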
mysql> show variables like 'read_only';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only     | OFF   |
+---------------+-------+
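Taken together, an empty `show slave status` result and `read_only = OFF` are the two signs used here to confirm that this server has been promoted. The sketch below encodes that reasoning. It assumes the client output has been captured as text, and both function names are hypothetical.

```python
def parse_variables(table_text):
    """Turn 'show variables' table output into a {name: value} dict."""
    rows = {}
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Border lines yield a single cell; the header row is skipped by name.
        if len(cells) == 2 and cells[0] != "Variable_name":
            rows[cells[0]] = cells[1]
    return rows

def looks_like_promoted_master(slave_status_rows, variables):
    """True when no replication source is configured and writes are allowed."""
    writable = variables.get("read_only", "").upper() == "OFF"
    return len(slave_status_rows) == 0 and writable

if __name__ == "__main__":
    output = "| Variable_name | Value |\n| read_only     | OFF   |"
    # An empty list stands for the "Empty set" answer to show slave status.
    print(looks_like_promoted_master([], parse_variables(output)))
```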
mysql> show slave status \G
*************************** 1. row ***************************


