官网: https://github.com/yoshinorim/mha4mysql-manager/wiki

前提:
    1) 所有节点之间, 相互之间免密认证
    2) 所有节点yum源配置正常
    3) gw服务器有相关的安装包
    4) 系统时间要正确
    5) /etc/hosts要统一(本实验通过dns解析已经实现)

环境拓扑:
node01  10.15.200.101  Primary_Master   ServerID=101  主(读写)
node02  10.15.200.102  Candidate_Master ServerID=102  从(读)
node03  10.15.200.103  slave01          ServerID=103  从(读)
node05  10.15.200.105  mha              ServerID=105  高可用监控

node01为主服务器 node02为从服务器 node03为从服务器
当主服务器:node01故障时 node02由从提升为主 node03将主服务器重新指向node02(原为node01)

# 1. 所有节点(node01 node02 node03 node05) 清空之前实验的相关数据  (删除之前实验环境数据 清空yum旧缓存 更新yum新缓存
# 或是 直接还原快照
yum remove Percona-XtraDB-Cluster* httpd* php* Percona-Server* mariadb* -y
yum remove mysql-community-server -y
rm -fr /var/log/mysqld.log /var/lib/mysql/*
yum clean all && yum makecache

# 2. 在node01 node02 node03 安装mysql-community-5.7
yum -y install mysql-community-server

 

# 数据库配置文件已经放在gw (如果没有 请自行上传)

node01: wget http://10.15.200.8/files/mha/node01.my.cnf -O /etc/my.cnf
node02: wget http://10.15.200.8/files/mha/node02.my.cnf -O /etc/my.cnf
node03: wget http://10.15.200.8/files/mha/node03.my.cnf -O /etc/my.cnf
或者是这样的一条命令:
wget http://10.15.200.8/files/mha/`hostname -s`.my.cnf -O /etc/my.cnf


# 3. 在所有节点 安装MHA (node01 node02 node03 node05)
yum install mha4mysql-node-0.58 -y

 

主节点: node01 安装数据库 启动服务  配置用户名和密码
[root@node01 ~]# rm -fr /var/lib/mysql/* && mysqld --initialize && chown -R mysql:mysql /var/lib/mysql
[root@node01 ~]# systemctl start mysqld
[root@node01 ~]# grep 'temporary password' /var/log/mysqld.log | awk '{print $11}'
Fc-8L0tu?rMh

# 修改root密码 创建用户复制用户repl及mha管理用户mha
[root@node01 ~]# mysql -uroot -p'Fc-8L0tu?rMh'

ALTER USER 'root'@'localhost' IDENTIFIED BY '!@#qweASD69';
grant replication slave on *.*  to repl@'10.15.200.%' identified by '!@#qweASD69';
grant all on *.* to mha@'10.15.200.%' identified by '!@#qweASD69';
flush privileges;

因为要做主从实验,所以各个服务器的server-id不能相同,我们需要在这里进行查看,确保没有问题

 

 

node02: 同理类似
[root@node02 ~]# rm -fr /var/lib/mysql/* && mysqld --initialize --user=mysql
[root@node02 ~]# systemctl start mysqld
[root@node02 ~]# grep 'temporary password' /var/log/mysqld.log | awk '{print $11}'
G:aFM:4tkn4y
[root@node02 ~]# mysql -uroot -p'G:aFM:4tkn4y'

ALTER USER 'root'@'localhost' IDENTIFIED BY '!@#qweASD69';
grant replication slave on *.*  to repl@'10.15.200.%' identified by '!@#qweASD69';
grant all on *.* to mha@'10.15.200.%' identified by '!@#qweASD69';
flush privileges;

node02上配置从服务器:
CHANGE MASTER TO
MASTER_HOST='10.15.200.101',
MASTER_PORT=3306,
MASTER_AUTO_POSITION=1,
MASTER_USER='repl',
MASTER_PASSWORD='!@#qweASD69';

mysql> show slave status\G;
             Slave_IO_Running: No
            Slave_SQL_Running: No

mysql> start slave;

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.15.200.101
                  Master_User: repl
                  Master_Port: 3306
            重点是这里,得有


node03: 同理类似
[root@node03 ~]# rm -fr /var/lib/mysql/* && mysqld --initialize --user=mysql
[root@node03 ~]# systemctl start mysqld
[root@node03 ~]# grep 'temporary password' /var/log/mysqld.log | awk '{print $11}'
>uLMJdwI/1Tl
[root@node03 ~]# mysql -uroot -p'>uLMJdwI/1Tl'

ALTER USER 'root'@'localhost' IDENTIFIED BY '!@#qweASD69';
grant replication slave on *.*  to repl@'10.15.200.%' identified by '!@#qweASD69';
grant all on *.* to mha@'10.15.200.%' identified by '!@#qweASD69';
flush privileges;

node03上配置从服务器:
CHANGE MASTER TO
MASTER_HOST='10.15.200.101',
MASTER_PORT=3306,
MASTER_AUTO_POSITION=1,
MASTER_USER='repl',
MASTER_PASSWORD='!@#qweASD69';

mysql> show slave status\G;
             Slave_IO_Running: No
            Slave_SQL_Running: No

mysql> start slave;

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.15.200.101
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60


# ping虚拟ip 无法ping通
~$ ping 10.15.200.118
PING 10.15.200.118 (10.15.200.118): 56 data bytes
Request timeout for icmp_seq 0

node01: 先在主库master上绑定VIP(只需手工绑定一次,后续脚本会自动切换)
[root@node01 ~]# ifconfig ens33:1 10.15.200.118/24      # 注意如果是ens32做对应的修改

[root@node01 ~]# ip addr
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 10.15.200.118/24 brd 10.15.200.255 scope global secondary ens33:1
       valid_lft forever preferred_lft forever
       
~$ ping 10.15.200.118 -c2
PING 10.15.200.118 (10.15.200.118): 56 data bytes
64 bytes from 10.15.200.118: icmp_seq=0 ttl=64 time=1.431 ms

# node05安装管理节点
yum install mha4mysql-node-0.58 mha4mysql-manager-0.58 -y
mkdir -p /etc/mha/scripts
wget http://10.15.200.8/files/mha/master_ip_failover -O /etc/mha/scripts/master_ip_failover
wget http://10.15.200.8/files/mha/master_ip_online_change -O /etc/mha/scripts/master_ip_online_change
wget http://10.15.200.8/files/mha/send_report -O /etc/mha/scripts/send_report
chmod 744 /etc/mha/scripts/*

wget http://10.15.200.8/files/mha/app.cnf -O /etc/mha/app.cnf
wget http://10.15.200.8/files/mha/masterha_default.cnf -O /etc/masterha_default.cnf

node05:
/etc/mha/scripts/master_ip_failover           # 修改网址的名字  ens32 ens34 ens33
/etc/mha/scripts/master_ip_online_change    # 修改网址的名字  ens32 ens34 ens33

node05赋予脚本执行权限:
[root@node05 ~]# chmod 744 /etc/mha/scripts/*

# 用 masterha_check_ssh 命令检查 ssh 互信是否成功
[root@node05 ~]# masterha_check_ssh --conf=/etc/mha/app.cnf
Tue Mar  2 23:08:01 2021 - [info] All SSH connection tests passed successfully.

# 使用 masterha_check_repl 命令检查 mysql 主从是否正常
[root@node05 ~]# masterha_check_repl --conf=/etc/mha/app.cnf
MySQL Replication Health is OK.


启动MHA: 如下命令 会卡住不动
[root@node05 ~]# masterha_manager --conf=/etc/mha/app.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app/manager.log 2>&1

通过日志检查MHA是否启动成功:
[root@node05 ~]# tailf /var/log/mha/app/manager.log

Checking the Status of the script.. OK
Tue Mar  2 23:15:30 2021 - [info]  OK.
Tue Mar  2 23:15:30 2021 - [warning] shutdown_script is not defined.
Tue Mar  2 23:15:30 2021 - [info] Set master ping interval 1 seconds.
Tue Mar  2 23:15:30 2021 - [info] Set secondary check script: /usr/bin/masterha_secondary_check -s node01 -s node02 -s node03
Tue Mar  2 23:15:30 2021 - [info] Starting ping health check on node01(10.15.200.101:3306)..
Tue Mar  2 23:15:30 2021 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
# 最后一行出现如下字样表明启动成功
Tue Mar  2 23:15:30 2021 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

Mon Sep 20 23:43:27 2021 - [info] Got exit code 1 (Not master dead).
检查MHA集群状态:
[root@node05 ~]# masterha_check_status --conf=/etc/mha/app.cnf
app (pid:3412) is running(0:PING_OK), master:node01

将node01的mysql服务停止:
[root@node01 ~]# systemctl stop mysqld

node05动态的查看日志:
[root@node05 scripts]# tailf /var/log/mha/app/manager.log
IN SCRIPT TEST====/sbin/ifconfig ens33:1 down==/sbin/ifconfig ens33:1 10.15.200.118/24===
Enabling the VIP - 10.15.200.118/24 on the new master - node02
Warning: Permanently added 'node02,10.15.200.102' (ECDSA) to the list of known hosts.
Tue Mar  2 23:21:56 2021 - [info]  OK.
Tue Mar  2 23:21:56 2021 - [info] ** Finished master recovery successfully.
Tue Mar  2 23:21:56 2021 - [info] * Phase 3: Master Recovery Phase completed.
Tue Mar  2 23:21:56 2021 - [info]
Tue Mar  2 23:21:56 2021 - [info] * Phase 4: Slaves Recovery Phase..
Tue Mar  2 23:21:56 2021 - [info]
Tue Mar  2 23:21:56 2021 - [info]
Tue Mar  2 23:21:56 2021 - [info] * Phase 4.1: Starting Slaves in parallel..
Tue Mar  2 23:21:56 2021 - [info]
Tue Mar  2 23:21:56 2021 - [info] -- Slave recovery on host node03(10.15.200.103:3306) started, pid: 3884. Check tmp log /var/log/mha/app/node03_3306_20210302232154.log if it takes time..
Tue Mar  2 23:21:58 2021 - [info]
Tue Mar  2 23:21:58 2021 - [info] Log messages from node03 ...
Tue Mar  2 23:21:58 2021 - [info]
Tue Mar  2 23:21:56 2021 - [info]  Resetting slave node03(10.15.200.103:3306) and starting replication from the new master node02(10.15.200.102:3306)..
Tue Mar  2 23:21:56 2021 - [info]  Executed CHANGE MASTER.
Tue Mar  2 23:21:57 2021 - [info]  Slave started.
Tue Mar  2 23:21:57 2021 - [info]  gtid_wait(bfb0229c-7b5b-11eb-9393-000c29edf7ca:1-6,
eec2d10b-7b5e-11eb-afcc-000c2930884c:1-3) completed on node03(10.15.200.103:3306). Executed 0 events.
Tue Mar  2 23:21:58 2021 - [info] End of log messages from node03.
Tue Mar  2 23:21:58 2021 - [info] -- Slave on host node03(10.15.200.103:3306) started.
Tue Mar  2 23:21:58 2021 - [info] All new slave servers recovered successfully.
Tue Mar  2 23:21:58 2021 - [info]
Tue Mar  2 23:21:58 2021 - [info] * Phase 5: New master cleanup phase..
Tue Mar  2 23:21:58 2021 - [info]
Tue Mar  2 23:21:58 2021 - [info] Resetting slave info on the new master..
Tue Mar  2 23:21:58 2021 - [info]  node02: Resetting slave info succeeded.
Tue Mar  2 23:21:58 2021 - [info] Master failover to node02(10.15.200.102:3306) completed successfully.
Tue Mar  2 23:21:58 2021 - [info] Deleted server1 entry from /etc/mha/app.cnf .
Tue Mar  2 23:21:58 2021 - [info]

----- Failover Report -----

app: MySQL Master failover node01(10.15.200.101:3306) to node02(10.15.200.102:3306) succeeded

Master node01(10.15.200.101:3306) is down!

Check MHA Manager logs at node05.example.cn:/var/log/mha/app/manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on node01(10.15.200.101:3306)
Selected node02(10.15.200.102:3306) as a new master.
node02(10.15.200.102:3306): OK: Applying all logs succeeded.
node02(10.15.200.102:3306): OK: Activated master IP address.
node03(10.15.200.103:3306): OK: Slave started, replicating from node02(10.15.200.102:3306)
node02(10.15.200.102:3306): Resetting slave info succeeded.
Master failover to node02(10.15.200.102:3306) completed successfully.


# 虚拟IP已经漂移到node02  node02由从的角色 提升为主的角色
[root@node02 mysql]# ip addr
    inet 10.15.200.118/24 brd 10.15.200.255 scope global secondary ens33:1
       valid_lft forever preferred_lft forever

# 此时发现 node03的主服务器由node01变为node02
[root@node03 ~]# mysql -uroot -p'!@#qweASD' -e 'show slave status\G;'
mysql: [Warning] Using a password on the command line interface can be insecure.
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.15.200.102
                  
此时再次查看集群的状态 为失效的状态:
[root@node05 ~]# masterha_check_status --conf=/etc/mha/app.cnf
app is stopped(2:NOT_RUNNING).

[root@node05 ~]# cat /etc/mha/app.cnf    # 此文件中的[server1]段内容已经被删除

注意事项:
    1) master端同样要开启两个重要的选项,server-id和log-bin
        并且server-id在全局架构中并且唯一,不能被其它主机使用
        slave端要开启relay-log

https://github.com/yoshinorim/mha4mysql-manager/wiki

ERROR 1819 (HY000): Your password does not satisfy the current policy requirements
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement
 

Logo

华为开发者空间,是为全球开发者打造的专属开发空间,汇聚了华为优质开发资源及工具,致力于让每一位开发者拥有一台云主机,基于华为根生态开发、创新。

更多推荐