Deploying an OpenStack Rocky High-Availability Cluster on CentOS 7 (3-1)

Odele · Updated: 2024-09-20

1. Preparation

1.1 Hardware and software preparation

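Three controller hosts (three NICs each) + three compute nodes (two NICs each, also serving as distributed-storage OSDs) + one storage node (two NICs, running the Ceph MON)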

1.1.1 Components

Component	Version	Notes
CentOS	CentOS Linux release 7.7.1908 (Core)
OpenStack	Rocky
Ceph	ceph version 12.2.12 luminous (stable)

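1.1.2 Network topology (logical)

[Figure: logical network topology]

1. The controller nodes (cont01, cont02, cont03) run MariaDB Galera, HAProxy, Pacemaker, Corosync, Keystone, Glance, Horizon, the management components of Nova, Neutron and Cinder, and the basic services OpenStack depends on.

2. The compute nodes (comp01, comp02, comp03) run nova-compute, neutron-openvswitch-agent, ceph-osd, and so on.

3. The storage control node (mon01) runs ceph-mon and ceph-mgr (for production, deploying three such nodes is recommended as well).

4. Controller node networks:

Management network --> eth0: 192.168.10.0/24, used for managing the host OS, APIs, etc. of the OpenStack cluster

Tenant tunnel (VM) network --> eth1: 10.10.10.0/24, used for data exchange between guest OSes inside the cluster, using VXLAN/GRE, etc.

External network --> eth2: mainly used for compute nodes' access to the internet and for external floating IPs

PS: the management network here combines the Management network and the API network.

5. Compute node networks:

Management network --> eth0: 192.168.10.0/24, used for managing the host OS, APIs, etc. of the OpenStack cluster

Tenant tunnel network --> eth1: 10.10.10.0/24, used for data exchange between guest OSes inside the cluster, using VXLAN/GRE, etc.

6. HAProxy provides high availability at the front end, MariaDB Galera serves as the database, and Ceph provides the distributed storage cluster.

7. Stateless services such as xxx-api run in active/active mode. Stateful services such as neutron-xxx-agent and cinder-volume should run in active/passive mode (because HAProxy sits in front, successive requests from one client may be forwarded to different controller nodes, and a request that lands on a controller node without the relevant state may fail). Services with their own clustering mechanism, such as RabbitMQ and Memcached, use that mechanism.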

1.1.3 Physical topology (partial, for reference)

[Figure: physical topology]

1.2 Configure the hosts file (on every machine in the cluster)

[root@cont02:/root]# vim /etc/hosts
192.168.10.21   cont01
192.168.10.22   cont02
192.168.10.23	  cont03
192.168.10.20	  VirtualIP
192.168.10.24   mon01
192.168.10.19   comp01
192.168.10.18	  comp02
192.168.10.17	  comp03
[root@cont02:/root]# scp /etc/hosts cont01:/etc/
[root@cont02:/root]# scp /etc/hosts cont03:/etc/
[root@cont02:/root]# scp /etc/hosts mon01:/etc/
[root@cont02:/root]# scp /etc/hosts comp01:/etc/
[root@cont02:/root]# scp /etc/hosts comp02:/etc/
[root@cont02:/root]# scp /etc/hosts comp03:/etc/
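
The scp commands above can be collapsed into a loop. The following is only a sketch under stated assumptions: the function name push_hosts and the DRY_RUN variable are made up for illustration, and the PEERS list assumes the commands are run from cont02. By default it prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy /etc/hosts from cont02 to every other node.
# PEERS assumes the current host is cont02; adjust it when running elsewhere.
PEERS="cont01 cont03 mon01 comp01 comp02 comp03"

push_hosts() {
    local node
    for node in $PEERS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run (default): print the command instead of executing it.
            echo "scp /etc/hosts ${node}:/etc/"
        else
            scp /etc/hosts "${node}:/etc/"
        fi
    done
}

push_hosts
```

Running `DRY_RUN=0 push_hosts` performs the copies. Until passwordless SSH is set up, scp will prompt for each node's password.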

1.3 Set up passwordless SSH login between nodes

[root@cont01:/root]# ssh-keygen
ssh-copy-id cont02
ssh-copy-id cont03
ssh-copy-id mon01
ssh-copy-id comp01
ssh-copy-id comp02
ssh-copy-id comp03
[root@cont02:/root]# ssh-keygen
ssh-copy-id cont01
ssh-copy-id cont03
ssh-copy-id mon01
ssh-copy-id comp01
ssh-copy-id comp02
ssh-copy-id comp03
[root@cont03:/root]# ssh-keygen
ssh-copy-id cont01
ssh-copy-id cont02
ssh-copy-id mon01
ssh-copy-id comp01
ssh-copy-id comp02
ssh-copy-id comp03
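
Each controller above pushes its key to every node except itself. A minimal sketch of that pattern follows; the names ALL_NODES, peers_of and copy_key_to_peers are illustrative assumptions, and the node list is taken from the hosts file in section 1.2.

```shell
#!/usr/bin/env bash
# All cluster hostnames, taken from the /etc/hosts entries in section 1.2.
ALL_NODES="cont01 cont02 cont03 mon01 comp01 comp02 comp03"

# Print every node except the one given as $1 (default: this host's short name).
peers_of() {
    local self="${1:-$(hostname -s)}" node
    for node in $ALL_NODES; do
        [ "$node" = "$self" ] || echo "$node"
    done
}

# Hypothetical wrapper: push the local public key to every peer.
# ssh-copy-id still asks for each peer's password once.
copy_key_to_peers() {
    local node
    for node in $(peers_of "$1"); do
        ssh-copy-id "$node"
    done
}
```

On cont01, for example, `ssh-keygen` followed by `copy_key_to_peers cont01` would replace the six ssh-copy-id lines.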

1.4 Configure the NTP service

1.4.1 Configure the controller nodes (cont01, cont02, cont03)

[root@cont02:/root]# yum install system-config-date -y
[root@cont02:/root]# vim /etc/chrony.conf
server 0.centos.pool.ntp.org iburst
server 192.168.10.21    iburst
server 192.168.10.22    iburst
server 192.168.10.23    iburst
# Allow NTP client access from local network.
allow 192.168.0.0/16
[root@cont02:/root]# scp /etc/chrony.conf cont01:/etc/chrony.conf
[root@cont02:/root]# scp /etc/chrony.conf cont03:/etc/chrony.conf  
[root@cont01:/root]# systemctl enable chronyd && systemctl restart chronyd
[root@cont01:/root]# systemctl status chronyd
[root@cont02:/root]# systemctl enable chronyd && systemctl restart chronyd
[root@cont02:/root]# systemctl status chronyd
[root@cont03:/root]# systemctl enable chronyd && systemctl restart chronyd
[root@cont03:/root]# systemctl status chronyd
[root@cont01:/root]# chronyc sources
210 Number of sources = 3
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^? cont02                        0   7     0     -     +0ns[   +0ns] +/-    0ns
^? cont03                        0   7     0     -     +0ns[   +0ns] +/-    0ns
^* 119.28.206.193                2   6    37    21   +197us[+1847us] +/-   55ms
[root@cont02:/root]# chronyc sources
210 Number of sources = 3
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^? cont01                        0   7     0     -     +0ns[   +0ns] +/-    0ns
^? cont03                        0   7     0     -     +0ns[   +0ns] +/-    0ns
^* 162.159.200.123               3   6    37    51  +2273us[ +143us] +/-  121ms
[root@cont03:/root]# chronyc sources
210 Number of sources = 3
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^? cont01                        0   8     0     -     +0ns[   +0ns] +/-    0ns
^? cont02                        0   8     0     -     +0ns[   +0ns] +/-    0ns
^? tick.ntp.infomaniak.ch        0   8     0     -     +0ns[   +0ns] +/-    0ns
[root@cont02:/root]# ssh cont01 'date'; ssh cont03 'date' ; date
Tue Feb 11 20:32:16 CST 2020
Tue Feb 11 20:32:16 CST 2020
Tue Feb 11 20:32:16 CST 2020

1.4.2 Install the NTP software on the other nodes (mon01, comp01, comp02, comp03)

[root@mon01:/root]# yum install chrony -y
[root@comp01:/root]# yum install chrony -y
[root@comp02:/root]# yum install chrony -y
[root@comp03:/root]# yum install chrony -y
[root@mon01:/root]# vim /etc/chrony.conf
server 192.168.10.22    iburst
server 192.168.10.21    iburst
server 192.168.10.23    iburst
#server 3.centos.pool.ntp.org iburst
[root@comp01:/root]# vim /etc/chrony.conf
server 192.168.10.22    iburst
server 192.168.10.21    iburst
server 192.168.10.23    iburst
#server 3.centos.pool.ntp.org iburst
[root@comp02:/root]# vim /etc/chrony.conf
server 192.168.10.22    iburst
server 192.168.10.21    iburst
server 192.168.10.23    iburst
#server 3.centos.pool.ntp.org iburst
[root@comp03:/root]# vim /etc/chrony.conf
server 192.168.10.22    iburst
server 192.168.10.21    iburst
server 192.168.10.23    iburst
#server 3.centos.pool.ntp.org iburst
[root@mon01:/root]# systemctl enable chronyd && systemctl restart chronyd
[root@mon01:/root]# systemctl status chronyd
[root@comp01:/root]# systemctl enable chronyd && systemctl restart chronyd
[root@comp01:/root]# systemctl status chronyd
[root@comp02:/root]# systemctl enable chronyd && systemctl restart chronyd
[root@comp02:/root]# systemctl status chronyd
[root@comp03:/root]# systemctl enable chronyd && systemctl restart chronyd
[root@comp03:/root]# systemctl status chronyd
[root@mon01:/root]# chronyc sources
210 Number of sources = 3
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^? cont02                        0   7     0     -     +0ns[   +0ns] +/-    0ns
^? cont01                        0   7     0     -     +0ns[   +0ns] +/-    0ns
^? cont03                        0   7     0     -     +0ns[   +0ns] +/-    0ns
[root@mon01:/root]# timedatectl
      Local time: Wed 2020-01-15 11:22:23 CST
  Universal time: Wed 2020-01-15 03:22:23 UTC
        RTC time: Wed 2020-01-15 03:22:24
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a
[root@comp01:/root]# chronyc sources
210 Number of sources = 3
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^? cont02                        0   7     0     -     +0ns[   +0ns] +/-    0ns
^? cont01                        0   7     0     -     +0ns[   +0ns] +/-    0ns
^? cont03                        0   7     0     -     +0ns[   +0ns] +/-    0ns
[root@comp01:/root]# timedatectl
      Local time: Wed 2020-01-15 11:22:24 CST
  Universal time: Wed 2020-01-15 03:22:24 UTC
        RTC time: Wed 2020-01-15 03:22:24
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a
[root@comp02:/root]# chronyc sources
210 Number of sources = 3
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^? cont02                        0   7     0     -     +0ns[   +0ns] +/-    0ns
^? cont01                        0   7     0     -     +0ns[   +0ns] +/-    0ns
^? cont03                        0   7     0     -     +0ns[   +0ns] +/-    0ns
[root@comp02:/root]# timedatectl
      Local time: Wed 2020-01-15 11:22:24 CST
  Universal time: Wed 2020-01-15 03:22:24 UTC
        RTC time: Wed 2020-01-15 03:22:24
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: yes
NTP synchronized: no
 RTC in local TZ: no
      DST active: n/a
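
In the captures above, every source on mon01, comp01 and comp02 is marked "^?" and timedatectl reports "NTP synchronized: no", so these nodes had not yet locked onto a source when the output was taken. In chronyc output the currently selected source is marked "^*". The sketch below checks for that marker; the function name has_selected_source is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Succeeds if the `chronyc sources` output on stdin contains a selected
# source, i.e. a line starting with "^*".
has_selected_source() {
    grep -q '^\^\*'
}

# Example on a live node:
#   chronyc sources | has_selected_source || echo "no NTP source selected yet"
```

If the check keeps failing a few minutes after chronyd was restarted, it is worth verifying that the controller nodes themselves are synchronized first.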

1.5 Initial system settings

1.5.1 Disable the firewall and set SELinux to permissive on all nodes

systemctl stop firewalld && systemctl disable firewalld
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config

1.5.2 Edit limits.conf and sysctl.conf on all nodes

# The entries appended to /etc/security/limits.conf are missing from the source.
cat >>/etc/sysctl.conf<<eof
fs.file-max=655350
net.ipv4.ip_local_port_range = 1025 65000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.ip_nonlocal_bind = 1
eof
sysctl -p

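2. Deploy the MariaDB Galera cluster

2.1 Install MariaDB on the controller nodes (cont01, cont02, cont03)

Galera, i.e. Galera Cluster, is a shared-nothing, highly redundant high-availability solution designed for databases. Galera MariaDB is a MariaDB cluster with the Galera plugin integrated. Galera is multi-master by nature, so a Galera MariaDB cluster is not a traditional primary/standby cluster but a multi-master architecture.

2.1.1 Add the mariadb-10.4.11 yum repository on the controller nodes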

[root@cont01:/root]# vim /etc/yum.repos.d/mariadb.repo
[mariadb]
name = MariaDB
baseurl = http://mirror.netinch.com/pub/mariadb//mariadb-10.4.11/yum/centos7-amd64/
gpgcheck = 0
enabled = 1
[root@cont01:/root]# yum makecache
[root@cont02:/root]# cat > /etc/yum.repos.d/mariadb.repo << EOF
[mariadb]
name = MariaDB
baseurl = http://mirror.netinch.com/pub/mariadb//mariadb-10.4.11/yum/centos7-amd64/
gpgcheck = 0
enabled = 1
EOF
[root@cont02:/root]# yum makecache
[root@cont02:/root]# scp /etc/yum.repos.d/mariadb.repo cont03:/etc/yum.repos.d/
[root@cont03:/root]# yum makecache

2.1.2 Install the MariaDB packages with yum on the controller nodes and start the service

[root@cont01:/root]# yum -y install MariaDB-server MariaDB-client galera-4 rsync xinetd wget ntpdate
[root@cont01:/root]# systemctl enable mariadb && systemctl start mariadb
[root@cont01:/root]# systemctl status mariadb
[root@cont02:/root]# yum -y install MariaDB-server MariaDB-client galera-4 rsync xinetd wget ntpdate
[root@cont02:/root]# systemctl enable mariadb && systemctl start mariadb
[root@cont02:/root]# systemctl status mariadb
[root@cont03:/root]# yum -y install MariaDB-server MariaDB-client galera-4 rsync xinetd wget ntpdate
[root@cont03:/root]# systemctl enable mariadb && systemctl start mariadb
[root@cont03:/root]# systemctl status mariadb

2.1.3 Secure the database on the controller nodes, then log in, set the password and grant privileges


[root@cont02:/root]# mysql_secure_installation
Note: Disallow root login remotely? [Y/n] n
Password: typora#2019
[root@cont02:/root]# mysql
MariaDB [(none)]> SET password=PASSWORD('typora#2019');
// Can be skipped if the password has already been set
Query OK, 0 rows affected (0.002 sec)
MariaDB [(none)]> grant all privileges on *.* to 'root'@'%' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.002 sec)
MariaDB [(none)]> grant all privileges on *.* to 'mysql'@'%' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.008 sec)
MariaDB [(none)]> grant all privileges on *.* to 'root'@'localhost' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.002 sec)
MariaDB [(none)]> grant all privileges on *.* to 'mysql'@'localhost' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.008 sec)
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.001 sec)
MariaDB [(none)]> exit
Bye
[root@cont01:/root]# mysql_secure_installation
Note: Disallow root login remotely? [Y/n] n
Password: typora#2019
[root@cont01:/root]# mysql
MariaDB [(none)]> SET password=PASSWORD('typora#2019');
// Can be skipped if the password has already been set
Query OK, 0 rows affected (0.012 sec)
MariaDB [(none)]> grant all privileges on *.* to 'root'@'%' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.008 sec)
MariaDB [(none)]> grant all privileges on *.* to 'mysql'@'%' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.008 sec)
MariaDB [(none)]> grant all privileges on *.* to 'root'@'localhost' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.002 sec)
MariaDB [(none)]> grant all privileges on *.* to 'mysql'@'localhost' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.008 sec)
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.001 sec)
MariaDB [(none)]> exit
Bye
[root@cont03:/root]# mysql_secure_installation
Note: Disallow root login remotely? [Y/n] n
Password: typora#2019
[root@cont03:/root]#  mysql
MariaDB [(none)]>  SET password=PASSWORD('typora#2019');
// Can be skipped if the password has already been set
Query OK, 0 rows affected (0.009 sec)
MariaDB [(none)]> grant all privileges on *.* to 'root'@'localhost' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.002 sec)
MariaDB [(none)]> grant all privileges on *.* to 'mysql'@'localhost' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.008 sec)
MariaDB [(none)]> grant all privileges on *.* to 'root'@'%' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.016 sec)
MariaDB [(none)]> grant all privileges on *.* to 'mysql'@'%' identified by 'typora#2019' with grant option;
Query OK, 0 rows affected (0.008 sec)
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.001 sec)
MariaDB [(none)]>  exit
Bye

2.1.4 Configure the MariaDB Galera cluster (edit /etc/my.cnf.d/server.cnf on every controller node)

[root@cont02:/root]# vim /etc/my.cnf.d/server.cnf
[server]
# this is only for the mysqld standalone daemon
[mysqld]
#
# * Galera-related settings
#
[galera]
# Mandatory settings
##wsrep_on=ON   # enable fully synchronous replication
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.10.22,192.168.10.23,192.168.10.21"
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
##innodb_autoinc_lock_mode=2   # set the auto-increment lock mode to interleaved
##wsrep_slave_threads=8  # number of parallel replication threads; size it according to the CPU core count
#
# Allow server to accept connections on all interfaces.
#
bind-address=192.168.10.22
#
# Optional setting
#wsrep_slave_threads=1
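#innodb_flush_log_at_trx_commit=0   # flush the log to disk about once per second instead of at every commit
##wsrep_certify_nonPK=ON # generate a primary key for certification on tables without an explicit primary key; default ON
##query_cache_size=0      # disable the query cache
##wsrep_provider_options="gcache.size=4G" # size of the replication write-set cache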
wsrep_provider_options="gcache.size=1G"
wsrep_cluster_name=MariaDB-Galera-Cluster
wsrep_node_name=cont02
wsrep_node_address=192.168.10.22
wsrep_sst_method=rsync
##innodb_buffer_pool_size=2G
# this is only for embedded server
[embedded]
# This group is only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
# This group is only read by MariaDB-10.4 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mariadb-10.4]
[root@cont01:/root]# vim /etc/my.cnf.d/server.cnf
[server]
# this is only for the mysqld standalone daemon
[mysqld]
#
# * Galera-related settings
#
[galera]
# Mandatory settings
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.10.21,192.168.10.22,192.168.10.23"
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
#
# Allow server to accept connections on all interfaces.
#
bind-address=192.168.10.21
#
# Optional setting
#wsrep_slave_threads=1
#innodb_flush_log_at_trx_commit=0
wsrep_provider_options="gcache.size=1G"
wsrep_cluster_name=MariaDB-Galera-Cluster
wsrep_node_name=cont01
wsrep_node_address=192.168.10.21
wsrep_sst_method=rsync
# this is only for embedded server
[embedded]
# This group is only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
# This group is only read by MariaDB-10.4 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mariadb-10.4]
[root@cont03:/root]# vim /etc/my.cnf.d/server.cnf
[server]
# this is only for the mysqld standalone daemon
[mysqld]
#
# * Galera-related settings
#
[galera]
# Mandatory settings
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.10.23,192.168.10.22,192.168.10.21"
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
#
# Allow server to accept connections on all interfaces.
#
bind-address=192.168.10.23
#
# Optional setting
#wsrep_slave_threads=1
#innodb_flush_log_at_trx_commit=0
wsrep_provider_options="gcache.size=1G"
wsrep_cluster_name=MariaDB-Galera-Cluster
wsrep_node_name=cont03
wsrep_node_address=192.168.10.23
wsrep_sst_method=rsync
# this is only for embedded server
[embedded]
# This group is only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
# This group is only read by MariaDB-10.4 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mariadb-10.4]
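
The three server.cnf files above differ only in bind-address, wsrep_node_name and wsrep_node_address (plus the order of the addresses in wsrep_cluster_address). The sketch below prints the [galera] section for a given node; the function name render_galera_section is hypothetical, and using one fixed address order for all nodes is an assumption that the order is not significant.

```shell
#!/usr/bin/env bash
# Hypothetical generator: print the [galera] section for one controller node.
# Usage: render_galera_section <node_name> <node_ip>
render_galera_section() {
    local name="$1" ip="$2"
    cat <<EOF
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.10.21,192.168.10.22,192.168.10.23"
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=${ip}
wsrep_provider_options="gcache.size=1G"
wsrep_cluster_name=MariaDB-Galera-Cluster
wsrep_node_name=${name}
wsrep_node_address=${ip}
wsrep_sst_method=rsync
EOF
}

# Print the section for cont03 to stdout for review.
render_galera_section cont03 192.168.10.23
```

The output is meant to be reviewed and pasted into /etc/my.cnf.d/server.cnf by hand, not written to the file blindly.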

2.1.5 Stop mariadb.service on the controller nodes

[root@cont01:/root]# systemctl daemon-reload
[root@cont01:/root]# systemctl stop mariadb.service
[root@cont02:/root]# systemctl daemon-reload
[root@cont02:/root]# systemctl stop mariadb.service
[root@cont03:/root]# systemctl daemon-reload
[root@cont03:/root]# systemctl stop mariadb.service

2.1.6 Bootstrap the cluster

[root@cont02:/root]# mysqld_safe --defaults-file=/etc/my.cnf.d/server.cnf --user=mysql --wsrep-new-cluster &
[root@cont03:/root]# systemctl start mariadb
[root@cont03:/root]# systemctl status mariadb
[root@cont03:/root]# systemctl enable mariadb
[root@cont01:/root]# systemctl start mariadb
[root@cont01:/root]# systemctl status mariadb
[root@cont01:/root]# systemctl enable mariadb
Note: after the other two nodes have started successfully, restart the first node (deleting its data first):
[root@cont02:/root]# pkill -9 mysql
[1]+  Killed                  mysqld_safe --defaults-file=/etc/my.cnf.d/server.cnf --user=mysql --wsrep-new-cluster
[root@cont02:/root]# rm -rf /var/lib/mysql/*
[root@cont02:/root]# systemctl start mariadb.service
[root@cont02:/root]# systemctl enable mariadb.service
Created symlink from /etc/systemd/system/mysql.service to /usr/lib/systemd/system/mariadb.service.
Created symlink from /etc/systemd/system/mysqld.service to /usr/lib/systemd/system/mariadb.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/mariadb.service to /usr/lib/systemd/system/mariadb.service.
[root@cont02:/root]# systemctl status mariadb.service
Note: whenever the cluster has to be bootstrapped, the first node must be started with the following command:
galera_new_cluster

2.2 Log in to the database and check the cluster status

[root@cont02:/root]# mysql -uroot -p"typora#2019"
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 13
Server version: 10.4.11-MariaDB MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
+--------------------+
3 rows in set (0.002 sec)
MariaDB [(none)]> show status like 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+
1 row in set (0.002 sec)
MariaDB [(none)]> show status like 'wsrep%';
+-------------------------------+----------------------------------------------------
| Variable_name                 | Value                                               
+-------------------------------+----------------------------------------------------
| wsrep_local_state_uuid        | a2b7231d-383a-11ea-b086-83d5778583b5             
| wsrep_protocol_version        | 10                                                 
| wsrep_last_committed          | 5                                                   
| wsrep_replicated              | 0                                                   
| wsrep_replicated_bytes        | 0                                                   
| wsrep_repl_keys               | 0                                                   
| wsrep_repl_keys_bytes         | 0                                                   
| wsrep_repl_data_bytes         | 0                                                   
| wsrep_repl_other_bytes        | 0                                                   
| wsrep_received                | 3                                                   
| wsrep_received_bytes          | 256                                                 
| wsrep_local_commits           | 0                                                   
| wsrep_local_cert_failures     | 0                                                   
| wsrep_local_replays           | 0                                                   
| wsrep_local_send_queue        | 0                                                   
| wsrep_local_send_queue_max    | 1                                                   
| wsrep_local_send_queue_min    | 0                                                   
| wsrep_local_send_queue_avg    | 0                                                   
| wsrep_local_recv_queue        | 0                                                   
| wsrep_local_recv_queue_max    | 1                                                   
| wsrep_local_recv_queue_min    | 0                                                   
| wsrep_local_recv_queue_avg    | 0                                                   
| wsrep_local_cached_downto     | 5                                                   
| wsrep_flow_control_paused_ns  | 0                                                   
| wsrep_flow_control_paused     | 0                                                   
| wsrep_flow_control_sent       | 0                                                   
| wsrep_flow_control_recv       | 0                                                   
| wsrep_cert_deps_distance      | 0                                                   
| wsrep_apply_oooe              | 0                                                   
| wsrep_apply_oool              | 0                                                   
| wsrep_apply_window            | 0                                                   
| wsrep_commit_oooe             | 0                                                   
| wsrep_commit_oool             | 0                                                   
| wsrep_commit_window           | 0                                                   
| wsrep_local_state             | 4                                                   
| wsrep_local_state_comment     | Synced                                             
| wsrep_cert_index_size         | 0                                                   
| wsrep_causal_reads            | 0                                                   
| wsrep_cert_interval           | 0                                                   
| wsrep_open_transactions       | 0                                                   
| wsrep_open_connections        | 0                                                   
| wsrep_incoming_addresses      | AUTO,AUTO,AUTO                                     
| wsrep_cluster_weight          | 3                                                   
| wsrep_desync_count            | 0                                                   
| wsrep_evs_delayed             |                                                     
| wsrep_evs_evict_list          |                                                     
| wsrep_evs_repl_latency        | 0/0/0/0/0                                           
| wsrep_evs_state               | OPERATIONAL                                         
| wsrep_gcomm_uuid              | d5a23bb7-383a-11ea-b735-86c7a20615b4               
| wsrep_applier_thread_count    | 1                                                   
| wsrep_cluster_capabilities    |                                                     
| wsrep_cluster_conf_id         | 18446744073709551615                               
| wsrep_cluster_size            | 3                                                   
| wsrep_cluster_state_uuid      | a2b7231d-383a-11ea-b086-83d5778583b5               
| wsrep_cluster_status          | Primary                                             
| wsrep_connected               | ON                                                 
| wsrep_local_bf_aborts         | 0                                                   
| wsrep_local_index             | 0                                                   
| wsrep_provider_capabilities   | :MULTI_MASTER:CERTIFICATION:PARALLEL_APPLYING:TRX_REPLAY:ISOLATION:PAUSE:CAUSAL_READS:INCREMENTAL_WRITESET:UNORDERED:PREORDERED:STREAMING:NBO: |
| wsrep_provider_name           | Galera                                             
| wsrep_provider_vendor         | Codership Oy                    
| wsrep_provider_version        | 26.4.3(r4535)                                       
| wsrep_ready                   | ON                                                 
| wsrep_rollbacker_thread_count | 1                                                 
| wsrep_thread_count            | 2                                                   
+-------------------------------+---------------------------------------------
65 rows in set (0.002 sec)
MariaDB [(none)]> exit
Bye
[root@cont01:/var/lib/mysql]#  cat grastate.dat gvwstate.dat 
# GALERA saved state
version: 2.1
uuid:    a2b7231d-383a-11ea-b086-83d5778583b5
seqno:   -1
safe_to_bootstrap: 0
my_uuid: b3a02e5e-383a-11ea-ab01-07face985b92
#vwbeg
view_id: 3 ac080e34-383a-11ea-9d9d-22f77c5f5aa6 5
bootstrap: 0
member: ac080e34-383a-11ea-9d9d-22f77c5f5aa6 0
member: b3a02e5e-383a-11ea-ab01-07face985b92 0
member: d5a23bb7-383a-11ea-b735-86c7a20615b4 0
#vwend
[root@cont02:/var/lib/mysql]#  cat grastate.dat gvwstate.dat 
# GALERA saved state
version: 2.1
uuid:    a2b7231d-383a-11ea-b086-83d5778583b5
seqno:   -1
safe_to_bootstrap: 0
my_uuid: d5a23bb7-383a-11ea-b735-86c7a20615b4
#vwbeg
view_id: 3 ac080e34-383a-11ea-9d9d-22f77c5f5aa6 5
bootstrap: 0
member: ac080e34-383a-11ea-9d9d-22f77c5f5aa6 0
member: b3a02e5e-383a-11ea-ab01-07face985b92 0
member: d5a23bb7-383a-11ea-b735-86c7a20615b4 0
#vwend
[root@cont03:/var/lib/mysql]# cat grastate.dat gvwstate.dat 
# GALERA saved state
version: 2.1
uuid:    a2b7231d-383a-11ea-b086-83d5778583b5
seqno:   -1
safe_to_bootstrap: 0
my_uuid: ac080e34-383a-11ea-9d9d-22f77c5f5aa6
#vwbeg
view_id: 3 ac080e34-383a-11ea-9d9d-22f77c5f5aa6 5
bootstrap: 0
member: ac080e34-383a-11ea-9d9d-22f77c5f5aa6 0
member: b3a02e5e-383a-11ea-ab01-07face985b92 0
member: d5a23bb7-383a-11ea-b735-86c7a20615b4 0
#vwend
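
The status values queried above can also be checked non-interactively. The sketch below assumes the status is fed in as tab-separated name/value lines, which is what `mysql -N -B -e "show status like 'wsrep%'"` prints; the function name cluster_is_healthy is made up for illustration.

```shell
#!/usr/bin/env bash
# Reads "name<TAB>value" lines (as printed by `mysql -N -B`) on stdin and
# succeeds only if the node reports a full, primary, synced cluster.
# $1 = expected number of nodes (default 3).
cluster_is_healthy() {
    awk -v want="${1:-3}" '
        $1 == "wsrep_cluster_size"        { size = $2 }
        $1 == "wsrep_cluster_status"      { status = $2 }
        $1 == "wsrep_local_state_comment" { state = $2 }
        END { exit !(size == want && status == "Primary" && state == "Synced") }
    '
}

# Example on a controller node:
#   mysql -uroot -p -N -B -e "show status like 'wsrep%'" | cluster_is_healthy 3 \
#       && echo "cluster OK" || echo "cluster degraded"
```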

2.3 Install the clustercheck heartbeat script under /usr/bin on every controller node

[root@cont01:/root]# cd /usr/bin/
[root@cont01:/usr/bin]# wget https://raw.githubusercontent.com/olafz/percona-clustercheck/master/clustercheck
[root@cont01:/usr/bin]# chmod +x /usr/bin/clustercheck
[root@cont01:/usr/bin]# ls -ld /usr/bin/clustercheck
-rwxr-xr-x. 1 root root 3882 Jan 15 19:19 /usr/bin/clustercheck
[root@cont02:/root]# cd /usr/bin/
[root@cont02:/usr/bin]# wget https://raw.githubusercontent.com/olafz/percona-clustercheck/master/clustercheck
[root@cont02:/usr/bin]# chmod +x /usr/bin/clustercheck
[root@cont02:/usr/bin]# ls -ld /usr/bin/clustercheck
-rwxr-xr-x. 1 root root 3882 Jan 15 19:18 /usr/bin/clustercheck
[root@cont03:/root]# cd /usr/bin/
[root@cont03:/usr/bin]# wget https://raw.githubusercontent.com/olafz/percona-clustercheck/master/clustercheck
[root@cont03:/usr/bin]# chmod +x /usr/bin/clustercheck
[root@cont03:/usr/bin]# ls -ld /usr/bin/clustercheck
-rwxr-xr-x. 1 root root 3882 Jan 15 19:20 /usr/bin/clustercheck

3. Deploy the xinetd service

3.1 Install xinetd on the controller nodes to monitor the Galera service (cont01, cont02, cont03)

yum install xinetd -y

3.2 Create the xinetd configuration file on the controller nodes

[root@cont01:/root]# cat <<END >/etc/xinetd.d/mysqlchk
#default:on
##description:mysqlchk
service mysqlchk
{
        disable = no    
        flags   = REUSE
        socket_type     = stream
        port    = 9200
        wait    =       no
        user    = nobody
        server  = /usr/bin/clustercheck
        log_on_failure   += USERID
        only_from       = 0.0.0.0/0
        per_source      = UNLIMITED
}
END
[root@cont01:/root]# scp /etc/xinetd.d/mysqlchk cont02:/etc/xinetd.d/
[root@cont01:/root]# scp /etc/xinetd.d/mysqlchk cont03:/etc/xinetd.d/

3.3 Edit /etc/services on the controller nodes

echo 'mysqlchk 9200/tcp #mysqlchk'>>/etc/services

[root@cont02:/root]# vim /etc/services
Go to line 10101, comment out the services that use port 9200, and add a line for mysqlchk / clustercheck.
mysqlchk        9200/tcp                # mysqlchk
#wap-wsp         9200/tcp                # WAP connectionless session service
#wap-wsp         9200/udp                # WAP connectionless session service
[root@cont01:/root]# vim /etc/services
Go to line 10101, comment out the services that use port 9200, and add a line for mysqlchk / clustercheck.
mysqlchk        9200/tcp                # mysqlchk
#wap-wsp         9200/tcp                # WAP connectionless session service
#wap-wsp         9200/udp                # WAP connectionless session service
[root@cont03:/root]# vim /etc/services
Go to line 10101, comment out the services that use port 9200, and add a line for mysqlchk / clustercheck.
mysqlchk        9200/tcp                # mysqlchk
#wap-wsp         9200/tcp                # WAP connectionless session service
#wap-wsp         9200/udp                # WAP connectionless session service

3.4 Start the xinetd service on the controller nodes (cont01, cont02, cont03)

systemctl enable xinetd && systemctl start xinetd
systemctl status xinetd
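
Once xinetd is running, each controller should answer on port 9200 with the result of clustercheck. The script replies with HTTP 200 when the node is synced and 503 otherwise. The sketch below probes a node with curl; the function names interpret_clustercheck and probe_node are illustrative assumptions, and curl is assumed to be installed.

```shell
#!/usr/bin/env bash
# Map the HTTP status code returned on port 9200 to a short verdict.
interpret_clustercheck() {
    case "$1" in
        200) echo "synced" ;;
        503) echo "not synced" ;;
        *)   echo "unknown ($1)" ;;
    esac
}

# Hypothetical probe for one node; needs curl and a running xinetd.
probe_node() {
    local code
    code=$(curl -s -o /dev/null -w '%{http_code}' "http://$1:9200/")
    interpret_clustercheck "$code"
}

# Example:
#   for n in cont01 cont02 cont03; do echo "$n: $(probe_node "$n")"; done
```

A 503 may also mean that the script itself could not log in to the database, so its credentials are worth checking before suspecting the cluster.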

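3.5 Recovery from failure scenarios

① First check whether any node in the database cluster is still running. If one is, simply start the service on the failed nodes; when a single node has gone down, restarting the service on that node is usually all that is needed.

② If the database service is down on every node, find the node with the most recent state and start it with the --wsrep-new-cluster option; once it is up, start the service normally on the other nodes.
How to find the most recent node: read the seqno value from each node's grastate.dat file; the node with the largest value is the most recent. If seqno is -1 on every node, compare my_uuid with view_id in each node's gvwstate.dat file; the node where they match becomes the first node to start, and once it is running the other nodes can be started normally. If no node qualifies, manual intervention is needed to recover.

// Start
mysqld_safe --defaults-file=/etc/my.cnf.d/server.cnf --user=mysql --wsrep-new-cluster &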
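
The seqno comparison described in ② can be sketched as follows. This is an illustration under stated assumptions, not a tested recovery tool: the function names grastate_seqno and pick_bootstrap_node are made up, and the fallback comparison of gvwstate.dat is left to manual inspection.

```shell
#!/usr/bin/env bash
# Print the seqno stored in a grastate.dat file.
grastate_seqno() {
    awk '$1 == "seqno:" { print $2 }' "$1"
}

# Reads "node seqno" pairs on stdin and prints the node with the highest
# seqno. Prints nothing and fails if every seqno is -1.
pick_bootstrap_node() {
    awk '
        NF >= 2 && ($2 + 0) != -1 && (best == "" || ($2 + 0) > max) {
            max = $2 + 0
            best = $1
        }
        END { if (best == "") exit 1; print best }
    '
}

# Example, collecting the values over SSH:
#   for n in cont01 cont02 cont03; do
#       echo "$n $(ssh "$n" cat /var/lib/mysql/grastate.dat | grastate_seqno /dev/stdin)"
#   done | pick_bootstrap_node
```

When the function fails, every node reported -1 and the gvwstate.dat comparison from the text applies.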

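③ If neither of the above methods works, try the following procedure (it has succeeded in testing).

Step 1: If the database will not start, delete these files under /var/lib/mysql: grastate.dat, gvwstate.dat, galera.cache and mysql.sock, then restart.

Step 2: Change the cluster address in the configuration file to "gcomm://". This is a special address used only when a Galera cluster is bootstrapped. If the first node is shut down after the cluster is running, then before it is started again "gcomm://" must be changed to the cluster addresses of the other nodes. For the node that initiates the cluster the address is left empty, so it does not sync data from anyone.

Step 3: Start the service on the local node: systemctl start mariadb

Step 4: Bootstrap the cluster: galera_new_cluster

Step 5: Start mariadb on the remaining nodes: systemctl start mariadb

Step 6: Change "gcomm://" back and restart the database: systemctl restart mariadb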

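Status variables explained
Cluster integrity checks:
wsrep_cluster_state_uuid: should have the same value on every node in the cluster; a node with a different value is not connected to the cluster.
wsrep_cluster_conf_id: normally identical on all nodes. A different value means the node is temporarily partitioned; the value should converge again once network connectivity between the nodes is restored.
wsrep_cluster_size: if this matches the expected number of nodes, all cluster nodes are connected.
wsrep_cluster_status: state of the cluster component. Any value other than "Primary" indicates a partition or split-brain condition.
Node status checks:
wsrep_ready: ON means the node can accept SQL load. If it is OFF, check wsrep_connected.
wsrep_connected: if this is OFF and wsrep_ready is also OFF, the node is not connected to the cluster (possibly because wsrep_cluster_address, wsrep_cluster_name or a similar setting is wrong; check the error log for the specific cause).
wsrep_local_state_comment: if wsrep_connected is ON but wsrep_ready is OFF, this variable shows the reason.
Replication health checks:
wsrep_flow_control_paused: the fraction of time replication has been paused, i.e. how much the cluster is slowed down by slave lag. The value ranges from 0 to 1; the closer to 0 the better, and 1 means replication has stopped completely. Tuning wsrep_slave_threads can improve it.
wsrep_cert_deps_distance: how many transactions can be applied in parallel. wsrep_slave_threads should not be set much higher than this value.
wsrep_flow_control_sent: how many times this node has paused replication.
wsrep_local_recv_queue_avg: the average length of the slave transaction queue; an early sign of a slave bottleneck.
Database status of the current node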
MariaDB [(none)]> show status like 'wsrep_local_state_comment';
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
MariaDB [(none)]> show status like 'wsrep_cluster_size';      
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+
1 row in set (0.001 sec)
MariaDB [(none)]> show status like 'wsrep_cluster_state_uuid';
MariaDB [(none)]> show status like 'wsrep_cluster_status'; 
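State reference:
Open    : the node has started successfully and is trying to connect to the cluster
Primary : the node is already in the cluster; this state appears when a new node joins and a donor is being selected for database synchronization
Joiner  : the node is waiting to receive, or is receiving, the state transfer files
Joined  : the node has completed the data transfer but part of its data is not yet current; it is catching up with the cluster
Synced  : the node is serving normally; its data is consistent with the cluster
Donor   : the node has been selected as donor and is performing a full data sync for a newly joined node; it does not serve clients during this time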
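
The checks described above can be combined into one pass over the wsrep status variables. This is a minimal sketch rather than a complete monitoring solution: the function name galera_health is hypothetical, and it assumes tab-separated "name value" lines on standard input, which is what the mysql client prints with the -N and -B options.

```shell
#!/usr/bin/env bash
# galera_health EXPECTED_SIZE
# Reads tab-separated "variable<TAB>value" lines on stdin and prints either
# "OK" or one line per failed check. Returns 1 if any check failed.
galera_health() {
    local expected_size="$1"
    awk -F'\t' -v want="$expected_size" '
        { value[$1] = $2 }
        END {
            problems = 0
            if (value["wsrep_cluster_size"] != want) {
                printf "cluster size is %s, expected %s\n", value["wsrep_cluster_size"], want
                problems++
            }
            if (value["wsrep_cluster_status"] != "Primary") {
                printf "cluster status is %s, expected Primary\n", value["wsrep_cluster_status"]
                problems++
            }
            if (toupper(value["wsrep_ready"]) != "ON") {
                print "wsrep_ready is not ON"
                problems++
            }
            if (toupper(value["wsrep_connected"]) != "ON") {
                print "wsrep_connected is not ON"
                problems++
            }
            if (value["wsrep_local_state_comment"] != "Synced") {
                printf "local state is %s, expected Synced\n", value["wsrep_local_state_comment"]
                problems++
            }
            if (problems == 0) print "OK"
            exit (problems > 0)
        }'
}
```

Example call on a controller node: mysql -uroot -p -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_%'" | galera_health 3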

4. Deploy the RabbitMQ message queue

4.1 Install the rabbitmq and erlang packages on all controller nodes (cont01, cont02, cont03, ~~mon01~~)

// Node: cont01
[root@cont01:/root]# yum install epel-release -y
[root@cont01:/root]# yum install -y erlang rabbitmq-server.noarch
[root@cont01:/root]# systemctl enable rabbitmq-server && systemctl start rabbitmq-server
[root@cont01:/root]# systemctl status rabbitmq-server
[root@cont01:/root]# netstat -ntlpa | grep 5672
// Node: cont02
[root@cont02:/root]# yum install epel-release -y
[root@cont02:/root]# yum install -y erlang rabbitmq-server.noarch
[root@cont02:/root]# systemctl enable rabbitmq-server && systemctl start rabbitmq-server
[root@cont02:/root]# systemctl status rabbitmq-server
[root@cont02:/root]# netstat -ntlpa | grep 5672
// Node: cont03
[root@cont03:/root]# yum install epel-release -y
[root@cont03:/root]# yum install -y erlang rabbitmq-server.noarch
[root@cont03:/root]# systemctl enable rabbitmq-server && systemctl start rabbitmq-server
[root@cont03:/root]# systemctl status rabbitmq-server
[root@cont03:/root]# netstat -ntlpa | grep 5672
// Node: mon01
[root@mon01:/root]# yum install epel-release -y
[root@mon01:/root]# yum install -y erlang rabbitmq-server.noarch
[root@mon01:/root]# systemctl enable rabbitmq-server && systemctl start rabbitmq-server
[root@mon01:/root]# systemctl status rabbitmq-server
[root@mon01:/root]# netstat -ntlpa | grep 5672

4.2 Check the cluster status

(Note: the RabbitMQ configuration file is /etc/rabbitmq/rabbitmq.config)

[root@cont01:/root]# rabbitmqctl cluster_status
Cluster status of node rabbit@cont01 ...
[{nodes,[{disc,[rabbit@cont01]}]},
 {running_nodes,[rabbit@cont01]},
 {cluster_name,<>},
 {partitions,[]}]
...done.
[root@cont02:/root]# rabbitmqctl cluster_status
Cluster status of node rabbit@cont02 ...
[{nodes,[{disc,[rabbit@cont02]}]},
 {running_nodes,[rabbit@cont02]},
 {cluster_name,<>},
 {partitions,[]}]
...done.
[root@cont03:/root]# rabbitmqctl cluster_status
Cluster status of node rabbit@cont03 ...
[{nodes,[{disc,[rabbit@cont03]}]},
 {running_nodes,[rabbit@cont03]},
 {cluster_name,<>},
 {partitions,[]}]
...done.
[root@mon01:/root]# rabbitmqctl cluster_status
Cluster status of node rabbit@mon01 ...
[{nodes,[{disc,[rabbit@mon01]}]},
 {running_nodes,[rabbit@mon01]},
 {cluster_name,<>},
 {partitions,[]}]
...done.
//* Note: nodes listed under partitions usually indicate an abnormal state; restarting the rabbitmq-server service can recover from it *//

4.3 Edit the RabbitMQ environment file

[root@cont01:/root]# vim /etc/rabbitmq/rabbitmq-env.conf
##
RABBITMQ_NODE_PORT=5672
ulimit -S -n 4096
RABBITMQ_SERVER_ERL_ARGS="+K true +A30 +P 1048576 -kernel inet_default_connect_options [{nodelay,true},{raw,6,18,<<5000:64/native>>}] -kernel inet_default_listen_options [{raw,6,18,<<5000:64/native>>}]"
RABBITMQ_NODE_IP_ADDRESS=192.168.10.21
[root@cont02:/root]# vim /etc/rabbitmq/rabbitmq-env.conf
##
RABBITMQ_NODE_PORT=5672
ulimit -S -n 4096
RABBITMQ_SERVER_ERL_ARGS="+K true +A30 +P 1048576 -kernel inet_default_connect_options [{nodelay,true},{raw,6,18,<<5000:64/native>>}] -kernel inet_default_listen_options [{raw,6,18,<<5000:64/native>>}]"
RABBITMQ_NODE_IP_ADDRESS=192.168.10.22
[root@cont03:/root]# vim /etc/rabbitmq/rabbitmq-env.conf
##
RABBITMQ_NODE_PORT=5672
ulimit -S -n 4096
RABBITMQ_SERVER_ERL_ARGS="+K true +A30 +P 1048576 -kernel inet_default_connect_options [{nodelay,true},{raw,6,18,<<5000:64/native>>}] -kernel inet_default_listen_options [{raw,6,18,<<5000:64/native>>}]"
RABBITMQ_NODE_IP_ADDRESS=192.168.10.23
[root@mon01:/root]# vim /etc/rabbitmq/rabbitmq-env.conf 
##
RABBITMQ_NODE_PORT=5672
ulimit -S -n 4096
RABBITMQ_SERVER_ERL_ARGS="+K true +A30 +P 1048576 -kernel inet_default_connect_options [{nodelay,true},{raw,6,18,<<5000:64/native>>}] -kernel inet_default_listen_options [{raw,6,18,<<5000:64/native>>}]"
RABBITMQ_NODE_IP_ADDRESS=192.168.10.24

4.4 List the RabbitMQ plugins

[root@cont01:/root]# /usr/lib/rabbitmq/bin/rabbitmq-plugins list
[ ] amqp_client                       3.3.5
[ ] cowboy                            0.5.0-rmq3.3.5-git4b93c2d
[ ] eldap                             3.3.5-gite309de4
[ ] mochiweb                          2.7.0-rmq3.3.5-git680dba8
[ ] rabbitmq_amqp1_0                  3.3.5
[ ] rabbitmq_auth_backend_ldap        3.3.5
[ ] rabbitmq_auth_mechanism_ssl       3.3.5
[ ] rabbitmq_consistent_hash_exchange 3.3.5
[ ] rabbitmq_federation               3.3.5
[ ] rabbitmq_federation_management    3.3.5
[ ] rabbitmq_management               3.3.5
[ ] rabbitmq_management_agent         3.3.5
[ ] rabbitmq_management_visualiser    3.3.5
[ ] rabbitmq_mqtt                     3.3.5
[ ] rabbitmq_shovel                   3.3.5
[ ] rabbitmq_shovel_management        3.3.5
[ ] rabbitmq_stomp                    3.3.5
[ ] rabbitmq_test                     3.3.5
[ ] rabbitmq_tracing                  3.3.5
[ ] rabbitmq_web_dispatch             3.3.5
[ ] rabbitmq_web_stomp                3.3.5
[ ] rabbitmq_web_stomp_examples       3.3.5
[ ] sockjs                            0.3.4-rmq3.3.5-git3132eb9
[ ] webmachine                        1.10.3-rmq3.3.5-gite9359c7

4.5 Enable the RabbitMQ web management page on all nodes

rabbitmq-plugins enable rabbitmq_management
systemctl restart rabbitmq-server 
systemctl status rabbitmq-server

4.6 Check that the plugin loaded successfully (on success it is marked with E)

[root@cont01:/root]# /usr/lib/rabbitmq/bin/rabbitmq-plugins list
[e] amqp_client                       3.3.5
[ ] cowboy                            0.5.0-rmq3.3.5-git4b93c2d
[ ] eldap                             3.3.5-gite309de4
[e] mochiweb                          2.7.0-rmq3.3.5-git680dba8
[ ] rabbitmq_amqp1_0                  3.3.5
[ ] rabbitmq_auth_backend_ldap        3.3.5
[ ] rabbitmq_auth_mechanism_ssl       3.3.5
[ ] rabbitmq_consistent_hash_exchange 3.3.5
[ ] rabbitmq_federation               3.3.5
[ ] rabbitmq_federation_management    3.3.5
[E] rabbitmq_management               3.3.5
[e] rabbitmq_management_agent         3.3.5
[ ] rabbitmq_management_visualiser    3.3.5
[ ] rabbitmq_mqtt                     3.3.5
[ ] rabbitmq_shovel                   3.3.5
[ ] rabbitmq_shovel_management        3.3.5
[ ] rabbitmq_stomp                    3.3.5
[ ] rabbitmq_test                     3.3.5
[ ] rabbitmq_tracing                  3.3.5
[e] rabbitmq_web_dispatch             3.3.5
[ ] rabbitmq_web_stomp                3.3.5
[ ] rabbitmq_web_stomp_examples       3.3.5
[ ] sockjs                            0.3.4-rmq3.3.5-git3132eb9
[e] webmachine                        1.10.3-rmq3.3.5-gite9359c7

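Note: the RabbitMQ web management page listens on port 15672 by default; the RabbitMQ server listens on port 5672 by default.

4.7 Open http://192.168.10.22:15672/ in a browser

Note: on an Alibaba Cloud server, port 15672 must be opened (port 80 may also turn out to be occupied later on).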

Default account: guest/guest

4.8 On all nodes, change the guest password to adminrabbitmq (default user: guest, default password: guest)

[root@cont01:/root]# rabbitmqctl change_password guest adminrabbitmq
Changing password for user "guest" ...
...done.
[1]+  Done                    rabbitmqctl change_password guest admin
[root@cont02:/root]# rabbitmqctl change_password guest adminrabbitmq
Changing password for user "guest" ...
...done.
[1]+  Done                    rabbitmqctl change_password guest admin
[root@cont03:/root]# rabbitmqctl change_password guest adminrabbitmq
Changing password for user "guest" ...
...done.
[root@mon01:/root]# rabbitmqctl change_password guest adminrabbitmq
Changing password for user "guest" ...
...done.
[1]+  Done                    rabbitmqctl change_password guest admin

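4.9 On cont02, add a user named openstack with the password adminopenstack and grant it administrator rights (once the cluster is formed, this user can also log in on cont01 and cont03)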

[root@cont02:/root]# rabbitmqctl add_user openstack adminopenstack
Creating user "openstack" ...
...done.
[root@cont02:/root]# rabbitmqctl set_permissions openstack ".*" ".*" ".*"
Setting permissions for user "openstack" in vhost "/" ...
...done.
[root@cont02:/root]# rabbitmqctl set_user_tags openstack administrator
Setting tags for user "openstack" to [administrator] ...
...done.
[root@cont02:/root]# rabbitmqctl list_users
Listing users ...
guest	[administrator]
openstack	[administrator]
...done.

4.10 Send the .erlang.cookie from cont02 to the other nodes to set up the cluster

[root@cont02:/root]# rabbitmqctl status
Status of node rabbit@cont02 ...
[{pid,12512},
 {running_applications,
     [{rabbitmq_management,"RabbitMQ Management Console","3.3.5"},
      {rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.3.5"},
      {webmachine,"webmachine","1.10.3-rmq3.3.5-gite9359c7"},
      {mochiweb,"MochiMedia Web Server","2.7.0-rmq3.3.5-git680dba8"},
      {rabbitmq_management_agent,"RabbitMQ Management Agent","3.3.5"},
      {rabbit,"RabbitMQ","3.3.5"},
      {os_mon,"CPO  CXC 138 46","2.2.14"},
      {inets,"INETS  CXC 138 49","5.9.8"},
      {mnesia,"MNESIA  CXC 138 12","4.11"},
      {amqp_client,"RabbitMQ AMQP Client","3.3.5"},
      {xmerl,"XML parser","1.3.6"},
      {sasl,"SASL  CXC 138 11","2.3.4"},
      {stdlib,"ERTS  CXC 138 10","1.19.4"},
      {kernel,"ERTS  CXC 138 10","2.16.4"}]},
 {os,{unix,linux}},
 {erlang_version,
     "Erlang R16B03-1 (erts-5.10.4) [source] [64-bit] [smp:8:8] [async-threads:30] [hipe] [kernel-poll:true]\n"},
 {memory,
     [{total,41896520},
      {connection_procs,5600},
      {queue_procs,5600},
      {plugins,385080},
      {other_proc,13866040},
      {mnesia,60888},
      {mgmt_db,50256},
      {msg_index,34528},
      {other_ets,1114200},
      {binary,74848},
      {code,20229052},
      {atom,711569},
      {other_system,5358859}]},
 {alarms,[]},
 {listeners,[{clustering,25672,"::"},{amqp,5672,"192.168.10.22"}]},
 {vm_memory_high_watermark,0.4},
 {vm_memory_limit,8331886592},
 {disk_free_limit,50000000},
 {disk_free,178624393216},
 {file_descriptors,
     [{total_limit,3996},
      {total_used,3},
      {sockets_limit,3594},
      {sockets_used,1}]},
 {processes,[{limit,1048576},{used,190}]},
 {run_queue,0},
 {uptime,369}]
...done.
[root@cont02:/root]# scp /var/lib/rabbitmq/.erlang.cookie cont01:/var/lib/rabbitmq/ 
[root@cont02:/root]# scp /var/lib/rabbitmq/.erlang.cookie cont03:/var/lib/rabbitmq/ 
[root@cont02:/root]# scp /var/lib/rabbitmq/.erlang.cookie mon01:/var/lib/rabbitmq/

4.11 On cont01, cont03 and mon01, stop the application, join the cont02 node as RAM nodes, then start the application again (first check the cluster name on cont02)

[root@cont02:/root]# rabbitmqctl cluster_status
Cluster status of node rabbit@cont02 ...
[{nodes,[{disc,[rabbit@cont02]}]},
 {running_nodes,[rabbit@cont02]},
 {cluster_name,<>},
 {partitions,[]}]
...done.
[root@cont01:/root]# systemctl restart rabbitmq-server
[root@cont01:/root]# systemctl status rabbitmq-server
[root@cont01:/root]# rabbitmqctl stop_app
Stopping node rabbit@cont01 ...
...done.
[root@cont01:/root]# rabbitmqctl join_cluster --ram rabbit@cont02
Clustering node rabbit@cont01 with rabbit@cont02 ...
...done.
[root@cont01:/root]# rabbitmqctl start_app
Starting node rabbit@cont01 ...
...done.
[root@cont03:/root]# systemctl restart rabbitmq-server
[root@cont03:/root]# systemctl status rabbitmq-server
[root@cont03:/root]# rabbitmqctl stop_app
[root@cont03:/root]# rabbitmqctl join_cluster --ram rabbit@cont02
Clustering node rabbit@cont03 with rabbit@cont02 ...
...done.
[root@cont03:/root]# rabbitmqctl start_app
Starting node rabbit@cont03 ...
...done.
[root@mon01:/root]# systemctl restart rabbitmq-server
[root@mon01:/root]# systemctl status rabbitmq-server
[root@mon01:/root]# rabbitmqctl stop_app
Stopping node rabbit@mon01 ...
...done.
[root@mon01:/root]# rabbitmqctl join_cluster --ram rabbit@cont02
Clustering node rabbit@mon01 with rabbit@cont02 ...
...done.
[root@mon01:/root]# rabbitmqctl start_app
Starting node rabbit@mon01 ...
...done.
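
The three commands are identical on every joining node, so they can be driven from a single loop. The following is a minimal sketch, assuming the passwordless SSH set up in section 1.3; the function name join_rabbit_cluster and the RUNNER variable are hypothetical. RUNNER defaults to ssh and can be set to echo for a dry run that only prints what would be executed.

```shell
#!/usr/bin/env bash
# join_rabbit_cluster SEED NODE...
# Joins every NODE to the cluster of rabbit@SEED as a RAM node.
# RUNNER (default: ssh) is called as: RUNNER <node> "<command>".
join_rabbit_cluster() {
    local seed="$1"
    shift
    local runner="${RUNNER:-ssh}" node
    for node in "$@"; do
        # Stop at the first node on which any of the three steps fails.
        "$runner" "$node" "rabbitmqctl stop_app" &&
        "$runner" "$node" "rabbitmqctl join_cluster --ram rabbit@${seed}" &&
        "$runner" "$node" "rabbitmqctl start_app" || return 1
    done
}
```

Dry run from cont02: RUNNER=echo join_rabbit_cluster cont02 cont01 cont03 mon01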

4.12 Check the cluster status

[root@cont01:/root]# rabbitmqctl cluster_status
Cluster status of node rabbit@cont01 ...
[{nodes,[{disc,[rabbit@cont02]},
         {ram,[rabbit@mon01,rabbit@cont03,rabbit@cont01]}]},
 {running_nodes,[rabbit@mon01,rabbit@cont03,rabbit@cont02,rabbit@cont01]},
 {cluster_name,<>},
 {partitions,[]}]
...done.
[root@cont02:/root]# rabbitmqctl cluster_status
Cluster status of node rabbit@cont02 ...
[{nodes,[{disc,[rabbit@cont02]},
         {ram,[rabbit@mon01,rabbit@cont03,rabbit@cont01]}]},
 {running_nodes,[rabbit@mon01,rabbit@cont03,rabbit@cont01,rabbit@cont02]},
 {cluster_name,<>},
 {partitions,[]}]
...done.
[root@cont03:/root]# rabbitmqctl cluster_status
Cluster status of node rabbit@cont03 ...
[{nodes,[{disc,[rabbit@cont02]},
         {ram,[rabbit@mon01,rabbit@cont03,rabbit@cont01]}]},
 {running_nodes,[rabbit@mon01,rabbit@cont01,rabbit@cont02,rabbit@cont03]},
 {cluster_name,<>},
 {partitions,[]}]
...done.
[root@mon01:/root]# rabbitmqctl cluster_status
Cluster status of node rabbit@mon01 ...
[{nodes,[{disc,[rabbit@cont02]},
         {ram,[rabbit@mon01,rabbit@cont03,rabbit@cont01]}]},
 {running_nodes,[rabbit@cont01,rabbit@cont02,rabbit@cont03,rabbit@mon01]},
 {cluster_name,<>},
 {partitions,[]}]
...done.

4.13 To keep RabbitMQ messages consistent, set a queue mirroring policy (run on all nodes)

rabbitmqctl set_policy ha-all "^" '{"ha-mode":"all"}'

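Tuning RabbitMQ performance

By default, the quotas for socket descriptors and file descriptors in RabbitMQ are not large.

Changing the configuration

Add the following parameter to the service unit file, adjusting the value to your environment:

vim /usr/lib/systemd/system/rabbitmq-server.service

[Service]

LimitNOFILE = 16384

Restart RabbitMQ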

systemctl daemon-reload

systemctl restart rabbitmq-server
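
As an alternative to editing the unit file under /usr/lib/systemd/system, which a package update may overwrite, the same limit can be placed in a systemd drop-in file. This is a swapped-in technique, not the method used above. The sketch below is a minimal example: the function name write_nofile_dropin and the file name limits.conf are hypothetical, and the third argument exists only so the function can be tried against a scratch directory.

```shell
#!/usr/bin/env bash
# write_nofile_dropin UNIT LIMIT [ROOT]
# Writes ROOT/UNIT.d/limits.conf with a LimitNOFILE setting and prints its path.
# ROOT defaults to /etc/systemd/system, where systemd looks for drop-ins.
write_nofile_dropin() {
    local unit="$1" limit="$2" root="${3:-/etc/systemd/system}"
    local dir="$root/$unit.d"
    mkdir -p "$dir"
    printf '[Service]\nLimitNOFILE=%s\n' "$limit" > "$dir/limits.conf"
    echo "$dir/limits.conf"
}
```

Example: write_nofile_dropin rabbitmq-server.service 16384, followed by systemctl daemon-reload and systemctl restart rabbitmq-server as above.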

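// The following is quoted reference material (to be revised after the holiday):

When configuring an OpenStack service to use the RabbitMQ message queue, it can be configured as follows:

transport_url = rabbit://openstack:adminopenstack@cont01,openstack:adminopenstack@cont02,openstack:adminopenstack@cont03

As shown, the RabbitMQ services of all nodes are concatenated into one URL. When the message queue service on cont01 is unavailable, requests fail over to cont02, and if that also fails, to cont03.

The following parameters should also be set: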

rabbit_retry_interval=1

rabbit_retry_backoff=2

rabbit_max_retries=0

rabbit_durable_queues=true

rabbit_ha_queues=true
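
Because the transport_url value repeats the same credentials for every node, it is easy to mistype. The following minimal sketch assembles it from a node list in the same form as the example above (no port or virtual host). The function name build_transport_url is hypothetical, and the sketch assumes the password contains no characters that would need URL encoding.

```shell
#!/usr/bin/env bash
# build_transport_url USER PASSWORD HOST...
# Prints rabbit://USER:PASSWORD@HOST1,USER:PASSWORD@HOST2,...
build_transport_url() {
    local user="$1" password="$2"
    shift 2
    local url="rabbit://" sep="" host
    for host in "$@"; do
        url="${url}${sep}${user}:${password}@${host}"
        sep=","
    done
    echo "$url"
}
```

Example: build_transport_url openstack adminopenstack cont01 cont02 cont03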

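4.14 RabbitMQ notes (failure recovery tips):

① Keep at least one disc node in the cluster to prevent data loss; take particular care when changing node types.

② If the whole cluster has been stopped, make sure the last node that went down is started first; if that is not possible, use the forget_cluster_node command to remove it from the cluster.

③ If the nodes went down almost simultaneously in an uncontrolled way, use the force_boot command on one of the nodes to restart it.

④ If rabbitmq-server fails to start after joining the cluster, for example following an unexpected shutdown, try the following: the /var/lib/rabbitmq/mnesia directory contains rabbit@localhost.pid, rabbit@localhost and rabbit@localhost-plugins-expand; delete these three items, also delete .erlang.cookie and erl_crash.dump under /var/lib/rabbitmq/, then start the service with systemctl start rabbitmq-server.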

5. Deploy PCS

5.1 Install pcs + pacemaker + corosync (cont01, cont02 and cont03)

Install pcs, pacemaker and corosync on all controller nodes. pacemaker is the resource manager; corosync provides the heartbeat mechanism.

[root@cont01:/root]# yum install -y lvm2 cifs-utils quota psmisc pcs pacemaker corosync fence-agents-all resource-agents crmsh
[root@cont02:/root]# yum install -y lvm2 cifs-utils quota psmisc pcs pacemaker corosync fence-agents-all resource-agents crmsh
[root@cont03:/root]# yum install -y lvm2 cifs-utils quota psmisc pcs pacemaker corosync fence-agents-all resource-agents crmsh
[root@cont01:/root]# systemctl enable pcsd corosync
[root@cont02:/root]# systemctl enable pcsd corosync
[root@cont03:/root]# systemctl enable pcsd corosync
[root@cont01:/root]# systemctl start pcsd && systemctl status pcsd
[root@cont02:/root]# systemctl start pcsd && systemctl status pcsd
[root@cont03:/root]# systemctl start pcsd && systemctl status pcsd

5.2 Set the cluster password; it must be identical on all three nodes: pcs#typora

[root@cont01:/root]# echo "pcs#typora" |passwd --stdin hacluster
[root@cont02:/root]# echo "pcs#typora" |passwd --stdin hacluster
[root@cont03:/root]# echo "pcs#typora" |passwd --stdin hacluster

5.3 Create the corosync.conf configuration file on the controller nodes

[root@cont02:/root]# cat <<EOF >/etc/corosync/corosync.conf
totem { 
        version: 2
        secauth:off
        cluster_name:openstack-cluster
        transport:udpu
}
nodelist {
        node {
                ring0_addr:cont01 
                nodeid:1 
        }
        node {
                ring0_addr:cont02
                nodeid:2
        }
        node {
                ring0_addr:cont03
                nodeid:3
        }
}
logging {
        to_logfile: yes
        logfile: /var/log/cluster/corosync.log
        to_syslog: yes
        debug: off
}
quorum {
        provider: corosync_votequorum
}
EOF
[root@cont02:/root]# scp /etc/corosync/corosync.conf cont01:/etc/corosync/
[root@cont02:/root]# scp /etc/corosync/corosync.conf cont03:/etc/corosync/
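
If the node list changes, the nodelist section has to be rewritten with consecutive node IDs. The following minimal sketch generates a file with the same settings as above from a cluster name and a list of hosts. The function name gen_corosync_conf is hypothetical, and node IDs are simply assigned in argument order starting at 1.

```shell
#!/usr/bin/env bash
# gen_corosync_conf CLUSTER_NAME NODE...
# Prints a corosync.conf with one nodelist entry per NODE (nodeid 1, 2, ...).
gen_corosync_conf() {
    local cluster_name="$1"
    shift
    local nodeid=1 node
    cat <<EOF
totem {
        version: 2
        secauth: off
        cluster_name: ${cluster_name}
        transport: udpu
}
nodelist {
EOF
    for node in "$@"; do
        printf '        node {\n                ring0_addr: %s\n                nodeid: %d\n        }\n' "$node" "$nodeid"
        nodeid=$((nodeid + 1))
    done
    cat <<EOF
}
logging {
        to_logfile: yes
        logfile: /var/log/cluster/corosync.log
        to_syslog: yes
        debug: off
}
quorum {
        provider: corosync_votequorum
}
EOF
}
```

Example: gen_corosync_conf openstack-cluster cont01 cont02 cont03 > /etc/corosync/corosync.conf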

5.4 Configure the cluster: set up mutual authentication between the nodes

ssh-keygen
ssh-copy-id cont01
ssh-copy-id cont02
ssh-copy-id cont03

5.5 Authenticate the nodes

[root@cont02:/root]# pcs cluster auth cont01 cont02 cont03 -u hacluster -p"pcs#typora"
cont02: Authorized
cont03: Authorized
cont01: Authorized

pcs cluster auth cont01 cont02 -u hacluster -p {password}

{password} is the password that was just set.

5.6 Create the cluster

[root@cont02:/root]# pcs cluster setup --force  --name openstack-cluster cont01 cont02 cont03
Destroying cluster on nodes: cont01, cont02, cont03...
cont02: Stopping Cluster (pacemaker)...
cont03: Stopping Cluster (pacemaker)...
cont01: Stopping Cluster (pacemaker)...
cont01: Successfully destroyed cluster
cont02: Successfully destroyed cluster
cont03: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'cont01', 'cont02', 'cont03'
cont01: successful distribution of the file 'pacemaker_remote authkey'
cont03: successful distribution of the file 'pacemaker_remote authkey'
cont02: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
cont01: Succeeded
cont02: Succeeded
cont03: Succeeded
Synchronizing pcsd certificates on nodes cont01, cont02, cont03...
cont02: Success
cont03: Success
cont01: Success
Restarting pcsd on the nodes in order to reload the certificates...
cont02: Success
cont03: Success
cont01: Success

5.7 Start the cluster and check its status

[root@cont02:/root]# pcs cluster enable --all
cont01: Cluster Enabled
cont02: Cluster Enabled
cont03: Cluster Enabled
[root@cont02:/root]# pcs cluster start --all
cont01: Starting Cluster (corosync)...
cont02: Starting Cluster (corosync)...
cont03: Starting Cluster (corosync)...
cont01: Starting Cluster (pacemaker)...
cont03: Starting Cluster (pacemaker)...
cont02: Starting Cluster (pacemaker)...
[root@cont02:/root]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: cont03 (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with quorum
 Last updated: Wed Jan 15 21:44:10 2020
 Last change: Wed Jan 15 21:43:53 2020 by hacluster via crmd on cont03
 3 nodes configured
 0 resources configured
PCSD Status:
  cont02: Online
  cont03: Online
  cont01: Online
[root@cont02:/root]# ps aux | grep pacemaker
root     15586  0.0  0.0 132972  8700 ?        Ss   21:43   0:00 /usr/sbin/pacemakerd -f
haclust+ 15587  0.1  0.0 136244 14620 ?        Ss   21:43   0:00 /usr/libexec/pacemaker/cib
root     15588  0.0  0.0 136064  7664 ?        Ss   21:43   0:00 /usr/libexec/pacemaker/stonithd
root     15589  0.0  0.0  98836  4372 ?        Ss   21:43   0:00 /usr/libexec/pacemaker/lrmd
haclust+ 15590  0.0  0.0 128068  6620 ?        Ss   21:43   0:00 /usr/libexec/pacemaker/attrd
haclust+ 15591  0.0  0.0  80508  3500 ?        Ss   21:43   0:00 /usr/libexec/pacemaker/pengine
haclust+ 15592  0.0  0.0 140380  8260 ?        Ss   21:43   0:00 /usr/libexec/pacemaker/crmd
root     15632  0.0  0.0 112712   960 pts/0    S+   21:44   0:00 grep --color=auto pacemaker

5.8 Verify the corosync status

(Running crm_verify -L -V reports errors at this point)

[root@cont02:/root]# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
	id	= 192.168.10.22
	status	= ring 0 active with no faults
[root@cont02:/root]# corosync-cmapctl | grep members
runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.10.21) 
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.10.22) 
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined
runtime.totem.pg.mrp.srp.members.3.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.3.ip (str) = r(0) ip(192.168.10.23) 
runtime.totem.pg.mrp.srp.members.3.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.3.status (str) = joined
[root@cont02:/root]# pcs status corosync
Membership information
----------------------
    Nodeid      Votes Name
         1          1 cont01
         2          1 cont02 (local)
         3          1 cont03
[root@cont02:/root]# crm_verify -L -V
   error: unpack_resources:	Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources:	Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources:	NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
[root@cont02:/root]# pcs property set stonith-enabled=false
[root@cont02:/root]# pcs property set no-quorum-policy=ignore
[root@cont02:/root]# crm_verify -L -V 
[root@cont02:/root]# pcs property set pe-warn-series-max=1000 pe-input-series-max=1000 pe-error-series-max=1000
[root@cont02:/root]# pcs property set cluster-recheck-interval=1000

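Check the configuration (this reports errors at first)

crm_verify -L -V

Based on the result:

pcs property set stonith-enabled=false # disable STONITH

pcs property set no-quorum-policy=ignore # ignore loss of quorum

crm_verify -L -V # the errors are gone

crm is an interactive command-line tool that is convenient to use.

Set suitable limits for the history of processed inputs and of the errors and warnings generated by the policy engine:

pcs property set pe-warn-series-max=1000 pe-input-series-max=1000 pe-error-series-max=1000

Process cluster state on a time-driven basis:

pcs property set cluster-recheck-interval=1000

5.9 Create the cluster VIP (the planned VIP address is 192.168.10.20)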

[root@cont02:/root]# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.10.20 cidr_netmask=32 op monitor interval=30s

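6. Deploy HAProxy

6.1 Install HAProxy on all controller nodes and keep the configuration identical; tune the configuration file to your environment.

6.1.1 Install and configure haproxy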

[root@cont01:/root]# yum -y install haproxy
[root@cont02:/root]# yum -y install haproxy
[root@cont03:/root]# yum -y install haproxy
[root@cont02:/root]#  vim /etc/rsyslog.d/haproxy.conf
## Configure logging for HAProxy
$ModLoad imudp
$UDPServerRun 514
$template Haproxy,"%msg%n"
local3.info -/var/log/haproxy.log;Haproxy
local3.notice -/var/log/haproxy-status.log;Haproxy
local3.* ~
[root@cont02:/root]# scp /etc/rsyslog.d/haproxy.conf cont01:/etc/rsyslog.d/
[root@cont02:/root]# scp /etc/rsyslog.d/haproxy.conf cont03:/etc/rsyslog.d/
[root@cont01:/root]# systemctl enable haproxy.service
[root@cont01:/root]# systemctl restart rsyslog.service
[root@cont01:/root]# syste
                    
 
                
