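Environment
Platform: UOS (Phytium)
Version: 4.5.8
Bug/Vulnerability ID
3043
Symptoms
Affected package: hgdb-see-4.5.8-db43858.aarch64.rpm
Abnormal behavior: In an HAC cluster with one primary and two standbys, a primary/standby switchover was performed after hgproxy and the audit feature had been enabled. The switchover took nearly 5 minutes before the primary role moved to the standby node. The standby that was not involved in the switchover (the node that is neither the old primary nor the new primary) could not be brought up: it stayed in the restarting state for a long time and then reported start failed.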
[root@tianxingao ~]# hghactl switchover see_cluster
Current cluster topology
+ Cluster: see_cluster ----+---------+---------+----+-----------+
| Member | Host            | Role    | State   | TL | Lag in MB |
+--------+-----------------+---------+---------+----+-----------+
| hg_01  | x.x.66.135:5866 | Replica | running | 12 |         0 |
| hg_02  | x.x.66.135:5866 | Leader  | running | 12 |           |
| hg_03  | x.x.66.135:5866 | Replica | running | 12 |         0 |
+--------+-----------------+---------+---------+----+-----------+
Primary [hg_02]:
Candidate ['hg_01', 'hg_03'] []: hg_01
When should the switchover take place (e.g. 2023-09-26T18:24 ) [now]:
Are you sure you want to switchover cluster see_cluster, demoting current leader hg_02? [y/N]: y
Switchover failed, details: 503, Switchover status unknown
Error in the database log:
2023-09-26 20:29:32 CST,,,6512ceac,1a1f6,3,"",2023-09-26 20:29:32 CST,,0,PANIC,XX000,"cannot wait without a PGPROC structure".........""
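To confirm that a node is hitting this particular bug, the log can be searched for the PANIC message quoted above. The following is a minimal sketch only: the function name has_pgproc_panic is hypothetical, and the log file location is not stated in this article, so the caller has to supply the path.

```shell
#!/bin/sh
# Hypothetical helper (not part of hghactl or the database package):
# succeeds if the given log file contains a PANIC line with the message
# quoted in this article.
# Assumption: the caller passes the path to the database log file;
# no default log location is assumed here.
has_pgproc_panic() {
    logfile="$1"
    # First grep: fixed-string match on the error message.
    # Second grep: keep only lines that also carry the PANIC severity.
    grep -F 'cannot wait without a PGPROC structure' "$logfile" | grep -qF 'PANIC'
}
```

Usage, with a placeholder path: has_pgproc_panic /path/to/database.log && echo "bug 3043 signature found"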
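Trigger condition
The issue can be reproduced by performing a cluster primary switchover while the audit feature is enabled.
Solution
1. Temporary workaround: disable the audit feature. The cluster must be restarted after the audit parameter is changed.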
[root@tianxingao hghac]# psql highgo syssao
Password for user syssao:
highgo=> select set_audit_param('hg_audit','off');
set_audit_param
---------------------------------
set configuration successfully.
(1 row)
highgo=> \q
[root@tianxingao hghac]# hghactl restart see_cluster
2. Permanent fix
Package that fixes this issue: hgdb-see-4.5.8-6954d9f.aarch64.rpm
Reinstalling the database from this fixed package resolves the issue.
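Before and after the reinstall it can help to check which build a package name refers to. The sketch below only knows the two builds named in this article (db43858 as affected, 6954d9f as fixed) and reports every other build as unknown. The function name classify_build is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: classify a package name or file name using only
# the two builds mentioned in this article. Any other build is reported
# as "unknown" rather than guessed.
classify_build() {
    case "$1" in
        *hgdb-see-4.5.8-db43858*) echo "affected" ;;
        *hgdb-see-4.5.8-6954d9f*) echo "fixed" ;;
        *) echo "unknown" ;;
    esac
}
```

For example, classify_build hgdb-see-4.5.8-db43858.aarch64.rpm prints "affected". Feeding it the installed package name from rpm -qa would also work, assuming the installed name carries the same build suffix as the file name, which this article does not confirm.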