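Preface:
I ran into quite a few problems while building a Zookeeper 3.5 cluster with Docker, so I am recording them here for future reference.
This article builds the cluster with Zookeeper 3.5. Standalone Zookeeper is simple enough that you can just follow the example on Docker Hub, so it is not covered here. For building a Zookeeper 3.4 cluster, see this article: https://my.oschina.net/dslcode/blog/1944775
I. System environment:
Create 3 virtual machines in VMware 12 and install Docker on each.
- Virtual machines: CentOS 7.0+
- Docker: 1.13.1
- Zookeeper: 3.5
IPs of the three CentOS 7 virtual machines: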
192.168.121.66
192.168.121.67
192.168.121.70
II. Preparation
(1) Disable SELinux
# Temporarily disable SELinux (switches to permissive mode until the next reboot)
[root@localhost ~]# setenforce 0
# Check the current SELinux status
[root@localhost ~]# getenforce
Permissive
# Permanently disable SELinux
[root@localhost ~]# vi /etc/sysconfig/selinux
# Set SELINUX to disabled; takes effect after a reboot
SELINUX=disabled
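If you prefer not to open an editor on each of the three machines, the same change can be made with sed. This is only a sketch: the function name is made up for illustration, and it assumes GNU sed and the standard CentOS 7 layout in which /etc/sysconfig/selinux is a symlink to /etc/selinux/config (the function edits the real file so the symlink is not replaced).

```shell
#!/usr/bin/env bash
# Sketch: set SELINUX=disabled in the SELinux config file non-interactively.
# disable_selinux_in_config is a hypothetical helper name, not a system tool.
disable_selinux_in_config() {
  local cfg="${1:-/etc/selinux/config}"
  # Only the SELINUX= line is rewritten; SELINUXTYPE= and comment lines do not match.
  sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}

# Usage (as root): disable_selinux_in_config && grep '^SELINUX=' /etc/selinux/config
```

As with the manual edit, the new value only takes effect after a reboot.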
(2) Disable the firewall
# Check the firewall status
[root@localhost ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
# Stop the firewall for the current boot
[root@localhost ~]# systemctl stop firewalld
# Keep the firewall from starting at boot
[root@localhost ~]# systemctl disable firewalld
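If turning the firewall off entirely is not acceptable in your environment, opening only the three ports the cluster uses (2181, 2888, 3888) may be enough. I did not test this in the setup described here, so treat it as an untested sketch. It assumes firewalld is running; FIREWALL_CMD is a hypothetical override variable that exists only so the function can be dry-run with echo.

```shell
#!/usr/bin/env bash
# Sketch: open only the zookeeper ports instead of disabling firewalld.
# FIREWALL_CMD is an illustrative override; by default the real firewall-cmd is used.
FIREWALL_CMD="${FIREWALL_CMD:-firewall-cmd}"

open_zk_ports() {
  local port
  # 2181 = client port, 2888 = quorum port, 3888 = leader election port
  for port in 2181 2888 3888; do
    "$FIREWALL_CMD" --permanent --add-port="${port}/tcp" || return 1
  done
  # Permanent rules only become active after a reload.
  "$FIREWALL_CMD" --reload
}

# Dry run:  FIREWALL_CMD=echo; open_zk_ports
# Real run (as root, firewalld running):  open_zk_ports
```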
III. Installation
(1) Install Docker (optionally configure a registry mirror)
# Install Docker on CentOS 7
[root@localhost ~]# yum -y install docker
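(2) Download and run zookeeper 3.5 (docker run pulls the image automatically)
# Notes
# -p maps a host port to a container port (ignored when --net=host is used, as it is below)
# -v mounts a host directory into the container (host path:container path)
# -e sets an environment variable in the container
# --restart=always restarts the container automatically, including each time the docker service starts
# --name sets the container name
# --net=host uses the host network mode
# Run on 192.168.121.66: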
docker run -d -p 2181:2181 -p 2888:2888 -p 3888:3888 --restart=always --name=zk-master01 --net=host \
-v /opt/zookeeper/logs:/opt/zookeeper/logs \
-v /opt/zookeeper/data:/data \
-v /opt/zookeeper/datalog:/datalog \
-e "ZOO_MY_ID=1" \
-e "ZOO_SERVERS=server.1=0.0.0.0:2888:3888;2181 server.2=192.168.121.67:2888:3888;2181 server.3=192.168.121.70:2888:3888;2181" \
zookeeper:3.5
# Run on 192.168.121.67:
docker run -d -p 2181:2181 -p 2888:2888 -p 3888:3888 --restart=always --name=zk-master02 --net=host \
-v /opt/zookeeper/logs:/opt/zookeeper/logs \
-v /opt/zookeeper/data:/data \
-v /opt/zookeeper/datalog:/datalog \
-e "ZOO_MY_ID=2" \
-e "ZOO_SERVERS=server.1=192.168.121.66:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=192.168.121.70:2888:3888;2181" \
zookeeper:3.5
# Run on 192.168.121.70:
docker run -d -p 2181:2181 -p 2888:2888 -p 3888:3888 --restart=always --name=zk-master03 --net=host \
-v /opt/zookeeper/logs:/opt/zookeeper/logs \
-v /opt/zookeeper/data:/data \
-v /opt/zookeeper/datalog:/datalog \
-e "ZOO_MY_ID=3" \
-e "ZOO_SERVERS=server.1=192.168.121.66:2888:3888;2181 server.2=192.168.121.67:2888:3888;2181 server.3=0.0.0.0:2888:3888;2181" \
zookeeper:3.5
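The three commands above differ only in ZOO_MY_ID, the container name, and which entry of ZOO_SERVERS is written as 0.0.0.0. Since that string is easy to get wrong by hand, here is a small sketch that builds it. The function name build_zoo_servers is made up for illustration; the ports and the ";2181" suffix are copied from the commands above.

```shell
#!/usr/bin/env bash
# Sketch: build the ZOO_SERVERS value for one node.
#   $1      = this node's id (1-based, same value as ZOO_MY_ID)
#   $2...   = IPs of all nodes, in id order
# build_zoo_servers is a hypothetical helper, not part of the zookeeper image.
build_zoo_servers() {
  local my_id="$1"; shift
  local id=1 out="" ip
  for ip in "$@"; do
    # The local node is written as 0.0.0.0, as in the commands above.
    if [ "$id" -eq "$my_id" ]; then ip="0.0.0.0"; fi
    out="${out:+$out }server.${id}=${ip}:2888:3888;2181"
    id=$((id + 1))
  done
  printf '%s\n' "$out"
}

# Usage on 192.168.121.67 (id 2):
#   -e "ZOO_SERVERS=$(build_zoo_servers 2 192.168.121.66 192.168.121.67 192.168.121.70)"
```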
(3) Check that the container started successfully
# List all running containers
[root@localhost ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
09807cf4c27a zookeeper:3.5 "/docker-entrypoin..." 41 seconds ago Up 39 seconds zk-master01
# View the container logs
[root@localhost ~]# docker logs [container ID]
ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
2020-01-07 13:39:55,407 [myid:] - INFO [main:QuorumPeerConfig@133] - Reading configuration from: /conf/zoo.cfg
2020-01-07 13:39:55,418 [myid:] - INFO [main:QuorumPeerConfig@375] - clientPort is not set
2020-01-07 13:39:55,419 [myid:] - INFO [main:QuorumPeerConfig@389] - secureClientPort is not set
2020-01-07 13:39:55,532 [myid:1] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2020-01-07 13:39:55,533 [myid:1] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
....
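The errors below are normal during startup: the three machines do not start at the same time,
so for a while each node cannot reach the other two.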
2020-01-07 13:39:56,799 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):FastLeaderElection@885] - New election. My id = 1, proposed zxid=0x600000048
2020-01-07 13:39:56,809 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@679] - Cannot open channel to 2 at election address /192.168.121.67:3888
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:650)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:707)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:620)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:477)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:456)
at java.lang.Thread.run(Thread.java:748)
2020-01-07 13:39:56,814 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@679] - Cannot open channel to 3 at election address /192.168.121.70:3888
java.net.ConnectException: Connection refused (Connection refused)
...
(a large part of the log is omitted here)
2020-01-07 13:40:06,966 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Follower@69] - FOLLOWING - LEADER ELECTION TOOK - 30 MS
2020-01-07 13:40:07,157 [myid:1] - WARN [NIOWorkerThread-2:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2020-01-07 13:40:07,164 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Learner@391] - Getting a diff from the leader 0x60000004c
2020-01-07 13:40:07,166 [myid:1] - WARN [QuorumPeer[myid=1](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Learner@454] - Got zxid 0x600000049 expected 0x1
2020-01-07 13:40:07,166 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Learner@546] - Learner received NEWLEADER message
2020-01-07 13:40:07,472 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Learner@529] - Learner received UPTODATE message
2020-01-07 13:40:07,477 [myid:1] - INFO [QuorumPeer[myid=1](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):CommitProcessor@256] - Configuring CommitProcessor with 1 worker threads.
2020-01-07 13:40:07,503 [myid:1] - INFO [SyncThread:1:FileTxnLog@216] - Creating new log file: log.600000049
2020-01-07 13:40:10,502 [myid:1] - WARN [QuorumPeer[myid=1](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Follower@125] - Got zxid 0x700000001 expected 0x1
2020-01-07 13:40:14,245 [myid:1] - INFO [/0.0.0.0:3888:QuorumCnxManager$Listener@918] - Received connection request 192.168.121.70:56340
2020-01-07 13:40:14,250 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x600000050 (n.zxid), 0x1 (n.round), LOOKING (n.state), 3 (n.sid), 0x6 (n.peerEPoch), FOLLOWING (my state)0 (n.config version)
# Enter the zk container and check whether the server is running
[root@localhost ~]# docker exec -it [container ID] bash
root@localhost:/apache-zookeeper-3.5.6-bin# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: leader (this machine was elected leader; the other two are followers)
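Entering each container just to read the Mode line gets tedious with three nodes. As far as I know, ZooKeeper 3.5 allows the srvr four-letter command by default, and its reply contains the same Mode line, so the role can also be read from outside the container. The sketch below assumes nc is installed on the machine you run it from and that port 2181 is reachable; both function names are made up for illustration, and some nc variants may need an extra timeout option to exit.

```shell
#!/usr/bin/env bash
# Sketch: read a node's role (leader / follower) from the srvr reply.
# parse_zk_mode and zk_mode are hypothetical helper names.

# Extract the value of the "Mode:" line from srvr output given on stdin.
parse_zk_mode() {
  sed -n 's/^Mode: *//p'
}

# Ask one node for its role; $1 is the node's IP or hostname.
zk_mode() {
  echo srvr | nc "$1" 2181 | parse_zk_mode
}

# Usage:
#   for h in 192.168.121.66 192.168.121.67 192.168.121.70; do
#     echo "$h: $(zk_mode "$h")"
#   done
```

A healthy three-node cluster should report exactly one leader and two followers.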
(4) Test the connection with a zookeeper client tool (the data shown in the zk nodes was added afterwards)
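IV. Notes on the setup and a summary of the problems encountered
(1) When docker run is used to install zookeeper 3.5, docker first looks for the image locally; if it is not there, docker pulls it from the remote registry and then runs it.
(2) The ZOO_MY_ID variable is the ID of each zookeeper host. It must be set and must be unique within the cluster.
(3) When setting up an environment with Docker, I strongly recommend searching Docker Hub for the image first and following the official example.
(4) Zookeeper 3.5 and Zookeeper 3.4 are set up differently. In Zookeeper 3.5, each entry of the ZOO_SERVERS parameter has an extra ";2181" after the IP address and ports. I missed this at first, and the installation kept failing.
(5) When something goes wrong during installation, check the logs first with docker logs [container ID]. If that does not resolve it, go into the container (docker exec -it [container ID] bash) and run bin/zkServer.sh status to see whether the service started.
(6) In ZOO_SERVERS, write the local machine's own IP as 0.0.0.0.
(7) SELinux and the firewall must be disabled. If you only disabled them temporarily, disable them again after the virtual machine reboots, and it is best to restart the docker service each time (systemctl restart docker).
(8) ZooInspector can only connect after the service has started. If the service is restarted, ZooInspector (the client) must also be restarted before reconnecting.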
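(9) Mounting a single file needs care. If you pass a file path such as -v /opt/zookeeper/conf/zoo.cfg:/conf/zoo.cfg and that file does not already exist on the host, docker creates a directory at that path and mounts the directory, and the container keeps failing with /docker-entrypoint.sh: line 15: /conf/zoo.cfg: Is a directory. Mounting a directory avoids the problem; mounting a file works only if the host file exists before the container starts. For the fix, see: https://www.cnblogs.com/leon-ytparty/p/10824741.html.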
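For note (9), a simple guard before docker run catches the problem early. This is a minimal sketch; require_host_file is a made-up helper name, and the docker run arguments in the usage comment are left elided on purpose.

```shell
#!/usr/bin/env bash
# Sketch: refuse to continue unless the host path is an existing regular file,
# so docker does not silently create a directory in its place.
# require_host_file is a hypothetical helper name.
require_host_file() {
  if [ -f "$1" ]; then
    return 0
  fi
  echo "refusing to mount: $1 is missing or not a regular file" >&2
  return 1
}

# Usage:
#   require_host_file /opt/zookeeper/conf/zoo.cfg \
#     && docker run ... -v /opt/zookeeper/conf/zoo.cfg:/conf/zoo.cfg ...
```

If an earlier failed run already created /opt/zookeeper/conf/zoo.cfg as a directory on the host, remove that directory and put the real file there before trying again.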
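Summary: I had very little experience building clusters myself and ran into many problems this time. Looking back after finishing, the lessons for similar work are these. First, read the official documentation (Docker Hub or the software's own site) before starting. Second, when an error occurs, check the logs first, and if that is not enough, go into the container to investigate. Finally, stay calm, reason carefully, and narrow the problem down step by step by elimination.
If you hit new problems during setup, or find mistakes in this article, please leave a comment.
Postscript:
To have the nodes communicate by hostname, first add the mappings to /etc/hosts on each machine, then replace the IPs in the ZOO_SERVERS parameter with the corresponding hostnames.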
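As an illustration of the hostname mapping, the sketch below prints the three /etc/hosts entries for the machines used in this article. The hostnames zk-node1, zk-node2 and zk-node3 are hypothetical; use whatever names your machines actually have.

```shell
#!/usr/bin/env bash
# Sketch: print /etc/hosts entries for the three cluster machines.
# The hostnames are made up for illustration; the IPs are the ones used above.
print_zk_hosts() {
  cat <<'EOF'
192.168.121.66 zk-node1
192.168.121.67 zk-node2
192.168.121.70 zk-node3
EOF
}

# Usage (as root, on each machine): print_zk_hosts >> /etc/hosts
# A ZOO_SERVERS entry would then look like: server.2=zk-node2:2888:3888;2181
```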
The next post will cover building a Mesos + Zookeeper + Chronos cluster with Docker.
References:
DockerHub: https://hub.docker.com/_/zookeeper
docker zookeeper cluster setup: https://my.oschina.net/dslcode/blog/1944775
[Exception notes] zookeeper cluster startup exception: Cannot open channel to 2 at election address ……: https://www.cnblogs.com/tocode/p/10693715.html
zookeeper cluster startup error: Cannot open channel to * at election address /ip:3888: https://www.cnblogs.com/cirrus2011/p/11296648.html
Zookeeper viewer ZooInspector: https://blog.csdn.net/uisoul/article/details/78226324
How to temporarily and permanently disable SELinux on CentOS 7: https://jingyan.baidu.com/article/7e4409537177d32fc0e2efe9.html
Error when mounting a file with docker: https://www.cnblogs.com/leon-ytparty/p/10824741.html