Prerequisites
- Install the JDK on all four machines
- Synchronize the clocks
[root@CentOS01 ~]# date
Thu Dec 12 04:26:56 CST 2019
[root@CentOS01 ~]# date -s "2019-12-30 9:30:59" # set the time on every VM
Reason: when the clocks are out of sync, the later coordinated work across the four machines will hit ping timeouts; a skew within 3 seconds is acceptable.
Tip: in VMware, View → Compose → Compose bar lets you send the same keystrokes to all open sessions at once.
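Instead of typing `date -s` by hand on every VM, a minimal sketch using NTP (assumes the VMs have network access and the ntpdate package is installed):

```bash
# run on each of the four machines
ntpdate pool.ntp.org   # one-shot sync against a public NTP pool
date                   # verify the new time
```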
- Check that every machine has its hostname set
[root@CentOS01 ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=CentOS01
- Check the hostname mappings
[root@CentOS01 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.198.101 CentOS01
192.168.198.102 CentOS02
192.168.198.103 CentOS03
192.168.198.104 CentOS04
All four machines have these mappings; each can ping every other by both IP and hostname.
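A quick connectivity check, a small sketch to run on each machine (hostnames taken from the /etc/hosts above):

```bash
for host in CentOS01 CentOS02 CentOS03 CentOS04; do
    # -c 1: send a single packet; -W 2: wait at most 2 seconds
    ping -c 1 -W 2 $host > /dev/null && echo "$host ok" || echo "$host FAILED"
done
```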
- Check the SELinux configuration file
[root@CentOS01 ~]# cat /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled # this must be set to disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
The key line: SELINUX=disabled
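Editing the file only takes effect after a reboot; the runtime state can be checked directly:

```bash
getenforce   # should print "Disabled"; if it prints Enforcing or Permissive, reboot after editing the file
```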
- Check the firewall and disable it
# check the firewall status
service iptables status
# stop the firewall
service iptables stop
# start the firewall
service iptables start
# restart the firewall
service iptables restart
# disable the firewall permanently (it will not start at boot)
chkconfig iptables off
# re-enable the firewall at boot (undoes the permanent off)
chkconfig iptables on
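To confirm the permanent setting took effect, a quick check (CentOS 6 style SysV services):

```bash
chkconfig --list iptables   # every runlevel should read "off" after "chkconfig iptables off"
```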
Install Hadoop in fully distributed mode
Passwordless SSH (key generation, done on each node)
Passwordless SSH (key distribution)
- Distribute the key
Whichever machine is the management node distributes its key: send the management node's key to the other machines.
Distribute the /root/.ssh/id_rsa.pub file using the scp command.
[root@CentOS01 .ssh]# scp id_rsa.pub CentOS02:`pwd`/CentOS01.pub
root@centos02's password:
id_rsa.pub 100% 603 0.6KB/s 00:00
[root@CentOS01 .ssh]#
scp id_rsa.pub CentOS02:`pwd`/CentOS01.pub
The scp command copies id_rsa.pub into CentOS02's `pwd` (current directory) as CentOS01.pub; the rename records that the key came from CentOS01.
- Append the key to authorized_keys (the authorized public keys)
[root@CentOS02 .ssh]# ll
total 8
-rw-r--r-- 1 root root 603 Dec 11 22:43 CentOS01.pub
-rw-r--r-- 1 root root 391 Dec 11 22:40 known_hosts
[root@CentOS02 .ssh]# cat CentOS01.pub >> authorized_keys
[root@CentOS02 .ssh]# ll
total 12
-rw-r--r-- 1 root root 603 Dec 11 22:47 authorized_keys
-rw-r--r-- 1 root root 603 Dec 11 22:43 CentOS01.pub
-rw-r--r-- 1 root root 391 Dec 11 22:40 known_hosts
[root@CentOS02 .ssh]#
CentOS01's key has been distributed to CentOS02 and appended to authorized_keys.
- Distribute to all servers
Distribute the management node's key to every machine and append it to authorized_keys on each, as in the loop sketched below.
From the management node, ssh to each host to test that login is now passwordless.
Note: each test switches your shell to the remote host; run exit to return to the management node.
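Putting the whole distribution together, a minimal sketch run on the management node (assumes password login still works for the initial copy and that /root/.ssh exists on each target):

```bash
# generate an RSA key pair if one does not exist yet (empty passphrase)
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

for host in CentOS02 CentOS03 CentOS04; do
    # copy the public key over, renamed so the source is recorded
    scp ~/.ssh/id_rsa.pub $host:/root/.ssh/CentOS01.pub
    # append it to the target's authorized_keys
    ssh $host 'cat /root/.ssh/CentOS01.pub >> /root/.ssh/authorized_keys'
done
```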
Configure environment variables on the management node
vi /etc/profile
unset i
unset -f pathmunge
# append the lines below
export JAVA_HOME=/usr/java/jdk1.8.0_212
export HADOOP_HOME=/opt/hadoop
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
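After saving, reload the profile in the current shell and sanity-check the variables (hadoop is not on the PATH until the reload):

```bash
source /etc/profile
echo $JAVA_HOME $HADOOP_HOME   # should print /usr/java/jdk1.8.0_212 /opt/hadoop
hadoop version                 # should print the Hadoop version banner
```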
Convert the management node from the pseudo-distributed to the fully distributed setup
- Back up the hadoop directory under /opt/hadoop/etc
Purpose: makes it easy to switch back to the pseudo-distributed setup later.
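A minimal sketch of the backup; the destination name hadoop-pseudo is just an illustrative choice:

```bash
cd /opt/hadoop/etc
cp -r hadoop hadoop-pseudo   # keep the pseudo-distributed config around for later
```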
Modify the Hadoop configuration files on the management node
- Files to modify and what they do

Configuration file | Purpose |
---|---|
hadoop-env.sh | Hadoop environment variables |
yarn-env.sh | YARN runtime environment variables |
core-site.xml | Hadoop core configuration; its values can be referenced from the other configuration files |
hdfs-site.xml | HDFS configuration; inherits from core-site.xml |
mapred-site.xml | MapReduce configuration; inherits from core-site.xml |
yarn-site.xml | YARN configuration; inherits from core-site.xml |
- Edit core-site.xml under /opt/hadoop/etc/hadoop
[root@CentOS01 hadoop]# vi core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://CentOS01:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/zcx/hadoop/full</value>
    </property>
</configuration>
- Edit hdfs-site.xml
[root@CentOS01 hadoop]# vi hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>CentOS02:50090</value>
    </property>
</configuration>
NN: NameNode
SNN: SecondaryNameNode
DN: DataNode
1. dfs.replication = 2 sets the replication factor; its useful maximum equals the number of worker nodes. 3 would also work, 2 just makes the behavior easier to observe.
2. dfs.namenode.secondary.http-address = CentOS02:50090 places the SecondaryNameNode on CentOS02.
- Edit slaves (the file that lists the worker nodes)
[root@CentOS01 hadoop]# vi slaves
CentOS02
CentOS03
CentOS04
Three worker nodes are configured. A mistake here is not fatal; the DataNode role can be moved to another machine later.
Modify the Hadoop environment scripts
- Edit hadoop-env.sh
(so every host resolves the Java path correctly: replace the relative path with an absolute one)
[root@CentOS01 hadoop]# vi hadoop-env.sh
Below the line "# The java implementation to use." set:
export JAVA_HOME=/usr/java/jdk1.8.0_212
(the Java installation path)
Save and exit.
- Edit mapred-env.sh (MapReduce, the compute framework)
[root@CentOS01 hadoop]# vi mapred-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_212
(the Java installation path)
Save and exit.
- Edit yarn-env.sh (YARN, the resource-management framework)
[root@CentOS01 hadoop]# vi yarn-env.sh
Under "# some Java parameters" set:
export JAVA_HOME=/usr/java/jdk1.8.0_212
(the Java installation path)
Save and exit.
What follows is what the instructor had us do in class...
- Edit mapred-site.xml (set YARN as the MapReduce runtime framework)
1. Make a copy of mapred-site.xml
mapred-site.xml.template is the template for mapred-site.xml:
cp mapred-site.xml.template mapred-site.xml
2. Edit it:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
- Edit yarn-site.xml to configure the YARN cluster
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <!-- the node the ResourceManager runs on -->
        <value>CentOS01</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Distribute the management node's Hadoop files
- Distribute Hadoop
[root@CentOS01 opt]# scp -r hadoop/ CentOS02:`pwd`
[root@CentOS01 opt]# scp -r hadoop/ CentOS03:`pwd`
[root@CentOS01 opt]# scp -r hadoop/ CentOS04:`pwd`
- Note the path: the commands run from /opt, so hadoop/ lands in /opt on every target machine
- Distribute the environment variables
[root@CentOS01 etc]# scp /etc/profile CentOS02:/etc/
profile 100% 1929 1.9KB/s 00:00
[root@CentOS01 etc]# scp /etc/profile CentOS03:/etc/
profile 100% 1929 1.9KB/s 00:00
[root@CentOS01 etc]# scp /etc/profile CentOS04:/etc/
profile 100% 1929 1.9KB/s 00:00
[root@CentOS01 etc]#
After distribution, reload the profile on each machine:
[root@CentOS04 opt]# . /etc/profile
Test on the other machines: type hd and press Tab; if it auto-completes to hdfs, the setup works.
Configuration complete; format the cluster
- Format
**On the management node**
[root@CentOS01 opt]# hdfs namenode -format
# the appearance of this line in the output:
19/12/12 14:21:34 INFO common.Storage: Storage directory /var/zcx/hadoop/full/dfs/name has been successfully formatted.
# indicates success
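As a further sanity check after formatting, the name directory under hadoop.tmp.dir (path taken from core-site.xml above) should now be populated:

```bash
ls /var/zcx/hadoop/full/dfs/name/current/
# expect a VERSION file and an initial fsimage file
```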
- Start
On the management node:
[root@CentOS01 etc]# start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [CentOS01]
CentOS01: starting namenode, logging to /opt/hadoop/logs/hadoop-root-namenode-CentOS01.out
CentOS04: starting datanode, logging to /opt/hadoop/logs/hadoop-root-datanode-CentOS04.out
CentOS03: starting datanode, logging to /opt/hadoop/logs/hadoop-root-datanode-CentOS03.out
CentOS02: starting datanode, logging to /opt/hadoop/logs/hadoop-root-datanode-CentOS02.out
Starting secondary namenodes [CentOS02]
CentOS02: starting secondarynamenode, logging to /opt/hadoop/logs/hadoop-root-secondarynamenode-CentOS02.out
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/logs/yarn-root-resourcemanager-CentOS01.out
CentOS04: starting nodemanager, logging to /opt/hadoop/logs/yarn-root-nodemanager-CentOS04.out
CentOS02: starting nodemanager, logging to /opt/hadoop/logs/yarn-root-nodemanager-CentOS02.out
CentOS03: starting nodemanager, logging to /opt/hadoop/logs/yarn-root-nodemanager-CentOS03.out
The secondarynamenode starts on CentOS02, as configured.
The startup messages show whether the namenode, secondarynamenode, and datanodes all came up.
Alternatively,
on the master node:
start-dfs.sh    # start all HDFS daemons
start-yarn.sh   # start the YARN daemons
The effect is the same; this pair is the recommended way, since start-all.sh is deprecated.
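For completeness, the matching shutdown scripts live in the same sbin directory:

```bash
stop-yarn.sh   # stop the ResourceManager and NodeManagers
stop-dfs.sh    # stop the NameNode, SecondaryNameNode, and DataNodes
```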
- Result
[root@CentOS01 etc]# jps
3190 Jps
2924 ResourceManager
2670 NameNode
[root@CentOS02 etc]# jps
1168 DataNode
1442 Jps
1218 SecondaryNameNode
1305 NodeManager
[root@CentOS03 var]# jps
1351 Jps
1162 DataNode
1247 NodeManager
[root@CentOS04 var]# jps
1382 Jps
1193 DataNode
1278 NodeManager
Note which jps processes run on which host.
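Beyond jps, the cluster can be checked from the management node; the web UI ports below are the Hadoop 2.x defaults:

```bash
hdfs dfsadmin -report   # should report 3 live DataNodes
# Web UIs (Hadoop 2.x default ports):
#   NameNode:        http://CentOS01:50070
#   ResourceManager: http://CentOS01:8088
```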
Troubleshooting
If a daemon on some server failed to start, check that server's logs.
Log directory:
/opt/hadoop/logs
[root@CentOS01 hadoop]# cd logs/
[root@CentOS01 logs]# ll
total 1136
-rw-r--r-- 1 root root 148316 Dec 12 13:43 hadoop-root-datanode-CentOS01.log
-rw-r--r-- 1 root root 715 Dec 12 13:42 hadoop-root-datanode-CentOS01.out
-rw-r--r-- 1 root root 715 Dec 12 03:57 hadoop-root-datanode-CentOS01.out.1
-rw-r--r-- 1 root root 715 Dec 12 03:44 hadoop-root-datanode-CentOS01.out.2
-rw-r--r-- 1 root root 715 Dec 12 03:29 hadoop-root-datanode-CentOS01.out.3
-rw-r--r-- 1 root root 715 Dec 12 01:48 hadoop-root-datanode-CentOS01.out.4
-rw-r--r-- 1 root root 715 Dec 12 00:42 hadoop-root-datanode-CentOS01.out.5
-rw-r--r-- 1 root root 280372 Dec 12 14:38 hadoop-root-namenode-CentOS01.log
-rw-r--r-- 1 root root 4908 Dec 12 14:37 hadoop-root-namenode-CentOS01.out
-rw-r--r-- 1 root root 715 Dec 12 13:42 hadoop-root-namenode-CentOS01.out.1
-rw-r--r-- 1 root root 4908 Dec 12 03:58 hadoop-root-namenode-CentOS01.out.2
-rw-r--r-- 1 root root 4908 Dec 12 03:46 hadoop-root-namenode-CentOS01.out.3
-rw-r--r-- 1 root root 715 Dec 12 03:29 hadoop-root-namenode-CentOS01.out.4
-rw-r--r-- 1 root root 715 Dec 12 01:48 hadoop-root-namenode-CentOS01.out.5
-rw-r--r-- 1 root root 174859 Dec 12 13:43 hadoop-root-secondarynamenode-CentOS01.log
-rw-r--r-- 1 root root 715 Dec 12 13:42 hadoop-root-secondarynamenode-CentOS01.out
-rw-r--r-- 1 root root 25220 Dec 12 04:23 hadoop-root-secondarynamenode-CentOS01.out.1
-rw-r--r-- 1 root root 2904 Dec 12 03:48 hadoop-root-secondarynamenode-CentOS01.out.2
-rw-r--r-- 1 root root 10648 Dec 12 03:40 hadoop-root-secondarynamenode-CentOS01.out.3
-rw-r--r-- 1 root root 715 Dec 12 01:48 hadoop-root-secondarynamenode-CentOS01.out.4
-rw-r--r-- 1 root root 715 Dec 12 00:42 hadoop-root-secondarynamenode-CentOS01.out.5
-rw-r--r-- 1 root root 0 Dec 12 00:42 SecurityAuth-root.audit
drwxr-xr-x 2 root root 4096 Dec 12 13:43 userlogs
-rw-r--r-- 1 root root 141929 Dec 12 13:44 yarn-root-nodemanager-CentOS01.log
-rw-r--r-- 1 root root 701 Dec 12 13:43 yarn-root-nodemanager-CentOS01.out
-rw-r--r-- 1 root root 701 Dec 12 03:57 yarn-root-nodemanager-CentOS01.out.1
-rw-r--r-- 1 root root 701 Dec 12 03:45 yarn-root-nodemanager-CentOS01.out.2
-rw-r--r-- 1 root root 701 Dec 12 03:29 yarn-root-nodemanager-CentOS01.out.3
-rw-r--r-- 1 root root 701 Dec 12 01:48 yarn-root-nodemanager-CentOS01.out.4
-rw-r--r-- 1 root root 216428 Dec 12 14:35 yarn-root-resourcemanager-CentOS01.log
-rw-r--r-- 1 root root 701 Dec 12 14:25 yarn-root-resourcemanager-CentOS01.out
-rw-r--r-- 1 root root 701 Dec 12 13:43 yarn-root-resourcemanager-CentOS01.out.1
-rw-r--r-- 1 root root 701 Dec 12 03:57 yarn-root-resourcemanager-CentOS01.out.2
-rw-r--r-- 1 root root 701 Dec 12 03:45 yarn-root-resourcemanager-CentOS01.out.3
-rw-r--r-- 1 root root 701 Dec 12 03:29 yarn-root-resourcemanager-CentOS01.out.4
-rw-r--r-- 1 root root 701 Dec 12 01:48 yarn-root-resourcemanager-CentOS01.out.5
The logs to read are the files ending in .log, for example:
-rw-r--r-- 1 root root 280372 Dec 12 14:38 hadoop-root-namenode-CentOS01.log
-rw-r--r-- 1 root root 148316 Dec 12 13:43 hadoop-root-datanode-CentOS01.log
[root@CentOS01 logs]# tail -100 hadoop-root-datanode-CentOS01.log
View the last 100 lines; if something went wrong, an error line will appear there.
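To jump straight to the problems instead of paging through the whole file, a small sketch:

```bash
# case-insensitive search for errors and exceptions, newest entries last
grep -iE 'error|exception' hadoop-root-datanode-CentOS01.log | tail -20
```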