YARN Deployment
Core configuration files
Per the official docs (http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html),
two files need to be configured: mapred-site.xml and yarn-site.xml.
1. mapred-site.xml
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
2. yarn-site.xml
<configuration>
    <!-- Hostname of the ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop1</value>
    </property>
    <!-- Auxiliary service that lets reducers fetch map output -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Startup
The YARN web UI is served by the ResourceManager on port 8088 (for this cluster, http://hadoop1:8088/).
YARN can be started in either of two ways:
1. With the cluster scripts (start-all.sh brings up HDFS and YARN together; start-yarn.sh brings up YARN alone):
$ sbin/start-yarn.sh
2. By starting the daemons individually:
$ sbin/yarn-daemon.sh start resourcemanager
$ sbin/yarn-daemon.sh start nodemanager
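As a quick sanity check (not part of the original walkthrough; the PIDs below are illustrative), jps should now list the two YARN daemons:
[hadoop@hadoop1 ~]$ jps
2481 ResourceManager
2769 NodeManager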
Examples
Command-line help:
[hadoop@hadoop1 yarn]$ yarn
Usage: yarn [--config confdir] [COMMAND | CLASSNAME]
jar <jar> run a jar file
Using the examples bundled with the distribution:
[hadoop@hadoop1 mapreduce]$ pwd
/home/hadoop/hadoop-current/share/hadoop/mapreduce
[hadoop@hadoop1 mapreduce]$ yarn hadoop-mapreduce-examples-2.7.6.jar
Error: Could not find or load main class hadoop-mapreduce-examples-2.7.6.jar
[hadoop@hadoop1 mapreduce]$ yarn jar hadoop-mapreduce-examples-2.7.6.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
As the usage message shows, the examples can estimate pi, run wordcount, and so on.
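For instance, the pi estimator takes the number of map tasks and the number of samples per map; small values like these (chosen here just for a quick smoke test) finish in seconds:
[hadoop@hadoop1 mapreduce]$ yarn jar hadoop-mapreduce-examples-2.7.6.jar pi 2 100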
1. wordcount example
Create a file named wordfile.txt:
hi,didi
hi,wanghongbing
hello,didi
welcome to didi
Upload it to HDFS:
[hadoop@hadoop1 file]$ hadoop fs -mkdir /whb
[hadoop@hadoop1 file]$ hadoop fs -put wordfile.txt /whb/
[hadoop@hadoop1 file]$ hadoop fs -ls /whb/
Found 1 items
-rw-r--r-- 2 hadoop supergroup 51 2019-05-04 14:54 /whb/wordfile.txt
Run wordcount on the file:
[hadoop@hadoop1 mapreduce]$ yarn jar hadoop-mapreduce-examples-2.7.6.jar wordcount hdfs:///whb/wordfile.txt /output/wordcount/
Note: the command above uses an hdfs:// URI for the input; the scheme is optional, since scheme-less paths resolve against fs.defaultFS, so /whb/wordfile.txt works as well (a file:// URI can also point at local input, provided the file is reachable from the task nodes). The result is written under /output/wordcount/ on HDFS.
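The equivalent scheme-less form of the same command:
[hadoop@hadoop1 mapreduce]$ yarn jar hadoop-mapreduce-examples-2.7.6.jar wordcount /whb/wordfile.txt /output/wordcount/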
Run log:
[hadoop@hadoop1 mapreduce]$ yarn jar hadoop-mapreduce-examples-2.7.6.jar wordcount hdfs:///whb/wordfile.txt /output/wordcount/
19/05/04 14:59:34 INFO client.RMProxy: Connecting to ResourceManager at hadoop1/10.179.25.59:8032
19/05/04 14:59:35 INFO input.FileInputFormat: Total input paths to process : 1
19/05/04 14:59:35 INFO mapreduce.JobSubmitter: number of splits:1
19/05/04 14:59:35 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1556896316580_0001
19/05/04 14:59:36 INFO impl.YarnClientImpl: Submitted application application_1556896316580_0001
19/05/04 14:59:36 INFO mapreduce.Job: The url to track the job: http://hadoop1:8088/proxy/application_1556896316580_0001/
19/05/04 14:59:36 INFO mapreduce.Job: Running job: job_1556896316580_0001
19/05/04 14:59:44 INFO mapreduce.Job: Job job_1556896316580_0001 running in uber mode : false
19/05/04 14:59:44 INFO mapreduce.Job: map 0% reduce 0%
19/05/04 14:59:48 INFO mapreduce.Job: map 100% reduce 0%
19/05/04 14:59:55 INFO mapreduce.Job: map 100% reduce 100%
19/05/04 14:59:55 INFO mapreduce.Job: Job job_1556896316580_0001 completed successfully
19/05/04 14:59:55 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=93
FILE: Number of bytes written=245973
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=152
HDFS: Number of bytes written=63
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=2705
Total time spent by all reduces in occupied slots (ms)=3903
Total time spent by all map tasks (ms)=2705
Total time spent by all reduce tasks (ms)=3903
Total vcore-milliseconds taken by all map tasks=2705
Total vcore-milliseconds taken by all reduce tasks=3903
Total megabyte-milliseconds taken by all map tasks=2769920
Total megabyte-milliseconds taken by all reduce tasks=3996672
Map-Reduce Framework
Map input records=4
Map output records=6
Map output bytes=75
Map output materialized bytes=93
Input split bytes=101
Combine input records=6
Combine output records=6
Reduce input groups=6
Reduce shuffle bytes=93
Reduce input records=6
Reduce output records=6
Spilled Records=12
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=369
CPU time spent (ms)=1290
Physical memory (bytes) snapshot=467714048
Virtual memory (bytes) snapshot=4585644032
Total committed heap usage (bytes)=366477312
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=51
File Output Format Counters
Bytes Written=63
Analysis of the counters above:
Map input records=4
Map output records=6
TextInputFormat hands the mapper one record per line, so there are 4 input records; the wordcount mapper then emits one record per whitespace-separated token, giving 6 output records.
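Concretely, the 4 lines tokenize into 6 (word, 1) pairs:
hi,didi          -> (hi,didi, 1)
hi,wanghongbing  -> (hi,wanghongbing, 1)
hello,didi       -> (hello,didi, 1)
welcome to didi  -> (welcome, 1), (to, 1), (didi, 1)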
The job shows up on the YARN web UI, and the output appears in HDFS.
View the result:
[hadoop@hadoop1 mapreduce]$ hadoop fs -text /output/wordcount/part-r-00000
didi 1
hello,didi 1
hi,didi 1
hi,wanghongbing 1
to 1
welcome 1
The counts above were tokenized on whitespace only, which is why hi,didi is treated as a single word.
_SUCCESS is an empty marker file indicating that the job completed:
[hadoop@hadoop1 mapreduce]$ hadoop fs -text /output/wordcount/_SUCCESS
[hadoop@hadoop1 mapreduce]$
The YARN UI lists the job as Application application_1556896316580_0001: the long middle number is the cluster timestamp (taken when the ResourceManager started), and the trailing 0001 is the application sequence number.
2. Example 2: troubleshooting
Submit the exact same command again:
yarn jar hadoop-mapreduce-examples-2.7.6.jar wordcount hdfs:///whb/wordfile.txt /output/wordcount/
It fails because the output path already exists:
[hadoop@hadoop1 mapreduce]$ yarn jar hadoop-mapreduce-examples-2.7.6.jar wordcount hdfs:///whb/wordfile.txt /output/wordcount/
19/05/04 22:47:18 INFO client.RMProxy: Connecting to ResourceManager at hadoop1/10.179.25.59:8032
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://hadoop1:9000/output/wordcount already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
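MapReduce refuses to overwrite an existing output directory, so the usual fix (not shown in the original run) is to remove it before resubmitting:
[hadoop@hadoop1 mapreduce]$ hadoop fs -rm -r /output/wordcount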
For other failures, the .log files under logs/ are generally the place to look.
While YARN is running a job, besides the ResourceManager and NodeManager daemons there are also job-related JVMs, such as RunJar (the client process that submitted the jar).
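As an illustration (PIDs hypothetical), a jps snapshot taken mid-job on a setup like this would typically also show the MapReduce ApplicationMaster and its task JVMs:
[hadoop@hadoop1 ~]$ jps
3012 RunJar
3158 MRAppMaster
3301 YarnChild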
YARN architecture
Master/slave architecture:
- HDFS: master NameNode, slaves DataNodes
- YARN: master ResourceManager, slaves NodeManagers
ResourceManager
RM: the global resource manager; the cluster has exactly one. It is responsible for unified scheduling and allocation of cluster resources, managing and scheduling the resources on all NodeManagers.
ApplicationMaster
AM: every submitted application contains one AM. Its main duties are to negotiate with the RM for resources (delivered as Containers), to communicate with NMs to start and stop tasks, and to monitor the state of all tasks, re-requesting resources to restart any task that fails.
NodeManager
NM: the per-node resource and task manager. It periodically reports the node's resource usage and the state of its Containers to the RM, and it accepts and handles Container start/stop requests from AMs.
Container
A Container is YARN's resource abstraction: it bundles multi-dimensional resources (CPU, memory, and so on) on a single node.
When an AM requests resources from the RM, what the RM hands back are Containers.
YARN assigns each task a Container, and the task may only use the resources described by that Container.
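As a hedged sketch (the values below are made up for a small box, not taken from this cluster), the pool a NodeManager offers up as Containers and the size of the largest single Container are both capped in yarn-site.xml:
<!-- Illustrative values only -->
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>  <!-- memory this NM offers to Containers -->
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>  <!-- largest single Container the scheduler will grant -->
</property>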
In the words of the official documentation:
The fundamental idea of YARN is to split the two functions of resource management and job scheduling/monitoring into separate daemons. The idea is to have a global ResourceManager (RM) and a per-application ApplicationMaster (AM). An application is either a single job or a DAG of jobs.
The ResourceManager and the NodeManager form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system. The NodeManager is the per-machine framework agent responsible for containers, monitoring their resource usage (CPU, memory, disk, network) and reporting it to the ResourceManager/Scheduler.
The per-application ApplicationMaster is, in effect, a framework-specific library tasked with negotiating resources from the ResourceManager and working with the NodeManagers to execute and monitor the tasks.
The ResourceManager has two main components: the Scheduler and the ApplicationsManager.
The Scheduler is responsible for allocating resources to the various running applications, subject to the familiar constraints of capacities, queues, and so on. It is a pure scheduler in the sense that it performs no monitoring or tracking of application status, and it offers no guarantees about restarting tasks that fail due to application errors or hardware faults. The Scheduler performs its scheduling based on the applications' resource requirements, using the abstract notion of a resource Container, which incorporates elements such as memory, CPU, disk, and network.
The Scheduler has a pluggable policy that is responsible for partitioning the cluster resources among the various queues, applications, and so on. The current schedulers, such as the CapacityScheduler and the FairScheduler, are examples of such plugins.
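Which plugin the RM loads is chosen in yarn-site.xml; a minimal sketch, assuming you wanted the FairScheduler:
<property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>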
The ApplicationsManager is responsible for accepting job submissions, negotiating the first container in which to run the application-specific ApplicationMaster, and providing the service for restarting the ApplicationMaster's container on failure. Each per-application ApplicationMaster is then responsible for negotiating appropriate resource containers from the Scheduler, tracking their status, and monitoring progress.