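1. Reading files
- ./bin/spark-submit examples/src/main/python/wordcount.py file:///home/hadoop/coder_oyang/tst # read a single local file
- ./bin/spark-submit examples/src/main/python/wordcount.py file:///home/hadoop/coder_oyang/ # read a local directory
- ./bin/spark-submit examples/src/main/python/wordcount.py file:///home/hadoop/coder_oyang/tst* # read multiple local files matching a wildcard
- To read a file on the cluster instead, drop the file:// prefix from the path; the path is then resolved against the cluster's default filesystem (typically HDFS).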
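The same path rules apply when calling textFile directly from PySpark. The following is a minimal sketch, not part of the wordcount.py example: input_uri and count_lines are hypothetical helper names, and it assumes the cluster's default filesystem is HDFS.

```python
def input_uri(path, local):
    """Return the path string to hand to Spark (hypothetical helper)."""
    prefix = "file://"
    if local:
        # Local files, directories, and wildcards need an explicit file:// scheme.
        return path if path.startswith(prefix) else prefix + path
    # Without a scheme, the path is resolved against the default filesystem,
    # which is assumed here to be HDFS on the cluster.
    return path[len(prefix):] if path.startswith(prefix) else path


def count_lines(path, local):
    """Count the lines Spark reads from the given path."""
    # Imported here so input_uri can be used without pyspark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="ReadExample")
    try:
        # textFile accepts a single file, a directory, or a wildcard such as tst*.
        return sc.textFile(input_uri(path, local)).count()
    finally:
        sc.stop()
```

For example, count_lines("/home/hadoop/coder_oyang/tst*", local=True) would read every local file matching the wildcard, while local=False would look up the same path on the cluster.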
2.
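3. "The system cannot find the batch label specified - make_command_arguments" | error when running Hadoop on Windows
- In hadoop.cmd, hdfs.cmd, mapred.cmd, and yarn.cmd under the bin directory of the Hadoop installation, delete the leading spaces from every line that contains call;
- In yarn.cmd, if the yarncommands definition is split across two lines with ^, as shown here, delete the ^ and join it into a single line:
set yarncommands=resourcemanager nodemanager proxyserver rmadmin version jar application ^
applicationattempt container node logs daemonlog historyserver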
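
Both manual edits can be scripted. The following is a rough sketch rather than a tested fix for every Hadoop version: fix_cmd_text and fix_hadoop_bin are hypothetical names, the script keeps a .bak copy of each file, and writing the result back with CRLF line endings is an assumption of this sketch, not a step from the notes above.

```python
import re
from pathlib import Path


def fix_cmd_text(text):
    """Apply both manual fixes to the contents of one .cmd file."""
    fixed = []
    continuing = False  # True while joining a yarncommands line split with ^
    for line in text.splitlines():
        if continuing:
            # Append the continuation line to the yarncommands line.
            fixed[-1] += " " + line.strip()
        elif re.match(r"[ \t]+call\b", line, re.IGNORECASE):
            # Delete the leading spaces in front of call.
            fixed.append(line.lstrip())
        else:
            fixed.append(line)
        last = fixed[-1]
        continuing = "yarncommands=" in last and last.rstrip().endswith("^")
        if continuing:
            # Drop the trailing ^ so the next line can be joined on.
            fixed[-1] = last.rstrip()[:-1].rstrip()
    return "\n".join(fixed) + "\n"


def fix_hadoop_bin(bin_dir):
    """Rewrite the four scripts in the Hadoop bin directory in place."""
    for name in ("hadoop.cmd", "hdfs.cmd", "mapred.cmd", "yarn.cmd"):
        path = Path(bin_dir) / name
        original = path.read_text()
        # Keep a backup next to the original before overwriting it.
        path.with_name(name + ".bak").write_text(original)
        # Assumption: write CRLF line endings, the usual form for batch files.
        with open(path, "w", newline="\r\n") as handle:
            handle.write(fix_cmd_text(original))
```

Calling fix_hadoop_bin with the path to the Hadoop bin directory applies the fixes to all four files. Only the line that defines yarncommands is joined; other uses of ^ are left alone.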