Spark

Installation

Upload the archive and extract it

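A minimal sketch of the extraction step. The archive name is an assumption inferred from the examples jar used later (spark-examples_2.11-2.0.0.jar, i.e. Spark 2.0.0); adjust it to the file you actually uploaded.

```shell
# Assumed archive name; replace with the file you uploaded.
cd /usr/local/src
tar -zxf spark-2.0.0-bin-hadoop2.6.tgz
# Rename so the paths used below (/usr/local/src/spark) match.
mv spark-2.0.0-bin-hadoop2.6 spark
```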

Change the ownership

[root@master-tz src]# chown -R hadoop:hadoop spark

Switch to the hadoop user and enter the configuration directory

[hadoop@master-tz conf]$ pwd
/usr/local/src/spark/conf

Edit the spark-env.sh file

[hadoop@master-tz conf]$ cp spark-env.sh.template spark-env.sh
[hadoop@master-tz conf]$ vim spark-env.sh
# add the following
export JAVA_HOME=/usr/local/src/java
export HADOOP_HOME=/usr/local/src/hadoop
export SPARK_MASTER_IP=master-tz
export SPARK_MASTER_PORT=7077
export SPARK_DIST_CLASSPATH=$(/usr/local/src/hadoop/bin/hadoop classpath)
export HADOOP_CONF_DIR=/usr/local/src/hadoop/etc/hadoop
export SPARK_YARN_USER_ENV="CLASSPATH=/usr/local/src/hadoop/etc/hadoop"
export YARN_CONF_DIR=/usr/local/src/hadoop/etc/hadoop

Edit the slaves file

[hadoop@master-tz conf]$ cp slaves.template slaves
[hadoop@master-tz conf]$ vim slaves
# add the following
master-tz
slave01-tz
slave02-tz


Distribute spark to slave01 and slave02

[root@master-tz conf]# scp -r /usr/local/src/spark/ slave01-tz:/usr/local/src/
[root@master-tz conf]# scp -r /usr/local/src/spark/ slave02-tz:/usr/local/src/
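Since the copy is done as root, the files on the workers will be owned by root. A sketch of fixing the ownership on each worker to match the master (assumes passwordless ssh as root and that the hadoop user/group exists on both nodes):

```shell
# Run from master-tz as root; repeat the chown done on the master.
ssh slave01-tz "chown -R hadoop:hadoop /usr/local/src/spark"
ssh slave02-tz "chown -R hadoop:hadoop /usr/local/src/spark"
```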


After the copy finishes, start the Spark cluster

[hadoop@master-tz conf]$ /usr/local/src/spark/sbin/start-all.sh
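To verify the daemons came up, a quick check with jps; the expected processes below assume the slaves file above, in which master-tz is also a worker:

```shell
# On master-tz: expect a Master and a Worker process.
# On slave01-tz / slave02-tz: expect a Worker process.
jps
```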


View the master web UI in a browser (ip:8080)


Usage

Local-mode test: run the SparkPi example program in local mode

/usr/local/src/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master local[*] /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 10
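The result line is easy to miss among Spark's INFO logging. One way to isolate it, assuming the default log4j setup sends the logs to stderr:

```shell
# Discard the log output on stderr and keep only the result line.
/usr/local/src/spark/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master local[*] \
  /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 10 \
  2>/dev/null | grep "Pi is roughly"
```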


Run the SparkPi example program in standalone mode

[hadoop@master-tz conf]$ /usr/local/src/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://master-tz:7077 /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 10


After it completes successfully, the job also shows up in the web UI


Running Spark on YARN

First, edit the configuration files

# spark-env.sh: make sure it contains the following (already added during installation above)
export SPARK_DIST_CLASSPATH=$(/usr/local/src/hadoop/bin/hadoop classpath)
export HADOOP_CONF_DIR=/usr/local/src/hadoop/etc/hadoop
export SPARK_YARN_USER_ENV="CLASSPATH=/usr/local/src/hadoop/etc/hadoop"
export YARN_CONF_DIR=/usr/local/src/hadoop/etc/hadoop

<!-- yarn-site.xml: add the following to disable the physical/virtual
     memory checks, so YARN does not kill the containers on small VMs -->
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>


Distribute the updated files to slave01 and slave02 again, then restart YARN so the new yarn-site.xml takes effect


Command for yarn-cluster mode

[hadoop@master-tz hadoop]$ /usr/local/src/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 2
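In cluster mode the driver runs inside a YARN container, so the "Pi is roughly ..." line does not appear in the local console; it ends up in the application logs. A sketch of retrieving it with the YARN log aggregation CLI, where the application id placeholder must be replaced with the id printed by spark-submit (or shown in the ResourceManager UI):

```shell
# <application_id> is a placeholder, e.g. application_1685170000000_0001.
yarn logs -applicationId <application_id> | grep "Pi is roughly"
```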


Command for yarn-client mode

/usr/local/src/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 2


Last modified: May 27, 2023