Spark
Installation
Upload the tarball and extract it.
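A sketch of this step, assuming the tarball was uploaded to /usr/local/src and is named spark-2.0.0-bin-hadoop2.6.tgz (the examples jar used below is from Spark 2.0.0; adjust the file name to your actual download):
[root@master-tz src]# tar -zxvf spark-2.0.0-bin-hadoop2.6.tgz -C /usr/local/src/   # tarball name is an assumption
[root@master-tz src]# mv /usr/local/src/spark-2.0.0-bin-hadoop2.6 /usr/local/src/spark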
Change the ownership:
[root@master-tz src]# chown -R hadoop:hadoop spark
Switch to the hadoop user and change into the configuration directory:
[hadoop@master-tz conf]$ pwd
/usr/local/src/spark/conf
Edit the spark-env.sh file:
[hadoop@master-tz conf]$ cp spark-env.sh.template spark-env.sh
[hadoop@master-tz conf]$ vim spark-env.sh
Add the following:
export JAVA_HOME=/usr/local/src/java
export HADOOP_HOME=/usr/local/src/hadoop
export SPARK_MASTER_IP=master-tz
export SPARK_MASTER_PORT=7077
export SPARK_DIST_CLASSPATH=$(/usr/local/src/hadoop/bin/hadoop classpath)
export HADOOP_CONF_DIR=/usr/local/src/hadoop/etc/hadoop
export SPARK_YARN_USER_ENV="CLASSPATH=/usr/local/src/hadoop/etc/hadoop"
export YARN_CONF_DIR=/usr/local/src/hadoop/etc/hadoop
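SPARK_DIST_CLASSPATH puts Hadoop's jars on Spark's classpath, and SPARK_MASTER_IP/SPARK_MASTER_PORT fix where the standalone master listens. To sanity-check the value the classpath variable captures, run the same command by hand:
[hadoop@master-tz conf]$ /usr/local/src/hadoop/bin/hadoop classpath   # should print a long list of Hadoop classpath entries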
Edit the slaves file:
[hadoop@master-tz conf]$ cp slaves.template slaves
[hadoop@master-tz conf]$ vim slaves
Add the worker hostnames:
master-tz
slave01-tz
slave02-tz
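Starting the cluster later relies on passwordless SSH from master-tz to every host listed in slaves (usually already configured for Hadoop). A quick check, for example:
[hadoop@master-tz conf]$ ssh slave01-tz hostname   # should print slave01-tz without asking for a password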
Distribute Spark to slave01-tz and slave02-tz (as root):
[root@master-tz conf]# scp -r /usr/local/src/spark/ slave01-tz:/usr/local/src/
[root@master-tz conf]# scp -r /usr/local/src/spark/ slave02-tz:/usr/local/src/
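Because the copy runs as root, the files land on the slaves owned by root. If Spark is started as the hadoop user, ownership likely needs to be restored on each slave first (a sketch, run on every worker node):
[root@slave01-tz ~]# chown -R hadoop:hadoop /usr/local/src/spark
[root@slave02-tz ~]# chown -R hadoop:hadoop /usr/local/src/spark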
Once the copy completes, start the Spark cluster as the hadoop user:
[hadoop@master-tz conf]$ /usr/local/src/spark/sbin/start-all.sh
Check the Spark master web UI in a browser (ip:8080).
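Besides the web UI, jps on each node confirms the daemons. With the slaves file above, master-tz runs both a Master and a Worker, while each slave runs a Worker:
[hadoop@master-tz conf]$ jps | grep -E 'Master|Worker'   # expect Master and Worker here; Worker only on the slaves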
Usage
Local-mode test: run the Spark Pi example in local mode.
/usr/local/src/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master local[*] /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 10
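The trailing 10 is the number of tasks used for the Pi estimate. The submit output is verbose; the example prints a single result line starting with "Pi is roughly", which can be isolated like this:
/usr/local/src/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master local[*] /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 10 2>/dev/null | grep "Pi is roughly"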
Run the Spark Pi example in standalone mode:
[hadoop@master-tz conf]$ /usr/local/src/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://master-tz:7077 /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 10
After a successful run, the application also appears in the web UI.
Running Spark on YARN
First, modify the configuration files.
In spark-env.sh, make sure the following lines are present (already added during installation above):
export SPARK_DIST_CLASSPATH=$(/usr/local/src/hadoop/bin/hadoop classpath)
export HADOOP_CONF_DIR=/usr/local/src/hadoop/etc/hadoop
export SPARK_YARN_USER_ENV="CLASSPATH=/usr/local/src/hadoop/etc/hadoop"
export YARN_CONF_DIR=/usr/local/src/hadoop/etc/hadoop
In yarn-site.xml, add the following properties; they disable YARN's physical- and virtual-memory checks, which can otherwise kill the example's containers:
<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
Redistribute the modified files to slave01-tz and slave02-tz.
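Changes to yarn-site.xml only take effect after YARN restarts. A sketch, assuming the standard Hadoop sbin scripts:
[hadoop@master-tz hadoop]$ /usr/local/src/hadoop/sbin/stop-yarn.sh
[hadoop@master-tz hadoop]$ /usr/local/src/hadoop/sbin/start-yarn.sh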
yarn-cluster mode:
[hadoop@master-tz hadoop]$ /usr/local/src/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 2
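In cluster mode the driver runs inside a YARN container, so the "Pi is roughly" line goes to the container logs rather than the local terminal. Assuming log aggregation is enabled, it can be retrieved with yarn logs, using the application ID that spark-submit reports (placeholder below):
[hadoop@master-tz hadoop]$ /usr/local/src/hadoop/bin/yarn logs -applicationId <application_id> | grep "Pi is roughly"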
yarn-client mode:
/usr/local/src/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 2
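In client mode the driver runs on the submitting machine, so the result prints directly to the console, which makes this the more convenient mode for quick testing.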