
Installing Spark

Follow these steps to install Spark 2.3.1, compiled with Hadoop 2.7:

  1. If you have a Spark 2.3.1 tar distribution (for example, spark-2.3.1-bin-hadoop2.7.tgz), copy it to any location on your Linux VM (for example, /opt) using a Windows-to-Linux file transfer tool such as FileZilla or WinSCP. Alternatively, you can download the binary .tar.gz file from the Apache Spark downloads page: http://spark.apache.org/downloads.html.
On most Linux-based operating systems, /opt is an empty folder under the root directory. Here, we use it to copy and install software. By default, this folder is owned by root, so run the following command if you run into permission issues while accessing it:
  sudo chmod -R 777 /opt
  2. Go to the location where you copied the Spark software package and uncompress it:
cd /opt
tar -xzvf spark-2.3.1-bin-hadoop2.7.tgz
  3. Open .bash_profile to set the environment variables, as follows:
nano ~/.bash_profile
  4. Add the following lines to the end of the file:
export SPARK_HOME=/opt/spark-2.3.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/sbin
export PATH=$PATH:$SPARK_HOME/bin
  5. Run the following command to update the environment variables in the current session:
source ~/.bash_profile
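Once the profile has been sourced, it can be useful to confirm that the variables took effect before moving on. The following is a minimal sketch of such a check, assuming a Bash shell and the directory layout of the unpacked distribution (a bin folder containing spark-submit). The function name check_spark_env is a hypothetical helper introduced here for illustration; it is not part of Spark.

```shell
#!/usr/bin/env bash
# Sanity check for the environment configured in the steps above.
# NOTE: check_spark_env is a hypothetical helper name, not a Spark command.
check_spark_env() {
  # SPARK_HOME must be set and must point at an existing directory.
  if [ -z "${SPARK_HOME:-}" ] || [ ! -d "$SPARK_HOME" ]; then
    echo "SPARK_HOME is not set or is not a directory" >&2
    return 1
  fi

  # The launcher scripts are expected under bin/ in the unpacked distribution.
  if [ ! -x "$SPARK_HOME/bin/spark-submit" ]; then
    echo "spark-submit not found under $SPARK_HOME/bin" >&2
    return 1
  fi

  # bin/ must be on PATH so the commands resolve without a full path.
  case ":$PATH:" in
    *":$SPARK_HOME/bin:"*) ;;
    *)
      echo "$SPARK_HOME/bin is not on PATH" >&2
      return 1
      ;;
  esac

  echo "Spark environment OK: $SPARK_HOME"
}
```

If the function is pasted into the current session, running check_spark_env && spark-submit --version should report the environment as OK and then print the version banner of the installed Spark build. A failure message on the first command usually points to a typo in the lines added to .bash_profile.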