- Apache Spark Quick Start Guide
- Shrey Mehrotra Akash Grade
- 196字
- 2021-07-02 13:39:59
Installing Spark
Follow these steps to install Spark 2.3.1, compiled with Hadoop 2.7:
- If you have a Spark 2.0 tar distribution (for example, spark-2.3.1-bin-hadoop2.7.tgz), then copy it into your Linux VM at any location (for example, /opt) using any Windows on Linux file transfer software (FileZilla or WinSCP). Alternatively, you can download the latest binary .tar.gz file from the following Apache Spark link: http://spark.apache.org/downloads.html.
The /opt file is an empty folder within root in most Linux-based operating folders. Here, we would use this folder to copy and install software. By default, this folder is owned by Root. So, run the following command if you are getting permission issues while accessing this folder.
sudo chmod -R 777 /opt.
sudo chmod -R 777 /opt.
- Go to the location where you have copied the Spark software package and uncompress it:
cd /opt
tar -xzvf spark-2.3.1-bin-hadoop2.7.tgz
- Set the environment variable in .bash_profile, as follows:
nano ~/.bash_profile
- Add the following lines to the end of the file:
export SPARK_HOME=/opt/spark-2.3.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/sbin
export PATH=$PATH:$SPARK_HOME/bin
- Run the following command to update the environment variables in the current session:
source ~/.bash_profile
推薦閱讀
- Ansible Configuration Management
- 大數(shù)據(jù)戰(zhàn)爭:人工智能時代不能不說的事
- 協(xié)作機器人技術(shù)及應(yīng)用
- Java實用組件集
- Apache Superset Quick Start Guide
- 數(shù)據(jù)掘金
- Windows Server 2003系統(tǒng)安全管理
- SAP Business Intelligence Quick Start Guide
- Working with Linux:Quick Hacks for the Command Line
- Windows安全指南
- Artificial Intelligence By Example
- C#求職寶典
- 基于人工免疫原理的檢測系統(tǒng)模型及其應(yīng)用
- Machine Learning with Spark(Second Edition)
- Cloudera Hadoop大數(shù)據(jù)平臺實戰(zhàn)指南