Monday, March 13, 2017

Spark Installation on Windows

  • Choose a Spark pre-built package for Hadoop, e.g. "Pre-built for Hadoop 2.6 or later". Download and extract it to any drive, e.g. D:\spark-2.1.0-bin-hadoop2.6.
  • Set SPARK_HOME to the extracted directory and add %SPARK_HOME%\bin to PATH in the environment variables (see the commands below).
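For the current Command Prompt session only, and assuming the extraction path above, this amounts to the following (use the Environment Variables dialog to make it permanent):
rem Applies to this Command Prompt session only
set SPARK_HOME=D:\spark-2.1.0-bin-hadoop2.6
set PATH=%PATH%;%SPARK_HOME%\bin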
  • Run the following command on the command line:
spark-shell
  • You’ll get an error for winutils.exe (the message complains that it could not locate winutils.exe in the Hadoop binaries).
      Though we aren’t using Hadoop with Spark, it still checks for the HADOOP_HOME variable in its configuration. To overcome this error, download winutils.exe and place it in a bin folder at any location (e.g. D:\winutils\bin\winutils.exe).
P.S. The right winutils.exe varies with the operating system version, so if the one you download doesn't work on your OS, find another one that does. You can refer to the Problems running Hadoop on Windows link for winutils.exe.
  • Set HADOOP_HOME = D:\winutils in the environment variables, as shown below.
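Again a session-only sketch, assuming the D:\winutils\bin\winutils.exe layout above:
rem HADOOP_HOME points to the parent of the bin folder, not to bin itself
set HADOOP_HOME=D:\winutils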
  • Now, re-run the command "spark-shell" and you’ll see the Scala shell. For recent Spark releases, if you get the permission error for the /tmp/hive directory given below:
The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-
you need to run the following command:
D:\spark>D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive
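To confirm the change took effect, a quick check might look like this (winutils also provides an ls subcommand; the paths assume the locations used above):
rem Inspect the scratch directory's permissions, then relaunch the shell
D:\winutils\bin\winutils.exe ls D:\tmp\hive
spark-shell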
