This will help you install all the workshop dependencies on your local workstation.
For MacOSX, many of these should already exist. You'll probably only have to install Scala, Gradle and Luigi manually. Consider cnce you have everything installed, run the verify commands from the Linux section below to double-check.
Use the VM_Setup
-
Add Java, Scala and Gradle:
Java:
sudo apt install openjdk-8-jdk
Scala:
wget www.scala-lang.org/files/archive/scala-2.11.8.deb
sudo dpkg -i scala-2.11.8.deb
Gradle:
sudo apt install gradle
Note: In some environments Gradle 3.4 doesn't work well with the scala plugin. It's recommended you use the gradle wrapper to use a more updated version of gradle. Run
gradle wrapper --gradle-version=4.9
and then use the gradlew./gradlew
. See this stack overflow post for more info. -
If not already installed, get Python 2.7 and latest pip.
sudo apt update
sudo apt upgrade
sudo apt install python2.7 python-pip
To verify steps 1 and 2:
java -version
Expected output:openjdk version "1.8.0_191"
or similar 1.8 version.
scala -version
Expected outputScala code runner version 2.11.8 -- Copyright 2002-2016, LAMP/EPFL
or similar 2.11 version.
gradle -version
Expected output:Gradle 3.4.1
or similar 3.4 version. (Although, see note about using the gradle wrapper instead of relying on Gradle 3.4)
python —version
Expected output:Python 2.7.15rc1
or similar version. (2.7.12, 2.17.13, etc.)
pip —version
Expected output:pip 9.0.1 from /usr/lib/python2.7/dist-packages (python 2.7)
or similar version. -
Install Luigi:
sudo pip install luigi
if using system-install of python. If using a local install like with pyenv don’t use sudo.
Runluigi
Expected output:No task specified
-
Get and unpack Spark:
wget https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz
sudo tar -xzvf spark-2.2.1-bin-hadoop2.7.tgz --directory /opt/
sudo mv /opt/spark-2.2.1-bin-hadoop2.7 /opt/spark-2.2.1
-
Setup bash profile for Spark:
vi ~/.bash_profile
Enter the below
export SPARK_HOME=/opt/spark-2.2.1
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$SPARK_HOME/bin:$JAVA_HOME/bin:$PATH
source .bash_profile
Run the command: spark-submit
Expected output Usage: spark-submit [options] ...
plus many more lines.