# Local Setup

This will help you install all the workshop dependencies on your local workstation.

## For Mac OS X

On Mac OS X, many of these should already exist; you will probably only need to install Scala, Gradle, and Luigi manually. Once you have everything installed, run the verify commands from the Linux section below to double-check.

## For Windows

Use the VM_Setup instructions.

## For Linux

1. Add Java, Scala and Gradle:

    Java:

        sudo apt install openjdk-8-jdk

    Scala:

        wget www.scala-lang.org/files/archive/scala-2.11.8.deb
        sudo dpkg -i scala-2.11.8.deb

    Gradle:

        sudo apt install gradle

    Note: In some environments Gradle 3.4 doesn't work well with the Scala plugin. It's recommended to use the Gradle wrapper to get a more recent Gradle: run `gradle wrapper --gradle-version=4.9`, then invoke `./gradlew` instead of `gradle`. See this Stack Overflow post for more info.

2. If not already installed, get Python 2.7 and the latest pip:

        sudo apt update
        sudo apt upgrade
        sudo apt install python2.7 python-pip

    To verify steps 1 and 2:

        java -version

    Expected output: `openjdk version "1.8.0_191"` or a similar 1.8 version.

        scala -version

    Expected output: `Scala code runner version 2.11.8 -- Copyright 2002-2016, LAMP/EPFL` or a similar 2.11 version.

        gradle -version

    Expected output: `Gradle 3.4.1` or a similar 3.4 version. (Although, see the note above about using the Gradle wrapper instead of relying on Gradle 3.4.)

        python --version

    Expected output: `Python 2.7.15rc1` or a similar version (2.7.12, 2.7.13, etc.).

        pip --version

    Expected output: `pip 9.0.1 from /usr/lib/python2.7/dist-packages (python 2.7)` or a similar version.

3. Install Luigi:

        sudo pip install luigi

    Use `sudo` only with a system install of Python. If using a local install, such as with pyenv, don't use `sudo`.

    Run `luigi`. Expected output: `No task specified`.

4. Get and unpack Spark:

        wget https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz
        sudo tar -xzvf spark-2.2.1-bin-hadoop2.7.tgz --directory /opt/
        sudo mv /opt/spark-2.2.1-bin-hadoop2.7 /opt/spark-2.2.1

5. Set up the bash profile for Spark:

        vi ~/.bash_profile

    Enter the below:

        export SPARK_HOME=/opt/spark-2.2.1
        export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
        export PATH=$SPARK_HOME/bin:$JAVA_HOME/bin:$PATH

    Then reload it:

        source ~/.bash_profile

    Run the command: `spark-submit`

    Expected output: `Usage: spark-submit [options]` plus many more lines.