Spark SQL tutorial from HadoopCon 2015. This file is an IPython/Jupyter notebook written in Python.
This training material requires Spark 1.4.1.
In this tutorial, you will learn how to initialize Spark SQL with SQLContext (or HiveContext), manipulate DataFrames, import data, define user-defined functions, and use cache().
Reference link: Spark SQL
Spark 1.4.1:
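# Option 1: a basic SQLContext (sc is the existing SparkContext)
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)

# Option 2: a HiveContext, a superset of SQLContext that adds HiveQL support
from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)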
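Once a context exists, the topics listed in this tutorial (DataFrames, user-defined functions, cache()) can be combined in a few lines. The following is only a minimal sketch, assuming `sqlContext` is the SQLContext or HiveContext created above. The function name `demo_people_query`, the sample rows, the table name `people`, and the UDF name `shout` are all made up for illustration and do not come from the tutorial.

```python
def demo_people_query(sqlContext):
    """Build a tiny DataFrame, query it with SQL and a UDF, and cache it.

    `sqlContext` is assumed to be an already-initialized SQLContext or
    HiveContext. All names and data below are hypothetical.
    """
    # Create a DataFrame from a list of tuples plus a list of column names.
    people = sqlContext.createDataFrame(
        [("Alice", 34), ("Bob", 19), ("Carol", 52)],
        ["name", "age"],
    )

    # cache() marks the DataFrame to be kept in memory after it is first computed.
    people.cache()

    # Register the DataFrame as a temporary table so SQL can refer to it by name.
    people.registerTempTable("people")

    # Register a Python function as a SQL UDF (return type defaults to string).
    sqlContext.registerFunction("shout", lambda s: s.upper())

    # Run a SQL query that uses the UDF; the result is another DataFrame.
    return sqlContext.sql(
        "SELECT shout(name) AS name, age FROM people WHERE age >= 21"
    )
```

In a notebook where `sqlContext` is already defined, calling `demo_people_query(sqlContext).show()` should print the rows that pass the age filter, with the names upper-cased by the UDF.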
For more details, see the Spark SQL and DataFrame Guide and Apache Spark.