This repo is a proof of concept and template for extending Spark's ML library in Scala. It uses text data from Twitter to explore multi-stage pipelines.
Special thanks to Holden, whose talk at Spark Summit SF 2017 and book High Performance Spark gave me a great jumping-off point.