SparkDeepDoc

In-depth resources drawn from Apache Spark, Cloudera, my own practice, and other sources. Most important of all are my own thoughts.

Streaming
1: https://databricks.com/blog/2015/01/28/introducing-streaming-k-means-in-spark-1-2.html
2: Introduction to Flink Core Technologies (Flink 核心技术介绍), by a Flink committer

Machine learning
1: K-D tree
2: An overview of SGD
3: The bias-variance trade-off