Home of Data Algorithms Book

Mahmoud Parsian edited this page May 23, 2015 · 20 revisions

Data Algorithms

Welcome to the home of the Data Algorithms book wiki!

Instructions (as Bash scripts) are provided for building and running the sample code in the Data Algorithms book.

Webcast

All-vs-All: Correlation Using Spark/Hadoop (Thursday, July 23, 2015)

Author Book Signing

An author book signing for "Data Algorithms" will be held in the O'Reilly booth on Thursday, Feb. 19, 2015. Complimentary copies of the book will be provided to the first 25 attendees.

Introduction

All solutions (using Spark and Hadoop) in the Data Algorithms book are provided in Java. But I have good news for Python programmers: Kashif Rasul has graciously volunteered to provide solutions in Python.

Immediate Goal

My immediate goal is to provide at least two solutions for each chapter:

  • MapReduce/Hadoop Solution
  • Spark Solution (in Java, Scala, Python, ...)

Another goal is to provide compact Spark solutions in Java 8 (using lambda expressions).

Reorganization

My next immediate goal is to separate the MapReduce/Hadoop solutions from the Spark solutions. I will change the package structure as follows (if you have suggestions or comments, please let me know):

org.dataalgorithms.<chapter-number>.spark
org.dataalgorithms.<chapter-number>.mapreduce

Posting New Content/Solution/Contribution

If you want to post a solution for any chapter, please let me know; you will be given full credit.

Thanks and best regards,
Mahmoud Parsian
[email protected]