Yahoo Music recommendation with linear regression and extensive feature engineering.
The dataset contains millions of records, so we use PySpark to clean the data and perform the matrix factorization.
The final model reaches 0.88 accuracy on the OJ system.
Most of the work went into data cleaning and parameter tuning.
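
For readers unfamiliar with matrix factorization, here is a minimal sketch of the idea in plain Python. It is not the project's PySpark pipeline: the toy ratings, the function names (`factorize`, `predict`, `rmse`), and the hyperparameters (`rank`, `lr`, `reg`, `epochs`) are all assumptions chosen for illustration. It learns one small vector per user and per item with stochastic gradient descent so that their dot product approximates the observed rating.

```python
import random

# Toy (user, item, rating) triples. Hypothetical data, NOT from the Yahoo Music set.
RATINGS = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 4.0), (1, 1, 5.0), (1, 3, 1.0),
    (2, 0, 1.0), (2, 2, 5.0), (2, 3, 4.0),
    (3, 1, 1.0), (3, 2, 4.0), (3, 3, 5.0),
]


def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's latent vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared error of the predictions over the given triples."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5


def factorize(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.02,
              epochs=2000, seed=0):
    """Fit user factors P and item factors Q with L2-regularized SGD.

    All default hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    P = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the ratings in a new order each epoch
        for u, i, r in data:
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step on squared error plus L2 penalty
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q


P, Q = factorize(RATINGS, n_users=4, n_items=4)
print(f"training RMSE: {rmse(RATINGS, P, Q):.3f}")
print(f"predicted rating for user 0, item 3: {predict(P, Q, 0, 3):.2f}")
```

At the scale described here a single-machine loop like this would be too slow. PySpark provides a distributed ALS estimator in `pyspark.ml.recommendation`, which is one way to run the factorization on a cluster, though the exact approach used in this project is not stated here.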