Skip to content

A fun problem presented by the folks at Netflix in 2006.

License

Notifications You must be signed in to change notification settings

simonibsen/netflix_prize

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

netflix_prize

This was a fun problem presented by Netflix in 2006 (with a million dollar carrot):

https://en.wikipedia.org/wiki/Netflix_Prize

http://www.netflixprize.com/

It became a distributed computing problem in addition to a data analysis problem. There are many different approaches one could take to try to solve it, here are just a couple.

  • mr-db2.pl - loading the movie rating data into a db
  • rmse.pl - calculating the RMSE of the results
  • wrap.csh - shell wrapper splitting the load up among systems
  • qfx.pl - approach #1 - This approach aims to match up a user's viewing tastes with someone else who has seen the movie in question and basing prediction on this.
  • wstd.pl - approach #2 - This approach allows for splitting the job up by taking arguments to selectively work only on some parts of the data. It then writes out results to disk. It also tries to increase performance by imposing limits on queries. It calculates std deveations to of user ratings to try to determine their reliability as predictors.

About

A fun problem presented by the folks at Netflix in 2006.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages