netflix_prize

This was a fun problem posed by Netflix in 2006 (with a million-dollar carrot):

https://en.wikipedia.org/wiki/Netflix_Prize

http://www.netflixprize.com/

It became a distributed computing problem in addition to a data analysis problem. There are many different approaches one could take to try to solve it; here are just a couple.

mr-db2.pl - loading the movie rating data into a db
rmse.pl - calculating the RMSE of the results
wrap.csh - shell wrapper splitting the load up among systems
qfx.pl - approach #1 - This approach aims to match a user's viewing tastes with those of someone else who has seen the movie in question, and bases its prediction on that match.
wstd.pl - approach #2 - This approach allows the job to be split up by taking arguments that restrict the work to selected parts of the data. It then writes its results out to disk. It also tries to improve performance by imposing limits on queries. It calculates the standard deviations of user ratings to try to determine their reliability as predictors.
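
To make the scoring and the two approaches above more concrete, here is a minimal Python sketch. It is not a translation of the Perl scripts in this repo. The function names (`rmse`, `predict`), the in-memory `ratings` dictionary, the similarity formula, and the choice to down-weight users whose ratings have a high standard deviation are all assumptions made for illustration. The real scripts work against a database and may weight things quite differently.

```python
import math
from statistics import pstdev


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def predict(ratings, user, movie, default=3.0):
    """Predict `user`'s rating of `movie` from other users who rated it.

    `ratings` is a hypothetical layout: {user: {movie: rating}}.
    """
    mine = ratings.get(user, {})
    weighted_sum = 0.0
    weight_total = 0.0
    for other, theirs in ratings.items():
        if other == user or movie not in theirs:
            continue
        # Movies both users have rated, excluding the one being predicted.
        shared = [m for m in mine if m in theirs and m != movie]
        if not shared:
            continue
        # Assumed taste similarity: shrinks as the mean absolute
        # difference on shared movies grows.
        mad = sum(abs(mine[m] - theirs[m]) for m in shared) / len(shared)
        similarity = 1.0 / (1.0 + mad)
        # Assumed reliability: users with widely varying ratings count less.
        reliability = 1.0 / (1.0 + pstdev(theirs.values()))
        weight = similarity * reliability
        weighted_sum += weight * theirs[movie]
        weight_total += weight
    # Fall back to a fixed default when no comparable user exists.
    return weighted_sum / weight_total if weight_total else default
```

A typical use would be to call `predict` for each (user, movie) pair in a held-out set, collect the results in a list, and pass that list together with the true ratings to `rmse`. A lower value means better predictions.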

simonibsen/netflix_prize
