Recommendation System Tutorial #11

Open
wants to merge 4 commits into
base: master
from

Conversation

shariqak14

No description provided.

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@@ -0,0 +1,843 @@
{
Owner

I love this application. I think it's shaping up to be an awesome, high-impact tutorial.

Check out this similar [RealPython](https://realpython.com/build-recommendation-engine-collaborative-filtering/#when-can-collaborative-filtering-be-used) tutorial on collaborative filtering. One of the missing pieces at the end is a demonstration of how effective the approach is. It might even help to run through a step-by-step demo on a 3-by-3 ratings array so the reader sees how the reshapes use the original data to predict new data.

Technical documents mostly use the second person, e.g. "In this tutorial, you will use NumPy..."

In the "You'll learn" section, you should be able to remove the "you'll" from each bullet.


Reply via ReviewNB

@@ -0,0 +1,843 @@
{
Owner

link for *.npy -> https://numpy.org/doc/stable/reference/generated/numpy.save.html

The dataset used in this tutorial is a subset of the MovieLens dataset. It consists of 90,570 ratings (1-5) from 943 users on 1680 movies.

This is a good place to describe the data layout: each row is a movie, each column is a user, a rating of 0 means "no review", and 1-5 is an actual rating.

maybe a minimal example:

|movie|Joe|Jane|Bob|
|---|---|---|---|
|Jaws|0|4|1|
|Godfather|5|0|4|
|Star Wars|0|5|2|
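
For instance, the table could be carried into NumPy as follows. This is only a sketch: the movie and user names come from the toy table, and the variable names `movies`, `users`, `ratings`, and `rated` are illustrative, not taken from the notebook.

```python
import numpy as np

# Rows are movies, columns are users; 0 marks "no rating yet".
movies = ["Jaws", "Godfather", "Star Wars"]
users = ["Joe", "Jane", "Bob"]
ratings = np.array([
    [0, 4, 1],   # Jaws
    [5, 0, 4],   # Godfather
    [0, 5, 2],   # Star Wars
])

# Boolean mask of the entries that were actually rated.
rated = ratings != 0

print(ratings.shape)   # (3, 3)
print(rated.sum())     # 6 of the 9 entries are rated
```

A small array like this is easy to follow by hand, so each later step (cost, gradient, prediction) can be checked against it.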

This is precisely why we will create a model-based Collaborative Filtering system.

This is the first mention of model-based collaborative filtering. It helps to keep the vocabulary consistent throughout, i.e. "model-based recommendation system". Is the "Collaborative filtering" part of the original

"First, let us begin" -> "Begin by ..."


Reply via ReviewNB

@@ -0,0 +1,843 @@
{
Owner

I hadn't used `nonzero` before. Another way to count the rated entries is

>>> np.sum(ratings != 0)

90570
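
A quick sketch of three ways to count the rated (non-zero) entries, shown on a hypothetical 3-by-3 array rather than the real dataset. The variable names are illustrative.

```python
import numpy as np

# Hypothetical toy ratings; 0 means "not rated".
ratings = np.array([[0, 4, 1],
                    [5, 0, 4],
                    [0, 5, 2]])

# 1) Length of the index arrays returned by nonzero()
n_nonzero = len(ratings.nonzero()[0])

# 2) Sum of a boolean mask (True counts as 1)
n_sum = int(np.sum(ratings != 0))

# 3) Dedicated helper
n_count = int(np.count_nonzero(ratings))

print(n_nonzero, n_sum, n_count)  # 6 6 6
```

Note that `ratings == 0` would count the missing entries instead, which is the complement of this number.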


Reply via ReviewNB

@@ -0,0 +1,843 @@
{
Owner

Do you need floats here? `True` and `False` already behave as the integers 1 and 0.
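
As an illustration (a sketch on a made-up array, not the notebook's code), a boolean mask can be used directly in arithmetic without casting it to float:

```python
import numpy as np

# Hypothetical toy ratings; 0 means "not rated".
ratings = np.array([[0, 4, 1],
                    [5, 0, 4],
                    [0, 5, 2]])

mask = ratings != 0                  # dtype is bool

# Booleans act as 0 and 1 in arithmetic, so no float cast is required.
masked = mask * ratings              # keeps rated entries, zeros elsewhere
per_movie_counts = mask.sum(axis=1)  # number of ratings per movie (row)

print(mask.dtype)                    # bool
print(per_movie_counts)              # [2 2 2]
```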


Reply via ReviewNB

@@ -0,0 +1,843 @@
{
Owner

It might help to describe "sparsity" here.

There are now a few steps describing the data exploration (very important for reproducibility, but they don't directly influence the model). It may help to add some headings:

  • Data exploration
  • Build model
  • Train model
  • Test model
  • Wrap up
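
On the sparsity point, one way it could be illustrated is below. This sketch assumes "sparsity" means the fraction of missing (zero) entries, which is one common definition; the toy array and variable names are illustrative. The last lines plug in the dataset sizes quoted earlier in this thread (90,570 ratings, 943 users, 1680 movies).

```python
import numpy as np

# Hypothetical toy ratings; 0 means "not rated".
ratings = np.array([[0, 4, 1],
                    [5, 0, 4],
                    [0, 5, 2]])

n_rated = np.count_nonzero(ratings)
sparsity = 1 - n_rated / ratings.size   # fraction of entries with no rating
print(f"{sparsity:.3f}")                # 0.333

# Same formula with the dataset sizes quoted in the tutorial text.
full_sparsity = 1 - 90_570 / (1680 * 943)
print(f"{full_sparsity:.3f}")           # 0.943
```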

Reply via ReviewNB

@@ -0,0 +1,843 @@
{
Owner

"best model" -> "best fit model"?

Here the loss function is a sum of squared errors (SSE). Maybe you could rename the second $J$ to $J^*$ to indicate it's the SSE plus a regularization term.

The functions have a number of `reshape` calls.

I think

J = 0.5 * np.sum((theta.T@X - ratings)**2)

where theta are the fitting parameters, X 
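
To make the $J$ versus $J^*$ distinction concrete, here is a hedged sketch. It assumes `X` has shape (num_movies, num_features), `theta` has shape (num_users, num_features), and `ratings` has shape (num_movies, num_users), so the prediction is written `X @ theta.T`; with other shape conventions the product order changes. Masking out unrated entries and the L2 penalty weight `lam` are also assumptions, not something confirmed from the notebook.

```python
import numpy as np

def cost(X, theta, ratings, lam=0.0):
    """Masked sum-of-squares cost with an optional L2 penalty (a sketch)."""
    rated = ratings != 0                        # only score observed ratings
    errors = (X @ theta.T - ratings) * rated    # zero out unrated entries
    sse = 0.5 * np.sum(errors ** 2)
    penalty = 0.5 * lam * (np.sum(X ** 2) + np.sum(theta ** 2))
    return sse + penalty

# Hypothetical toy data: 3 movies, 3 users, 2 latent features.
rng = np.random.default_rng(0)
ratings = np.array([[0, 4, 1],
                    [5, 0, 4],
                    [0, 5, 2]], dtype=float)
X = rng.normal(size=(3, 2))
theta = rng.normal(size=(3, 2))

J = cost(X, theta, ratings)               # plain SSE
J_star = cost(X, theta, ratings, lam=1.0) # SSE + regularization
```

With `lam=0` the function reduces to the plain SSE, so one definition can serve both equations in the text.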


Reply via ReviewNB

@@ -0,0 +1,843 @@
{
Owner

I just have some clarifying questions (_answering them might help the tutorial's clarity too_):

X = params[: num_movies * num_features].reshape

Why does X need a reshape? Can you define it as num_movies x num_features?

theta = params[num_movies * num_features:].reshape

Similarly, isn't this taking the original 1680x943 and shaping it into 1680x943 again?
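
One plausible reason for the reshapes, sketched below under assumptions: if the optimizer expects a single flat parameter vector, both matrices have to be packed into `params` and unpacked inside the cost function. The shapes used here, `X` as (num_movies, num_features) and `theta` as (num_users, num_features), are assumed, since the `reshape` arguments are not shown above.

```python
import numpy as np

# Hypothetical small sizes for illustration.
num_movies, num_users, num_features = 3, 3, 2

rng = np.random.default_rng(0)
X = rng.normal(size=(num_movies, num_features))
theta = rng.normal(size=(num_users, num_features))

# Pack both matrices into one flat vector, as a generic optimizer would expect.
params = np.concatenate([X.ravel(), theta.ravel()])

# Unpack: the first block is X, the remainder is theta.
split = num_movies * num_features
X_back = params[:split].reshape(num_movies, num_features)
theta_back = params[split:].reshape(num_users, num_features)

print(params.shape)  # (12,)
```

Under these assumptions neither slice is 1680x943; the flat vector holds (1680 + 943) * num_features values in total.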


Reply via ReviewNB

@@ -0,0 +1,843 @@
{
Owner

Here, you switched from "movies" to "items". Either term works, but it's better to stick with one or the other.


Reply via ReviewNB

@@ -0,0 +1,843 @@
{
Owner

I like this implementation of the modified Newton-Raphson method. Is your gradient 1680x943?
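
On the gradient shape, a sketch that may help frame the question. It assumes the masked SSE cost `0.5 * sum(((X @ theta.T - ratings) * rated) ** 2)` with `X` as (num_movies, num_features) and `theta` as (num_users, num_features); the function name `gradients` and the penalty weight `lam` are hypothetical, not from the notebook.

```python
import numpy as np

def gradients(X, theta, ratings, lam=0.0):
    """Gradients of the masked SSE cost w.r.t. X and theta (a sketch)."""
    rated = ratings != 0
    errors = (X @ theta.T - ratings) * rated   # (num_movies, num_users)
    X_grad = errors @ theta + lam * X          # same shape as X
    theta_grad = errors.T @ X + lam * theta    # same shape as theta
    return X_grad, theta_grad

# Hypothetical toy data: 3 movies, 3 users, 2 latent features.
rng = np.random.default_rng(0)
ratings = np.array([[0, 4, 1],
                    [5, 0, 4],
                    [0, 5, 2]], dtype=float)
X = rng.normal(size=(3, 2))
theta = rng.normal(size=(3, 2))

X_grad, theta_grad = gradients(X, theta, ratings)
flat_grad = np.concatenate([X_grad.ravel(), theta_grad.ravel()])
print(X_grad.shape, theta_grad.shape, flat_grad.shape)  # (3, 2) (3, 2) (12,)
```

Under these assumptions only the intermediate `errors` array has the ratings-matrix shape; each gradient matches the shape of the parameter it belongs to.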


Reply via ReviewNB

@@ -0,0 +1,843 @@
{
Owner

How well did your prediction work?

It might help to have a testing-training discussion.
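
One possible shape for that discussion, as a sketch only: hold out a random subset of the observed ratings, fit on the rest, and report the root-mean-square error (RMSE) on the held-out entries. The helper names `train_test_split_ratings` and `rmse` are hypothetical, and RMSE is just one reasonable metric.

```python
import numpy as np

def train_test_split_ratings(ratings, test_fraction=0.2, seed=0):
    """Move a random fraction of the observed ratings into a test matrix."""
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(ratings)                 # observed entries only
    n_test = int(round(test_fraction * rows.size))
    held_out = rng.choice(rows.size, size=n_test, replace=False)
    r, c = rows[held_out], cols[held_out]
    train = ratings.copy()
    test = np.zeros_like(ratings)
    test[r, c] = ratings[r, c]
    train[r, c] = 0                                  # hide them from training
    return train, test

def rmse(predictions, test):
    """RMSE over the held-out entries (assumes at least one is present)."""
    rated = test != 0
    return np.sqrt(np.mean((predictions[rated] - test[rated]) ** 2))

# Hypothetical toy ratings; 0 means "not rated".
ratings = np.array([[0, 4, 1],
                    [5, 0, 4],
                    [0, 5, 2]], dtype=float)
train, test = train_test_split_ratings(ratings, test_fraction=1 / 3)
```

The model would then be trained on `train` only, and `rmse(predictions, test)` would give a number that answers "how well did it work?" on ratings the model never saw.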


Reply via ReviewNB


2 participants