Error while fit with 200k user_interaction matrix, item features and user features #37

Open
seuriously opened this issue Nov 18, 2021 · 3 comments

@seuriously

I'm running the library on a virtual server with 64 GB of RAM.
My data consists of:
200k distinct interactions between users and items
52k x 11 user_feature matrix
2770 x 49 item_feature matrix
All NA values are replaced with 0.

When I try to run it, I get this error:
AssertionError: user factors [v_u] are not finite - try decreasing feature/sample_weight magnitudes
Sometimes I get the equivalent error for the item factors as well.

However, if I run it on 170k user interactions without user_features and item_features, it runs smoothly.

What is the meaning of the error?
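
The message itself points at feature or sample_weight magnitudes, so one way to investigate is to check each feature column for non-finite values and for large absolute values before calling fit. The following is only a minimal sketch: `summarize_features` is a hypothetical helper (not part of the library), the sample data is made up, and it assumes the first column of the features dataframe holds the user or item ID.

```python
import numpy as np
import pandas as pd


def summarize_features(features: pd.DataFrame) -> pd.DataFrame:
    """Report finiteness and largest magnitude per feature column.

    Assumption: the first column is the ID column and is skipped.
    """
    values = features.iloc[:, 1:].astype(float)
    return pd.DataFrame({
        "all_finite": np.isfinite(values).all(axis=0),
        "max_abs": values.abs().max(axis=0),
    })


# Made-up user features for illustration only.
user_features = pd.DataFrame({
    "user_id": [1, 2, 3],
    "age": [25, 40, 31],
    "income": [52000.0, np.inf, 61000.0],
})

summary = summarize_features(user_features)
print(summary)
```

Columns that are not all finite, or whose largest magnitude is far above 1, would be the first candidates for cleaning or rescaling.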

@ZoeLeung2021

Hi, I have the same issue. Have you solved it?

I have 173k distinct interactions between users and items
4k x 20 item features dataframe
370k x 30 user features dataframe

When I try to run it, I get this error:
AssertionError: item weights [w_i] are not finite - try decreasing feature/sample_weight magnitudes

So for now I can only run the model without the item and user auxiliary features.

@jonathanswalton77

I have the same error too, even with an item features dataframe of only 270 rows and two columns, [product_id INT32, retailprice FLOAT64]. I also tried two different columns, [product_id, category_id INT16], and hit the same issue.

I can include sample_weight without problems, but if I add any user or product attributes I get the error: AssertionError: item weights [w_i] are not finite

model2 = RankFM(factors=20, loss='warp', max_samples=100, learning_schedule='invscaling')
model2.fit(user_item_train, 
           item_features=item_attributes_train, 
           #user_features=user_attributes_train,
           sample_weight=sample_weight_train, 
           epochs=25, 
           verbose=True)

type(item_attributes_train)
pandas.core.frame.DataFrame

item_attributes_train.dtypes
PRODUCT_ID        int32
RETAIL_PRICE    float64
dtype: object

item_attributes_train.head(3)
   PRODUCT_ID  RETAIL_PRICE
0       10162          1.75
1       10145          1.00
2      101433          7.95

@jonathanswalton77

I may have resolved the issue I was facing by re-expressing all the numeric data as values scaled between 0 and 1, and by one-hot encoding the category IDs (as you'd expect).
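
For anyone landing here, a minimal sketch of that kind of preprocessing with pandas might look like the following. It is an illustration, not the exact code used: `min_max_scale` is a hypothetical helper, the CATEGORY_ID values are made up, and the ID column is deliberately left untouched.

```python
import pandas as pd

# Prices and IDs match the sample rows above; CATEGORY_ID values are made up.
item_attributes = pd.DataFrame({
    "PRODUCT_ID": [10162, 10145, 101433],
    "RETAIL_PRICE": [1.75, 1.00, 7.95],
    "CATEGORY_ID": [3, 7, 3],
})


def min_max_scale(col: pd.Series) -> pd.Series:
    """Rescale a numeric column to [0, 1]; a constant column maps to 0."""
    span = col.max() - col.min()
    if span == 0:
        return col * 0.0
    return (col - col.min()) / span


# Keep the ID column as-is and scale only the numeric feature.
scaled = item_attributes[["PRODUCT_ID"]].copy()
scaled["RETAIL_PRICE"] = min_max_scale(item_attributes["RETAIL_PRICE"])

# One-hot encode the category ID instead of passing it as a raw integer.
one_hot = pd.get_dummies(
    item_attributes["CATEGORY_ID"], prefix="CATEGORY", dtype=float
)

item_features = pd.concat([scaled, one_hot], axis=1)
print(item_features)
```

The resulting dataframe has one ID column plus features that all lie in [0, 1], which is the shape of input that stopped the "not finite" assertion in my case.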
