StarSpace selection of positive/negative example, and usage for multiple types of items #276

nirlotan · 2019-10-08T09:26:43Z

Hi,

As part of a collaborative filtering research project carried out by a group of researchers at Haifa University, we have been trying to use your StarSpace framework to benchmark CF results for recommendations of different types of items, and we refer to your paper on this topic.

We have been using StarSpace with training mode 1 and have a couple of questions. We would highly appreciate your help in understanding the following.

1. What method do you use to generate positive and negative examples from an input file? I added some traces to the code and can see that you select the examples randomly, but I cannot detect a pattern.
   a. For example, given an input line with items A1, A2, A3, A4, A5, would you compare each item with the rest of the items in the line? For instance, would A1 be paired with each of the remaining items ({A1,A2}, {A1,A3}, {A1,A4}, {A1,A5})? From what I see in the code this is not necessarily the case, and you randomly select pairs in each epoch. Is that correct?
   b. How do you select the negative examples? Do you randomly select them from all items that are not in the input line? Again, my traces show a random selection, but the dictionary you select from is not clear to me. Also, is it possible for a negative example to be selected from the items in the same line (which shouldn't be the case)?
2. Next, we would like to train the model on different types of items and infer on only one of those types. It wasn't clear to me from the documentation whether it is enough to use a different prefix for each type, or whether we need to do anything else. For example, is it enough to provide the items in this format: A1, A2, A3 ..., B1, B2, B3.... C1, C2, C3... to designate three types of items (type A, type B, type C), and then try to infer on items of type A alone? I'm asking because when I did so, I got much lower accuracy rates, which didn't make sense to me. Should we continue to use training mode 1 for this case, or should we switch to a different training mode?

If you have read this far, thank you for reading this long message and for your willingness to support college researchers. We are looking forward to using your framework and referring to it in our research. Once the work is completed, I will also be happy to contribute the wrapper framework that we created to run multiple StarSpace experiments from Python.

Thanks again!
Nir.

baiduzhaozhuo · 2019-12-30T05:10:01Z

The first question can be answered from the source code and the paper. Suppose there are three samples, [(A1, A2, A3, A4, A5), (B1, B2, B3, B4, B5), (C1, C2, C3, C4)], which can be thought of as user click sequences. Each sample is split into two parts, the RHS (right-hand side) and the LHS (left-hand side). Within a sample, the RHS can be regarded as the label: it is one item selected at random from the sample, and the remaining items form the LHS. So the three samples might become [(A1, A2, A4, A5): A3, (B1, B3, B4, B5): B2, (C2, C3, C4): C1]. Next, sum(LHS) is taken as 'a' and the RHS as 'b+', so (a, b+) is a positive sample pair. Each of the k negatives 'b-' is selected at random from the total set of RHS items, so each (a, b-) is a negative sample pair. Finally, the loss function is computed as L(sim(a, b+), sim(a, b-), ...). Hope this helps.
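
The procedure described above can be sketched in Python roughly as follows. This is only an illustration of the description in this comment, not StarSpace's actual code. The function names (`split_example`, `sample_negatives`, `embed_sum`, `margin_ranking_loss`), the random toy embeddings, the dot-product similarity, the margin value, and the choice to exclude only the positive item from the negatives are all assumptions made for the example.

```python
import random


def split_example(items, rng):
    """Pick one item at random as the RHS (label); the rest become the LHS."""
    idx = rng.randrange(len(items))
    return items[:idx] + items[idx + 1:], items[idx]


def sample_negatives(candidates, positive, k, rng):
    """Draw k negatives at random from the candidate items.

    Assumption: only the positive item itself is excluded. Whether StarSpace
    filters out anything else (e.g. the LHS items) is not covered here.
    """
    pool = [c for c in candidates if c != positive]
    return [rng.choice(pool) for _ in range(k)]


def embed_sum(items, emb):
    """Represent a bag of items as the element-wise sum of their vectors."""
    return [sum(column) for column in zip(*(emb[i] for i in items))]


def dot(u, v):
    """Dot-product similarity (an assumed choice of sim)."""
    return sum(x * y for x, y in zip(u, v))


def margin_ranking_loss(a, b_pos, b_negs, margin=0.05):
    """Hinge loss: each negative should score at least `margin` below the positive."""
    pos = dot(a, b_pos)
    return sum(max(0.0, margin - pos + dot(a, b_neg)) for b_neg in b_negs)


rng = random.Random(0)
samples = [
    ["A1", "A2", "A3", "A4", "A5"],
    ["B1", "B2", "B3", "B4", "B5"],
    ["C1", "C2", "C3", "C4"],
]
# Candidate set for negatives: here, every item seen in any sample (assumption).
vocab = [item for sample in samples for item in sample]
# Toy 4-dimensional embeddings, randomly initialised (hypothetical values).
emb = {item: [rng.uniform(-1, 1) for _ in range(4)] for item in vocab}

for items in samples:
    lhs, rhs = split_example(items, rng)                    # (LHS): RHS
    negatives = sample_negatives(vocab, rhs, k=3, rng=rng)  # k random b-
    a = embed_sum(lhs, emb)                                 # a = sum(LHS)
    loss = margin_ranking_loss(a, emb[rhs], [emb[n] for n in negatives])
    print(lhs, "->", rhs, "| negatives:", negatives, "| loss:", round(loss, 3))
```

Because the split and the negatives are redrawn on every call, running the loop again with a different seed produces different (a, b+) and (a, b-) pairs. That would match the observation in the question that the pairs look random from one epoch to the next.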
