
Question about HFT algorithm #247

Closed
ChenSuL opened this issue Mar 27, 2018 · 4 comments

Comments

@ChenSuL
Contributor

ChenSuL commented Mar 27, 2018

I find that the performance of this algorithm is not good. According to the paper, the HFT model adds a regularization term on top of the BiasedMF model. So shouldn't its performance be better than BiasedMF's?
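For context, the objective in the HFT paper (McAuley & Leskovec, "Hidden Factors and Hidden Topics", 2013) couples the BiasedMF squared rating error with a review-corpus likelihood acting as the regularizer. A sketch of that objective (from the paper, not necessarily this repository's exact implementation):

```latex
\min_{\Theta,\,\Phi,\,\kappa,\,z}\;
\sum_{(u,i)} \Bigl( r_{u,i} - \underbrace{\bigl(\alpha + \beta_u + \beta_i + \gamma_u^{\top}\gamma_i\bigr)}_{\text{BiasedMF prediction}} \Bigr)^{2}
\;-\; \mu\,\ell\bigl(\mathcal{T} \mid \theta, \phi, z\bigr),
\qquad
\theta_{u,k} = \frac{\exp(\kappa\,\gamma_{u,k})}{\sum_{k'} \exp(\kappa\,\gamma_{u,k'})}
```

where the second term is the log-likelihood of the review text under an LDA-style topic model whose user topic distribution is tied to the latent factors, so the text only constrains the factors and need not improve rating accuracy by itself.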

In addition, the execution time of this algorithm is very long, and it cannot scale to large data sets. Could you implement the L-BFGS training algorithm?

Thank you!

@ChenSuL ChenSuL changed the title Question about the implementation of BPR Question about HFT algorithm Mar 27, 2018
@SunYatong
Collaborator

Hi @ChenSuL , thanks for your feedback.

  • I tested your code and found that "trainMatrix.get(u, j) != 0" is more efficient than "itemSet.contains(negItemIdx)".
  • I also found that caching each user's item list is more efficient than repeatedly reading their SparseVectors.

Here is my code:

        // cache each user's item list
        List<List<Integer>> userItemsList = new ArrayList<>();
        for (int userIdx = 0; userIdx < numUsers; ++userIdx) {
            userItemsList.add(new ArrayList<>(trainMatrix.getColumns(userIdx)));
        }

        for (int iter = 1; iter <= numIterations; iter++) {
            loss = 0.0d;
            for (int sampleCount = 0; sampleCount < numUsers * 100; sampleCount++) {
                // randomly draw (userIdx, posItemIdx, negItemIdx)
                int userIdx, posItemIdx, negItemIdx;

                while (true) {
                    userIdx = Randoms.uniform(numUsers);

                    List<Integer> itemList = userItemsList.get(userIdx);
                    if (itemList.size() == 0 || itemList.size() == numItems)
                        continue;

                    posItemIdx = itemList.get(Randoms.uniform(itemList.size()));
                    do {
                        negItemIdx = Randoms.uniform(numItems);
                    } while (trainMatrix.get(userIdx, negItemIdx)!=0);

                    break;
                }

                // ... BPR gradient update for (userIdx, posItemIdx, negItemIdx) omitted ...
            }
        }
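To make the membership-test point concrete, here is a self-contained sketch (not LibRec code; the class, the `sampleNegative` helper, and the sizes are made up for illustration) of rejection-sampling a negative item against a dense boolean membership array, which plays the same role as "trainMatrix.get(u, j) != 0" above: an O(1) array read instead of a hash lookup per candidate.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class NegativeSampling {

    // Draw an item index the user has NOT interacted with, using an O(1)
    // boolean-array membership test instead of a HashSet lookup.
    static int sampleNegative(boolean[] seen, Random rng) {
        int negItemIdx;
        do {
            negItemIdx = rng.nextInt(seen.length);
        } while (seen[negItemIdx]);
        return negItemIdx;
    }

    public static void main(String[] args) {
        int numItems = 1000;
        Random rng = new Random(42);

        // Hypothetical user history: items 0..99 observed.
        boolean[] seen = new boolean[numItems];
        Set<Integer> itemSet = new HashSet<>();
        for (int i = 0; i < 100; i++) {
            seen[i] = true;
            itemSet.add(i);
        }

        // Every sampled negative must lie outside the user's item set.
        for (int t = 0; t < 10_000; t++) {
            int neg = sampleNegative(seen, rng);
            if (itemSet.contains(neg)) {
                throw new AssertionError("sampled a positive item: " + neg);
            }
        }
        System.out.println("OK: all sampled negatives are unobserved");
    }
}
```

The rejection loop terminates quickly whenever the user has rated far fewer items than `numItems`, which is the typical sparse-data case the snippet above also assumes.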

@SunYatong
Collaborator

SunYatong commented Mar 27, 2018

I have tested the performance of BiasedMF and HFT (using its authors' code), and the improvement from HFT is very limited; in some cases it is even worse.

L-BFGS training might be implemented in the future, but we haven't made a specific plan for it yet.

BTW, does my BPR implementation above run faster than your version on your machine?

@ChenSuL
Contributor Author

ChenSuL commented Mar 27, 2018

Thank you for your kind reply @SunYatong

There's no problem with your BPR implementation. Sorry, I raised the question about BPR by mistake; I forgot that the version I was using is 1.3. The problem is solved in version 2.0.

@SunYatong
Collaborator

Thank you all the same. It is still a valuable enhancement, since I found that "trainMatrix.get(u, j) != 0" is more efficient than "itemSet.contains(negItemIdx)".
