A small question about the model #7

Open
moluxiaobei opened this issue Jan 19, 2018 · 2 comments

@moluxiaobei

Hi, I recently ran an experiment with your approach on user-sequence prediction: the user's previously clicked items serve as q, and whether a subsequent item is clicked serves as d. The experiment reaches an AUC of 0.6, while training AUC reaches 0.7, so there may be some overfitting. The item vocabulary is about 100 million, with roughly 200+ million parameters; it does beat a plain item-CTR baseline. A few questions:

1. How does the overfitting arise? It looks mildly overfit.
2. After the embedding, kernel pooling acts as the feature extractor; is the gain in this experiment mainly due to that layer, or something else? It seems the model can roughly learn from the user's earlier click sequence what they will want at the next step.
3. On some samples the model performs poorly: the scores are barely discriminative, and for some sequences all the different items get the same relevance score. What could cause this?

Looking forward to your reply, and thanks in advance!
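(For reference, here is a minimal NumPy sketch of the kernel-pooling layer under discussion, in the style of the K-NRM paper; the embeddings, kernel means, and widths below are illustrative placeholders, not values from this repo or the experiment above.)

```python
import numpy as np

def kernel_pooling(q_emb, d_emb, mus, sigmas):
    """K-NRM-style kernel pooling over a q/d translation matrix."""
    # Cosine-similarity translation matrix between every (q_i, d_j) pair.
    q_norm = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    d_norm = d_emb / np.linalg.norm(d_emb, axis=1, keepdims=True)
    M = q_norm @ d_norm.T                                         # (n_q, n_d)

    feats = []
    for mu, sigma in zip(mus, sigmas):
        # Each RBF kernel soft-counts how many similarities fall near mu.
        k = np.exp(-(M - mu) ** 2 / (2 * sigma ** 2)).sum(axis=1)  # (n_q,)
        feats.append(np.log(np.maximum(k, 1e-10)).sum())           # log-sum over q
    return np.array(feats)                                         # one feature per kernel

# Illustrative kernel placement: one exact-match kernel plus evenly spaced
# soft-match kernels, following the paper's setup; values are examples only.
mus = [1.0, 0.9, 0.7, 0.5, 0.3, 0.1, -0.1, -0.3, -0.5, -0.7, -0.9]
sigmas = [1e-3] + [0.1] * 10

rng = np.random.default_rng(0)
phi = kernel_pooling(rng.normal(size=(5, 64)),   # 5 clicked items (hypothetical)
                     rng.normal(size=(3, 64)),   # 3 candidate-item terms
                     mus, sigmas)
```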

@AdeDZY (Owner) commented Jan 19, 2018

  1. Consider early stopping or adding dropout. With 100M items it does indeed seem easy to overfit. Have you tried filtering some items out? In text search we remove low-frequency words, which makes the word embedding layer much smaller. (A minimal filtering sketch follows after this list.)
  2. I don't quite understand exactly how your method works. Could you give an example of what q and d look like? For instance, q: 'clicked_item1', d: 'new_item1', or q: 'clicked_item1, clicked_item2, clicked_item3', d: 'new_item1'…?
  3. One possible cause is that the final tanh activation squashes all scores into the range -1 to 1; try removing the tanh (see the scoring-head sketch after this list). Another is that those items don't appear often enough in the training data, so their embeddings were never adequately learned.
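On point 1, a minimal sketch of the kind of low-frequency filtering suggested above, assuming click sequences are lists of item-id strings; `min_count=20` is an arbitrary example threshold, not a recommendation from the thread:

```python
from collections import Counter

def build_vocab(click_sequences, min_count=20):
    """Keep items seen at least min_count times; map the rest to <UNK>.

    Shrinking the vocabulary directly shrinks the embedding table,
    which accounts for most of the ~200M parameters mentioned above.
    """
    counts = Counter(item for seq in click_sequences for item in seq)
    vocab = {"<UNK>": 0}
    for item, c in counts.items():
        if c >= min_count:
            vocab[item] = len(vocab)
    return vocab

def encode(seq, vocab):
    # Unseen / filtered items all share the <UNK> id.
    return [vocab.get(item, vocab["<UNK>"]) for item in seq]
```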
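On point 3, a sketch of the final scoring layer, showing where the tanh squashes scores into (-1, 1) and what removing it looks like; the weights here are random placeholders, not a trained model's:

```python
import numpy as np

rng = np.random.default_rng(1)
w, b = rng.normal(size=11), 0.0   # placeholder weights, one per kernel feature

def score(phi, use_tanh=True):
    """Final ranking layer over the kernel-pooled feature vector phi."""
    s = w @ phi + b
    # tanh bounds every score in (-1, 1); once |s| is large the output
    # saturates and different items collapse to nearly identical scores.
    # Dropping the tanh returns the raw linear score instead.
    return np.tanh(s) if use_tanh else s
```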

@moluxiaobei

This comment has been minimized.
