KDD2020Multimodalities

KDD Cup 2020 Challenges for Modern E-Commerce Platform: Multimodalities Recall

Baseline Score: 0.5538 (using a quarter of the data) | 0.599 (using half of the data)

Detailed write-up of the approach: https://zhuanlan.zhihu.com/p/135984016

https://fasttext.cc/docs/en/english-vectors.html crawl-300d-2M.vec.zip

https://nlp.stanford.edu/projects/glove/ glove.840B.300d.zip

https://github.com/google-research/bert BERT-Base, Uncased

All data is stored under the data directory.

Run order

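1. preprocess.py (reads only 10,000 rows of data by default; to read more, carefully adjust the arguments passed to pd.read_csv. You can ignore this caution if your machine has plenty of memory.)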

2. [image-concat-query]-wwm_uncased_L12-768_v3_quart.ipynb
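
The row limit mentioned in step 1 can be illustrated with a minimal sketch. It is not the code in preprocess.py: the helper names `read_limited` and `read_in_chunks`, the column names, and the in-memory sample are all hypothetical, and the tab separator is an assumption. It shows only the two standard `pd.read_csv` options for bounding memory, `nrows` and `chunksize`.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample standing in for a file under data/.
# The column names are illustrative, not the competition schema.
SAMPLE_TSV = "query_id\tquery\n" + "".join(
    "{}\tquery text {}\n".format(i, i) for i in range(100)
)


def read_limited(source, nrows=None, sep="\t"):
    """Read at most `nrows` rows; nrows=None reads the whole file."""
    return pd.read_csv(source, sep=sep, nrows=nrows)


def read_in_chunks(source, chunksize, sep="\t"):
    """Yield DataFrames of up to `chunksize` rows to bound peak memory."""
    for chunk in pd.read_csv(source, sep=sep, chunksize=chunksize):
        yield chunk


# Small subset, analogous to the limited read described in step 1.
small = read_limited(io.StringIO(SAMPLE_TSV), nrows=10)

# Full pass over the data without holding all of it in memory at once.
total_rows = sum(
    len(chunk) for chunk in read_in_chunks(io.StringIO(SAMPLE_TSV), chunksize=30)
)
```

Raising `nrows` trades memory for coverage, which matches the quarter-data and half-data baseline scores above. On a machine with limited memory, processing chunks one at a time may be a safer way to cover more of the data.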

Environment and dependencies

python 3.7

tensorflow==1.13.1

Keras==2.2.4

keras-bert==0.78.0