
An attempt at topic modelling on papers from the awesome-deep-learning-papers repo


tim1234ltp/understanding-awesome-deep-learning-papers


NLP on deep-learning-papers

This is an attempt at topic modelling on the top 100 papers listed in the GitHub repo awesome-deep-learning-papers.

The repo lists 100 papers, but while crawling them with a script, access to one of them (Human-level control through deep reinforcement learning) was blocked, so the corpus covers the remaining 99.

Next, a script and pdftotext are used to convert the PDFs to plain text.
In find_topics.py, we concatenate all the plain-text files into papers.txt, which is about 4 MB in size. This means the data contains roughly 4,000,000 characters, assuming mostly single-byte text.
The gensim library is used because it is tailored for topic modelling. The findings are visualized with the pyLDAvis library and stored as .html.
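The PDF-to-text step could look roughly like the sketch below. It is not the repo's actual script: the directory names `pdfs/` and `texts/` and the helper names are assumptions for illustration, and it expects the `pdftotext` command-line tool (from poppler-utils or Xpdf) to be on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def build_pdftotext_cmd(pdf_path: Path, txt_path: Path) -> List[str]:
    """Build the pdftotext command line for one PDF (helper name is hypothetical)."""
    # -enc UTF-8 asks pdftotext to write its output as UTF-8.
    return ["pdftotext", "-enc", "UTF-8", str(pdf_path), str(txt_path)]


def convert_all(pdf_dir: Path, txt_dir: Path) -> List[Path]:
    """Convert every *.pdf in pdf_dir to a .txt file in txt_dir."""
    txt_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        txt_path = txt_dir / (pdf_path.stem + ".txt")
        # check=False: a single unreadable PDF should not abort the whole run.
        result = subprocess.run(build_pdftotext_cmd(pdf_path, txt_path), check=False)
        if result.returncode == 0:
            converted.append(txt_path)
    return converted


# Example usage (assumed directory layout):
# convert_all(Path("pdfs"), Path("texts"))
```

Papers that fail to convert are skipped and left out of the returned list, so the list can be compared against the number of downloaded PDFs.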
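As a rough illustration of the concatenation step, the sketch below merges every `.txt` file in a directory into one output file and returns the character count. The function name and the choice to separate papers with a newline are assumptions; find_topics.py may do this differently.

```python
from pathlib import Path


def concatenate_texts(txt_dir: Path, out_path: Path) -> int:
    """Write all *.txt files in txt_dir into out_path; return characters written."""
    total_chars = 0
    with out_path.open("w", encoding="utf-8") as out:
        for txt_file in sorted(txt_dir.glob("*.txt")):
            # Skip the output file itself if it lives in the same directory.
            if txt_file.resolve() == out_path.resolve():
                continue
            text = txt_file.read_text(encoding="utf-8", errors="replace")
            out.write(text)
            out.write("\n")  # one newline between papers (an assumption)
            total_chars += len(text) + 1
    return total_chars


# Example usage (assumed paths):
# n = concatenate_texts(Path("texts"), Path("papers.txt"))
# print(f"papers.txt holds about {n:,} characters")
```

The returned count is in characters, not bytes, so it can differ slightly from the file size on disk when the text contains multi-byte UTF-8 characters.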
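A minimal sketch of the modelling and visualization steps follows. It assumes each paper is treated as one document, uses a deliberately tiny stop-word list and a simple regex tokenizer, and picks `num_topics=10`; none of these choices come from the repo. It also assumes a pyLDAvis version that ships the `pyLDAvis.gensim_models` module (older releases named it `pyLDAvis.gensim`).

```python
import re
from typing import List

# Tiny illustrative stop-word list; a real run would use a fuller one.
STOPWORDS = {"the", "and", "for", "that", "with", "this", "are", "from"}


def tokenize(text: str) -> List[str]:
    """Lowercase, keep alphabetic words of 3+ letters, drop stop words."""
    words = re.findall(r"[a-z]{3,}", text.lower())
    return [w for w in words if w not in STOPWORDS]


def fit_lda(docs: List[str], num_topics: int = 10):
    """Fit an LDA model with gensim; returns (model, corpus, dictionary)."""
    # Imported here so tokenize() can be used without gensim installed.
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    tokenized = [tokenize(doc) for doc in docs]
    dictionary = Dictionary(tokenized)
    # Drop words seen in fewer than 2 papers or in more than 90% of them.
    dictionary.filter_extremes(no_below=2, no_above=0.9)
    corpus = [dictionary.doc2bow(tokens) for tokens in tokenized]
    lda = LdaModel(corpus=corpus, id2word=dictionary,
                   num_topics=num_topics, passes=10, random_state=0)
    return lda, corpus, dictionary


def save_visualization(lda, corpus, dictionary, html_path: str = "topics.html") -> None:
    """Render the fitted model with pyLDAvis and store it as an HTML file."""
    import pyLDAvis
    import pyLDAvis.gensim_models as gensimvis

    vis = gensimvis.prepare(lda, corpus, dictionary)
    pyLDAvis.save_html(vis, html_path)


# Example usage (docs would hold one string per paper):
# lda, corpus, dictionary = fit_lda(docs, num_topics=10)
# save_visualization(lda, corpus, dictionary, "topics.html")
```

Fixing `random_state` keeps the topics reproducible between runs, which makes it easier to compare the resulting HTML visualizations when tuning the number of topics.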
