LDAExplore

This project uses the Python lda library, which implements Latent Dirichlet Allocation (LDA) using Gibbs sampling.
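
For readers unfamiliar with the technique, the following is a minimal sketch of collapsed Gibbs sampling for LDA, written in plain Python. It is not the lda library's implementation and not this project's code; the function name `gibbs_lda` and the hyperparameter defaults (`alpha`, `eta`, `n_iter`) are assumptions chosen for illustration only.

```python
import random


def gibbs_lda(docs, n_topics, n_iter=200, alpha=0.1, eta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents (illustrative).

    docs: list of documents, each a list of word strings.
    Returns (vocab, topic_word_counts, doc_topic_counts).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables that are updated while sampling.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Give every token a random initial topic.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the current token from all counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # p(z = k | everything else) is proportional to
                # (n_dk + alpha) * (n_kv + eta) / (n_k + n_words * eta).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + eta)
                    / (topic_total[k] + n_words * eta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    return vocab, topic_word, doc_topic
```

Normalising a row of `topic_word` (after adding `eta` to each count) gives an estimate of that topic's word distribution; normalising a row of `doc_topic` (after adding `alpha`) gives a document's topic mixture.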

DOCUMENTATION

INSTALLATION

STEPS:

Install pip.
pip install numpy scipy scikit-learn gensim
pip install nltk (NLTK provides the stopword list, among other things)
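
To confirm that the packages listed above are visible to the interpreter, a small check along these lines may help. It is a sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the module list simply mirrors the pip commands above.

```python
import importlib.util

# pip package names do not always match import names
# (scikit-learn is imported as sklearn).
REQUIRED_MODULES = ["numpy", "scipy", "sklearn", "gensim", "nltk"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

If the project reads NLTK's stopword corpus, a one-time `nltk.download('stopwords')` is usually needed as well, since pip installs the library but not its corpora.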

EXAMPLE

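from processdata import fileops
from processdata.lda import LDAVisualModel

# Read one message from the bundled 20 Newsgroups data.
word_corpus = fileops.read_file('20_newsgroups/alt.atheism/53350')

# Build the model from a list of documents and prepare its word corpus.
lda = LDAVisualModel([word_corpus])
lda.create_word_corpus([word_corpus])

# Train the model (the argument 3 is presumably the number of topics).
lda.train_lda(3)
topics = lda.get_lda_corpus()

print(topics)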

------
RESULT
------

topic: <word distribution>

(SOURCE: From the documentation, so we don't have to go through it again.)
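
A word distribution is easier to read when reduced to its most probable words. The sketch below shows one way to do that. It assumes each topic is a mapping from word to probability; the actual return shape of `get_lda_corpus()` is not documented here, and both the helper `top_words` and the sample values are made up for illustration.

```python
def top_words(word_distribution, n=5):
    """Return the n most probable (word, probability) pairs of one topic."""
    # Sort by descending probability, breaking ties alphabetically.
    ranked = sorted(word_distribution.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical output shape: {topic_id: {word: probability}}.
# The values below are invented and are not real model output.
example_topics = {
    0: {"god": 0.40, "belief": 0.35, "argument": 0.25},
    1: {"space": 0.50, "launch": 0.30, "orbit": 0.20},
}

for topic_id, distribution in sorted(example_topics.items()):
    words = ", ".join(word for word, _ in top_words(distribution, n=2))
    print("topic {}: {}".format(topic_id, words))
```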

DATA

The 20 Newsgroups data set has been added to the repository (in the 20_newsgroups directory).
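
The example path `20_newsgroups/alt.atheism/53350` suggests a layout of one directory per newsgroup with one file per message. Under that assumption, the files could be enumerated roughly as follows. The helper names `iter_newsgroup_files` and `read_message` are hypothetical and are not part of `processdata`.

```python
import os


def iter_newsgroup_files(root="20_newsgroups"):
    """Yield (group_name, file_path) for every message file under root."""
    for group in sorted(os.listdir(root)):
        group_dir = os.path.join(root, group)
        if not os.path.isdir(group_dir):
            continue
        for name in sorted(os.listdir(group_dir)):
            path = os.path.join(group_dir, name)
            if os.path.isfile(path):
                yield group, path


def read_message(path):
    """Read one message as text; latin-1 can decode any byte sequence."""
    with open(path, encoding="latin-1") as handle:
        return handle.read()
```

The resulting strings could then be passed to the model in place of the single `word_corpus` used in the example, assuming the model accepts a list of raw documents as that example implies.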

PAPERS

1. Latent Dirichlet Allocation
   Blei, David M., Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet allocation." Journal of Machine Learning Research 3 (2003): 993-1022.

2. Probabilistic Topic Models
   Blei, David M. "Probabilistic topic models." Communications of the ACM 55.4 (2012): 77-84.

3. Optimizing Temporal Topic Segmentation for Intelligent Text Visualization
   Pan, Shimei, et al. "Optimizing temporal topic segmentation for intelligent text visualization." Proceedings of the 2013 International Conference on Intelligent User Interfaces. ACM, 2013.

4. Active Learning with Constrained Topic Model
   Yang, Yi, et al. "Active Learning with Constrained Topic Model."

5. Visualizing Sets and Set-typed Data: State-of-the-Art and Future Challenges 
   Alsallakh, Bilal, et al. "Visualizing Sets and Set-typed Data: State-of-the-Art and Future Challenges (Supplementary Material)."
