Python library for interactive topic model visualization. This is a port of the fabulous R package by Carson Sievert and Kenny Shirley.
pyLDAvis is designed to help users interpret the topics in a topic model that has been fit to a corpus of text data. The package extracts information from a fitted LDA topic model to inform an interactive web-based visualization.
The visualization is intended to be used within an IPython notebook but can also be saved to a stand-alone HTML file for easy sharing.
Note: LDA stands for latent Dirichlet allocation.
- Stable version using pip:
pip install pyldavis
- Development version on GitHub:
Clone the repository and run python setup.py install
The best way to learn how to use pyLDAvis is to see it in action. Check out this notebook for an overview. Refer to the documentation for details.
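As a rough illustration of the workflow described above (extract distributions from a fitted model, prepare the visualization, then save it to a stand-alone HTML file), here is a minimal sketch. It assumes the general-purpose `pyLDAvis.prepare` and `pyLDAvis.save_html` entry points. The helpers `make_toy_inputs` and `export_html` are hypothetical names introduced only for this example, and the randomly generated inputs merely stand in for a real fitted LDA model.

```python
import random


def make_toy_inputs(n_topics=4, n_docs=20, vocab_size=40, seed=0):
    """Build random but well-formed inputs (a stand-in for a fitted LDA model)."""
    rng = random.Random(seed)

    def random_dist(n):
        # A probability vector of length n: positive entries that sum to 1.
        weights = [rng.random() + 1e-3 for _ in range(n)]
        total = sum(weights)
        return [w / total for w in weights]

    return {
        # One row per topic: distribution over the vocabulary.
        "topic_term_dists": [random_dist(vocab_size) for _ in range(n_topics)],
        # One row per document: distribution over topics.
        "doc_topic_dists": [random_dist(n_topics) for _ in range(n_docs)],
        # Token count of each document.
        "doc_lengths": [rng.randint(50, 200) for _ in range(n_docs)],
        # Terms, in the same column order as topic_term_dists.
        "vocab": [f"term{i}" for i in range(vocab_size)],
        # Corpus-wide count of each term.
        "term_frequency": [rng.randint(5, 100) for _ in range(vocab_size)],
    }


def export_html(path="lda_vis.html"):
    """Prepare the toy model for visualization and write a stand-alone HTML file."""
    import pyLDAvis  # imported lazily so the toy-data helper works without it

    prepared = pyLDAvis.prepare(**make_toy_inputs())
    pyLDAvis.save_html(prepared, path)
    return prepared
```

Calling `export_html()` should write `lda_vis.html` to the current directory. Inside a notebook, the same prepared object can instead be rendered inline by calling `pyLDAvis.enable_notebook()` once and then `pyLDAvis.display(prepared)`. With a real model, replace the toy inputs with the topic-term and document-topic distributions taken from your fitted model.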
For a concise explanation of the visualization see this vignette from the LDAvis R package.
Ben Mabey walked through the visualization in this short talk using a Hacker News corpus.
Carson Sievert created a video demoing the R package. The visualization is the same, so it applies equally to pyLDAvis.
To read about the methodology behind pyLDAvis, see the original paper, which was presented at the 2014 ACL Workshop on Interactive Language Learning, Visualization, and Interfaces in Baltimore on June 27, 2014.