A little word cloud generator in Python. Read more about it on the blog post or the website. The code is written for Python 2 but is also compatible with Python 3.
If you are using conda, you can install from the conda-forge channel:
conda install -c conda-forge wordcloud
If you don't use conda, you can install via pip, but this requires a C compiler to be set up:
pip install wordcloud
For a manual install, get this package:
wget https://github.com/amueller/word_cloud/archive/master.zip
unzip master.zip
rm master.zip
cd word_cloud-master
Install the package:
python setup.py install
wordcloud depends on numpy>=1.5.1, pillow and matplotlib. To install it via pip, you will also need a C compiler.
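A quick way to see whether these dependencies are already available is to ask Python whether each one can be imported. The following is a minimal sketch, assuming Python 3.4 or later; the helper name `missing_modules` is made up for illustration and is not part of wordcloud, and the check does not verify the numpy version.

```python
import importlib.util

# Import names of the packages that wordcloud depends on.
# Note: the pillow distribution is imported under the name "PIL".
REQUIRED_MODULES = ["numpy", "PIL", "matplotlib"]


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All wordcloud dependencies are importable.")
```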
If you're having trouble with pip installation on Windows, you can find a .whl file at:
http://www.lfd.uci.edu/~gohlke/pythonlibs/#wordcloud
If installation of the package fails because of a missing pyconfig.h file, you need to install the python-dev package.
For Python 2.*
sudo apt-get install python-dev
For Python 3.*
sudo apt-get install python3-dev
If compilation of the package with gcc fails because of a missing Python.h file, you need to install the python-devel package.
For Python 2.*
sudo yum install -y python-devel
For Python 3.*
sudo yum install -y python34-devel
Check out examples/simple.py for a short intro. A sample output is:
Or run examples/masked.py to see more options. A sample output is:
Getting fancy with some colors:
The wordcloud_cli.py tool can be used to generate word clouds directly from the command line:
$ wordcloud_cli.py --text mytext.txt --imagefile wordcloud.png
If you're dealing with PDF files, then pdftotext, included by default with many Linux distributions, comes in handy:
$ pdftotext mydocument.pdf - | wordcloud_cli.py --imagefile wordcloud.png
In the previous example, the - argument tells pdftotext to write the resulting text to stdout, which is then piped to the stdin of wordcloud_cli.py.
Use wordcloud_cli.py --help to see all available options.
The wordcloud library is MIT licensed, but contains DroidSansMono.ttf, a TrueType font by Google that is Apache licensed.
The font is by no means integral, and any other font can be used by setting the font_path variable when creating a WordCloud object.