This repository has been archived by the owner on Nov 2, 2023. It is now read-only.

WeKeyPedia python toolkit

installation

using virtualenv

The PyPI distribution is updated on important releases. During the development phase, this happens approximately every week.

$ mkdir e
$ virtualenv e/py
$ source e/py/bin/activate
(py)$ pip install wekeypedia
(py)$ python -m nltk.downloader punkt wordnet maxent_treebank_pos_tagger

using development version

If you need the very latest changes, you can install the GitHub master version. It is highly unstable: you get work-in-progress features, their bugs, and their bugfixes as they land.

$ mkdir e
$ virtualenv e/py
$ source e/py/bin/activate
(py)$ pip install https://github.com/wekeypedia/toolkit-python/archive/master.zip
(py)$ python -m nltk.downloader punkt wordnet maxent_treebank_pos_tagger

usage

get the current content of a page

import wekeypedia

p = wekeypedia.WikipediaPage("Pi")
content = p.get_revision()

print(content)
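
For readers curious about what fetching the current revision of a page involves, here is a minimal sketch that queries the public MediaWiki API directly. It is independent of the toolkit and is not a description of how `get_revision()` is implemented; the helper names `build_revision_url` and `fetch_current_content`, the English Wikipedia endpoint, and the User-Agent string are all assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia; other wikis expose the same API path.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_revision_url(title):
    """Build a MediaWiki API query URL for the latest revision of `title`."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_current_content(title):
    """Download the wikitext of the latest revision (performs a network call)."""
    request = urllib.request.Request(
        build_revision_url(title),
        # Hypothetical identifier; Wikimedia asks clients to send a descriptive one.
        headers={"User-Agent": "readme-sketch/0.1 (illustration only)"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.loads(response.read().decode("utf-8"))
    # With formatversion=2, "pages" is a list; a missing page has no "revisions".
    page = data["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]


url = build_revision_url("Pi")
print(url)
```

Calling `fetch_current_content("Pi")` would return the raw wikitext of the page, assuming network access and that the page exists.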

parse diff result

diff = p.get_diff()
plusminus = p.extract_plusminus(diff)

p.print_plusminus_overview(plusminus)
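
To give an idea of what separating a diff into additions and deletions can look like, here is a small sketch built on the standard library's `difflib`. It is an illustration only: the function name `plusminus_sketch` and the `added`/`removed` keys are hypothetical, and the toolkit's `extract_plusminus` may use a different method and return a different structure.

```python
import difflib


def plusminus_sketch(old_text, new_text):
    """Return the words added and removed between two revisions.

    Hypothetical helper for illustration; not part of wekeypedia.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    added, removed = [], []
    # ndiff prefixes each item with "+ " (only in new), "- " (only in old),
    # "  " (unchanged) or "? " (hint line, ignored here).
    for token in difflib.ndiff(old_words, new_words):
        if token.startswith("+ "):
            added.append(token[2:])
        elif token.startswith("- "):
            removed.append(token[2:])
    return {"added": added, "removed": removed}


result = plusminus_sketch("Pi is a number", "Pi is a mathematical constant")
print(result["added"])
print(result["removed"])
```

Comparing word by word rather than line by line keeps small edits inside a long paragraph visible, which is usually what matters when studying how an article evolves.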

count stems of a page

print(p.count_stems([content]))
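
The idea behind stem counting can be sketched with the standard library alone. The example below assumes a deliberately crude suffix-stripping rule in place of a real stemmer, so it only approximates what a proper stemmer (such as those shipped with NLTK) would produce; `crude_stem`, `count_stems_sketch`, and the suffix list are hypothetical names, not the toolkit's API.

```python
import re
from collections import Counter

# Hypothetical, deliberately crude suffix list; a real stemmer applies
# many more rules than this.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def count_stems_sketch(texts):
    """Count stem occurrences across a list of texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep runs of ASCII letters only.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(crude_stem(word) for word in words)
    return counts


counts = count_stems_sketch(["Digits and digit", "computing computed computes"])
print(counts.most_common())
```

Like `count_stems` in the usage above, the sketch takes a list of texts, so several revisions or pages can be counted together.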

examples and macros

You can explore current uses of the library by looking at the examples and macros we use to build various datasets.

using virtualenv

$ virtualenv e/py --no-site-packages
$ source e/py/bin/activate
(py)$ pip install -r requirements.txt