This module allows users to extract various parts of a Wikipedia article's revision history.

atlijas/wikipedia-revision-miner

Wikipedia Revision Miner

WikipediaRevisionMiner (WRM) is a toolkit for mining the revision history of Wikipedia articles. To use WRM with languages other than Icelandic, additional regular expressions must be added to rev_regex.py. Apart from that, it is language-independent.

Basic usage

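from revision_history import RevisionHistory

# Wikipedia language code ('is' is Icelandic) and the article title to mine.
language = 'is'
RH = RevisionHistory(language, 'Knattspyrna')

# Print every sentence pair that make_pairs() yields from the revision history.
for sentence_pair in RH.make_pairs():
    print(sentence_pair)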
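
For readers curious about where revision data of this kind comes from, the following is a minimal sketch of requesting an article's revisions through the public MediaWiki Action API. It is not taken from WRM and may differ from how WRM retrieves revisions. The helper names `revisions_url` and `revision_texts` are hypothetical, and the sketch assumes the `formatversion=2` JSON layout.

```python
from urllib.parse import urlencode


def revisions_url(language: str, title: str, limit: int = 5) -> str:
    """Build a MediaWiki API URL that requests a page's most recent revisions.

    Hypothetical helper for illustration; not part of WRM.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|content",  # revision id, time, and wikitext
        "rvslots": "main",                  # the main content slot
        "rvlimit": limit,
        "format": "json",
        "formatversion": 2,
    }
    return f"https://{language}.wikipedia.org/w/api.php?{urlencode(params)}"


def revision_texts(response: dict) -> list[tuple[int, str]]:
    """Return (revision id, wikitext) tuples from a formatversion=2 response.

    Hypothetical helper for illustration; not part of WRM.
    """
    texts = []
    for page in response.get("query", {}).get("pages", []):
        for rev in page.get("revisions", []):
            texts.append((rev["revid"], rev["slots"]["main"]["content"]))
    return texts
```

The URL can be fetched with any HTTP client and the decoded JSON passed to `revision_texts`. Building the URL and parsing the response are kept separate so that the parsing step can be tried on a saved response without network access.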
