
Releases: EFord36/normalise

Improve nltk data dependency handling

05 May 11:18

The main purpose of this release is to improve the handling of nltk data dependencies.

In particular, the expand_EXPN function depends on nltk's 'averaged_perceptron_tagger' and 'universal_tagset', but it had a bare except that caught the LookupError raised when a user didn't have these dependencies installed, and silently did a no-op on strings recognised as abbreviations. This seems to be the cause of #109 and #117.

Now, we explicitly handle and raise the LookupError, so it's clear to the user what's causing the problem and how to fix it.
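
To illustrate the difference, here is a minimal sketch of the two patterns. It is not the actual normalise code: `missing_data_tagger`, `expand_old` and `expand_new` are hypothetical names, and the stub tagger simply simulates nltk raising LookupError when its data is absent.

```python
def missing_data_tagger(tokens):
    """Hypothetical stand-in for an nltk tagger whose data is not installed."""
    raise LookupError("Resource averaged_perceptron_tagger not found.")


def expand_old(word, tagger=missing_data_tagger):
    """Old pattern (assumed shape): a bare except hides the real problem."""
    try:
        tagger([word])
        return word + " (expanded)"  # placeholder for the real expansion
    except:  # noqa: E722 - swallows LookupError along with everything else
        return word  # silent no-op: the abbreviation comes back unchanged


def expand_new(word, tagger=missing_data_tagger):
    """New pattern (assumed shape): surface the LookupError with a hint."""
    try:
        tagger([word])
    except LookupError as err:
        raise LookupError(
            "expand_EXPN needs nltk's 'averaged_perceptron_tagger' and "
            "'universal_tagset'; install them with nltk.download(...)"
        ) from err
    return word + " (expanded)"  # placeholder for the real expansion
```

With the old pattern, `expand_old("approx.")` quietly returns its input; with the new one, the same call fails loudly and names the missing resources.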

More minor changes:

  • Add a section to the README calling out that these dependencies need installing, with commands for doing so
  • Add some missing import statements to code snippets in the README
  • Update author email in setup.py
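
For reference, the two nltk data packages named above can be fetched with `nltk.download`. The helper below is only a sketch: `download_required_resources` and `REQUIRED_NLTK_RESOURCES` are hypothetical names, and the exact commands given in the README may differ.

```python
# The two nltk data packages this release note names as dependencies.
REQUIRED_NLTK_RESOURCES = ("averaged_perceptron_tagger", "universal_tagset")


def download_required_resources(download=None):
    """Download each required resource and report success per resource.

    `download` defaults to nltk.download; it is injectable so the helper
    can be exercised without nltk installed or network access.
    """
    if download is None:
        import nltk  # imported lazily so the module loads without nltk

        download = nltk.download
    return {name: download(name) for name in REQUIRED_NLTK_RESOURCES}
```

Calling `download_required_resources()` once after installing the package should be enough, since nltk keeps downloaded data on disk.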