Commit
feat: update category matching algorithm (#952)
* feat: updating spacy dependency

  We need a more recent version of spaCy for lemmatization

  Also add:
  - spacy-lookups-data for lemmatizer lookup tables
  - cachetools for TTLCache

* feat: add get_lemmatizing_nlp that returns pipeline with lemmatizer

* feat: add new category matching algorithm

* feat: switch from category ES matching to new matching algorithm

* feat: add matcher algorithm to /predict/category endpoint

* doc: document predict-category CLI command

* fix: improve category matching algorithm and APIs after code review

* fix: fix mypy warning on OCR script

* fix: fix mkdocs building issue
1 parent f97cc8c, commit d8a04c7
Showing 33 changed files with 1,229 additions and 1,744 deletions.