- Add the :!OR! operator - now you can specify multiple steps for a stage.
- Wrote relevant tests for this operator.
- Fix a bug where a status was get-valued whenever there was a get-valued multi above it, even when that was not intended.
- Fix a parser bug that raised an error when a new step followed a multi-capturing status clause in a rule: when there is a step (keyword) after a :multi, don't match, just set current-step and move on.
- Add tests for a step after a :multi.
- Add French and English common names dictionary look-up - we patch final Viterbi tagger results with this dictionary to ensure these names are detected.
- Add patching samples to the tests file.
- Use tries for efficient dictionary look-up (this may change).
- Moved models out of resources into a dedicated folder so they are not packaged in the jar, which keeps it light in ClojureScript (where you'd use the namespaces).
- Changed var names in the model so they can easily be used with :refer.
- edn... etc. in tools.cljc are now available in clj only (use :refer).
- Use :refer in core_tests.cljc.
- arg-max was no longer referred to in tagger.cljc; switched to arg-max-m.
- Two cljs friendly models as cljc namespaces: en_fn_model.cljc and fr_tb_model.cljc.
- core_test is now a cljc file.
- Annotated English corpus based on the FrameNet project.
- In tagger.cljc: removed an irrelevant destructuring.
- Detailed workflow and references in Readme.
- :optional-steps are now specified in the rules; they are no longer passed to the parser functions.
- Fixed the Viterbi implementation. We don't need T2, since we work with Clojure maps and already have the associated state. The implementation is simpler and, most of all, it works!
- Removed README warning.
- Wrote tests for the actual HMM model based on the Free French Treebank. They pass.
- Annotated French corpus based on the Sequoia Corpus from INRIA.
- Annotated French corpus based on the Free French Treebank.
- Changed signature of the viterbi fn.
- First commit: trainer, tagger, parser, rules and tools namespaces.
- Have a sample usage in tests.
- Readme, Changelog, Code of Conduct.