Release v2.1.4: Training improvements and bug fixes · explosion/spaCy

✨ New features and improvements

NEW: util.filter_spans helper to filter duplicates and overlaps from a list of Span objects.
Improve language data for Thai, Japanese, Indonesian and Dutch.
Add --n-save-every to spacy pretrain and rename --nr-iter to --n-iter for consistency.
Add --return-scores flag to spacy evaluate to return a dict.
Add --n-early-stopping option to spacy train to define the maximum number of iterations without dev accuracy improvements.
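
To illustrate what filtering duplicates and overlaps means for the new util.filter_spans helper, here is a minimal pure-Python sketch. It is not the library's implementation: spans are modeled as hypothetical (start, end) token-offset tuples instead of Span objects, the name filter_spans_sketch is made up for this example, and the rule that the longest span wins (with ties going to the earlier span) is an assumption about how conflicts are resolved.

```python
from typing import List, Tuple

# Hypothetical stand-in for a Span: (start, end) token offsets, end exclusive.
SpanOffsets = Tuple[int, int]


def filter_spans_sketch(spans: List[SpanOffsets]) -> List[SpanOffsets]:
    """Drop duplicate and overlapping spans, keeping longer spans first.

    Assumption: when spans conflict, the longest one is kept, and ties
    are broken in favor of the span that starts earlier.
    """
    # Longest spans first; among equal lengths, earlier start first.
    ordered = sorted(spans, key=lambda s: (s[1] - s[0], -s[0]), reverse=True)
    kept: List[SpanOffsets] = []
    seen_tokens = set()
    for start, end in ordered:
        # Keep the span only if none of its tokens is already covered.
        if all(i not in seen_tokens for i in range(start, end)):
            kept.append((start, end))
            seen_tokens.update(range(start, end))
    # Return the surviving spans in document order.
    return sorted(kept)


spans = [(0, 2), (0, 2), (1, 4), (3, 4), (5, 6)]
filtered = filter_spans_sketch(spans)
print(filtered)
```

With spaCy installed, the equivalent call would presumably be spacy.util.filter_spans(list_of_spans) on real Span objects, for example before merging spans with the retokenizer, which does not accept overlapping spans.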

🔴 Bug fixes

Fix issue #3307: Fix symlink creation to show error on Windows.
Fix issue #3473: Fix GPU training for text classification.
Fix issue #3475: Change favicon.
Fix issue #3482: Add Estonian base support to documentation.
Fix issue #3484: Ensure lemmatization is always consistent between sessions.
Fix issue #3521: Add variations of contractions to English stop words.
Fix issue #3523: Make spacy convert correctly default to json.
Fix issues #3525, #3551, #3572: Fix problem that caused lemmas not to be lowercased.
Fix issue #3531: Don't make "settings" or "title" required in displaCy data.
Fix issue #3533: Remove non-existent example from docs.
Fix issue #3546: Make sure path in GoldParse.__del__ is a string.
Fix issue #3549: Ensure match pattern error isn't raised on empty errors list.
Fix issue #3561: Fix DependencyParser.predict docs.
Fix issue #3598: Allow jupyter=False to override Jupyter mode in displacy.
Fix issue #3620: Fix bug in .iob converter.
Fix issue #3628: Relax jsonschema pin.
Fix issue #3667: Fix offset bug in loading pre-trained word2vec.
Fix issue #3679: Update glossary to include missing labels in spacy.explain.
Fix issue #3680: Re-add missing universe README.
Fix issue #3681: Rewrite information extraction example to use Doc.retokenize.
Fix issue #3692: Fix return value in Language.update docs.
Fix issue #3694: Make "text" in spacy pretrain optional when "tokens" is provided.
Fix issue #3701: Improve Token.prob and Lexeme.prob docs.
Fix issue #3708: Fix error in regex matcher examples.
Fix issue #3713: Call rmtree and copytree with strings in spacy train.
Fix issue #3720: Add version tag to --base-model argument in spacy train docs.

📖 Documentation and examples

Add free interactive spaCy course.
Fix various typos and inconsistencies.
Add new projects to the spaCy universe.

👥 Contributors

Thanks to @svlandeg, @wannaphongcom, @Bharat123rox, @DuyguA, @SamuelLKane, @graus, @HiromuHota, @jeannefukumaru, @ivigamberdiev, @socool, @yvespeirsman, @lemontheme, @Dobita21, @w4nderlust, @pierremonico, @bryant1410, @celikomer, @xssChauhan, @kowaalczyk, @BreakBB, @fizban99, @tokestermw, @bjascob, @pickfire, @yaph, @amitness, @henry860916, @d5555, @BramVanroy, @F0rge1cE, @richardpaulhudson, @ldorigo, @aaronkub and @devforfu for the pull requests and contributions.
