
Breaking changes! Import destructuring + 3 letter language codes + lots more

@eklem eklem released this 28 Feb 18:40

Breaking changes:

  • Import destructuring. In ESM, the old sw. prefix can no longer be used. In CJS it still can, and UMD works as before, if you prefer that. If you're using CJS and not specifying a stopword language (relying on the default English stopword list), you should be fine.
  • 3-letter ISO 639-3 language codes (replacing ISO 639-1). This makes room for more languages in general and, in the short term, for several Sami languages specifically.
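
As a rough illustration of the two import styles, here is a minimal sketch. It uses a small stand-in object instead of the installed package so it runs on its own. The names removeStopwords and eng, and the tiny word list, are assumptions made for the example and should be checked against the README.

```javascript
// Stand-in for the package's exports so the sketch runs without installing anything.
// With the real package this object would come from the import/require call (assumption).
const stopwordModule = {
  eng: ['a', 'the', 'is', 'on'], // tiny illustrative list, not the real one
  removeStopwords (tokens, stopwords = stopwordModule.eng) {
    // Keep only tokens that are not in the stopword list
    return tokens.filter(token => !stopwords.includes(token.toLowerCase()))
  }
}

// New style: destructure only the function and language lists you need
const { removeStopwords, eng } = stopwordModule

// Old style (still possible in CJS): keep everything behind an sw. prefix
const sw = stopwordModule

const tokens = 'the cat is on a mat'.split(' ')
const viaDestructuring = removeStopwords(tokens, eng)
const viaPrefix = sw.removeStopwords(tokens, sw.eng)

console.log(viaDestructuring) // [ 'cat', 'mat' ]
console.log(viaPrefix) // [ 'cat', 'mat' ]
```

Both calls reach the same function and the same array. The only difference is whether the names are pulled into local scope or accessed through the prefix object.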

Documentation on how to stay almost backwards compatible:

  • How to keep using ISO 639-1 codes.
  • How to keep using the sw. prefix for functions and variables (arrays of stopwords).
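
One way to keep two-letter codes working in your own code is sketched below. It is not taken from the package documentation. The stand-in object, the sample words, the iso1ToIso3 table and the stopwordsFor helper are all hypothetical; only the code pairs themselves (en/eng, fr/fra, uk/ukr) come from the ISO standards.

```javascript
// Stand-in for the package's exports, keyed by ISO 639-3 code.
// The lists are tiny illustrative samples, not the real stopword lists.
const stopwordModule = {
  eng: ['the', 'a'],
  fra: ['le', 'la'],
  ukr: ['і', 'та']
}

// Option 1: rename while destructuring, so the old two-letter names live on locally
const { eng: en, fra: fr } = stopwordModule

// Option 2: a lookup that accepts either code style (hypothetical helper)
const iso1ToIso3 = { en: 'eng', fr: 'fra', uk: 'ukr' }

function stopwordsFor (code) {
  const iso3 = iso1ToIso3[code] || code
  const list = stopwordModule[iso3]
  if (!Array.isArray(list)) {
    throw new Error(`No stopword list for language code "${code}"`)
  }
  return list
}

console.log(en === stopwordsFor('en')) // true
console.log(stopwordsFor('uk') === stopwordsFor('ukr')) // true
```

Renaming during destructuring is the lighter option when the set of languages is fixed. The lookup is more useful when the language code arrives at runtime.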

And lots more:

  • 5 languages added (stopword lists): Ukrainian, Lithuanian, Kurdish, Malay and Gujarati (Thanks to stopwords-iso).
  • Using batr for building CJS, ESM and UMD + testing (StandardJS, Playwright, AvaJS and Rollup-stuff in one devDep)
  • UI-tests for demo (testing UMD) + ESM and CJS tests
  • Minified builds, with all licenses (stopword + 3rd party) in one file that the minified builds point to. 62 languages in 130 kB.
  • Numbers from 0-9 in different scripts moved to their own "language". Numbers should be handled by regex, like words-n-numbers can do easily, but we're keeping this as a possibility to also remove numbers 0-9.
  • From TravisCI to GitHub Workflow for CI
  • To test newly added languages, we're using words-n-numbers to extract words (and/or numbers).
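
The idea of handling numbers with a regex rather than a stopword list can be sketched with plain JavaScript and Unicode property escapes. This does not use words-n-numbers itself. The function names and patterns are illustrative assumptions, and real tokenisation usually needs more care (apostrophes, hyphens and so on).

```javascript
// \p{L} matches letters and \p{M} combining marks in any script;
// \p{Nd} matches decimal digits in any script (0-9, ०-९, ٠-٩, ...).
const wordPattern = /[\p{L}\p{M}]+/gu
const numberPattern = /\p{Nd}+/gu

// Return all runs of letters, or an empty array when there are none
function extractWords (text) {
  return text.match(wordPattern) || []
}

// Return all runs of decimal digits, or an empty array when there are none
function extractNumbers (text) {
  return text.match(numberPattern) || []
}

const text = 'Room ४२ has 3 windows'
console.log(extractWords(text)) // [ 'Room', 'has', 'windows' ]
console.log(extractNumbers(text)) // [ '४२', '3' ]
```

Extracting only words up front means digits never reach the stopword filter, which is why a separate list of numbers is an optional extra rather than a necessity.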