v2.0.0 — UTF-8 encoding fix

@dchest dchest released this 24 Jan 19:04
· 13 commits to master since this release

After a code re-review I discovered that the internal function that encodes strings to UTF-8 bytes didn't correctly encode characters represented as UTF-16 surrogate pairs, such as emoji. This made it incompatible with other implementations that use proper UTF-8 encoding. The function has been fixed.

This change requires a semver-major version, since strings that contain surrogate pairs now produce different derived keys than they did in earlier versions. This doesn't apply if you supplied passwords or salts as Array or Uint8Array.

Note that the fixed implementation will throw an exception if the source string is not valid UTF-16 (that is, it contains unpaired surrogates), since such a string can't be encoded in UTF-8.
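
To illustrate the behavior described above, here is a minimal sketch of a UTF-8 encoder that combines surrogate pairs into a single code point and rejects unpaired surrogates. It is not the package's actual source: the function name `encodeUTF8` and the error messages are assumptions made for this example.

```javascript
// Hypothetical helper for illustration only, not the code shipped in this package.
function encodeUTF8(s) {
  const out = [];
  for (let i = 0; i < s.length; i++) {
    let c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      // High surrogate: it must be followed by a low surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) {
        throw new Error("invalid UTF-16: unpaired high surrogate at index " + i);
      }
      // Combine the pair into one code point in the range U+10000..U+10FFFF.
      c = 0x10000 + ((c - 0xd800) << 10) + (next - 0xdc00);
      i++; // the low surrogate has been consumed
    } else if (c >= 0xdc00 && c <= 0xdfff) {
      throw new Error("invalid UTF-16: unpaired low surrogate at index " + i);
    }
    if (c < 0x80) {
      out.push(c); // 1 byte
    } else if (c < 0x800) {
      out.push(0xc0 | (c >> 6), 0x80 | (c & 0x3f)); // 2 bytes
    } else if (c < 0x10000) {
      out.push(0xe0 | (c >> 12), 0x80 | ((c >> 6) & 0x3f), 0x80 | (c & 0x3f)); // 3 bytes
    } else {
      out.push(
        0xf0 | (c >> 18),
        0x80 | ((c >> 12) & 0x3f),
        0x80 | ((c >> 6) & 0x3f),
        0x80 | (c & 0x3f)
      ); // 4 bytes
    }
  }
  return new Uint8Array(out);
}

// U+1F600 is stored in a JavaScript string as the pair "\ud83d\ude00".
const emojiBytes = encodeUTF8("\ud83d\ude00"); // bytes f0 9f 98 80
```

An encoder that skips the surrogate-combining step would emit different bytes for the same string, which is why derived keys for such passwords change between versions.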

PS This bug highlights the importance of having a single reliable and tested text encoder rather than putting custom encoders into every single package. I regret including one in this package, especially since it already had a similar bug. The current, fixed implementation has been mostly copied from my highly tested implementation in StableLib. Now that modern browsers have TextEncoder and Node.js has Buffer, there's no reason to include UTF-8 coders into every package.
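
As a sketch of what relying on the platform encoder could look like, the example below converts a string to bytes with `TextEncoder` before handing it to a key derivation function as a Uint8Array. One difference worth knowing: `TextEncoder` replaces unpaired surrogates with U+FFFD instead of throwing, so the hypothetical `strictUTF8` wrapper (a name assumed for this example) checks for them first.

```javascript
// Hypothetical wrapper for illustration: strict UTF-8 encoding on top of TextEncoder.
function strictUTF8(s) {
  // With the "u" flag, valid surrogate pairs are matched as one code point,
  // so this class only matches surrogates that are unpaired.
  if (/[\ud800-\udfff]/u.test(s)) {
    throw new Error("invalid UTF-16: string contains an unpaired surrogate");
  }
  return new TextEncoder().encode(s); // Uint8Array of UTF-8 bytes
}

// A password containing an emoji (U+1F600), encoded once by the platform.
const passwordBytes = strictUTF8("pass\u{1F600}word");

// Without the check, an unpaired surrogate silently becomes U+FFFD (ef bf bd).
const replaced = new TextEncoder().encode("\ud83d");
```

In Node.js, `Buffer.from(s, "utf8")` produces the same bytes for well-formed strings. Passing the resulting bytes instead of a string keeps the encoding step outside the package entirely.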