fold-to-ascii-js

A JavaScript port of the Apache Lucene ASCII Folding Filter. It converts alphabetic, numeric, and symbolic Unicode characters outside the 7-bit ASCII range (the "Basic Latin" Unicode block, U+0000 to U+007F) into their ASCII equivalents, where one exists.

Documentation

Installation

Package Manager

npm: npm install fold-to-ascii

Bower: bower install fold-to-ascii

Stand-alone

This version no longer exposes the window.foldToAscii variable, in favour of a more modular approach using CommonJS module.exports. Should you require a stand-alone version, Browserify appears to be the tool of choice (browserify index.js > bundle.js).

Usage

It is simple:

var foldToAscii = require("fold-to-ascii");

// Folding with replacement of unmapped characters with the "_" character:
console.log(foldToAscii.fold("★Lorém ïpsum dölor.", "_"));
// Results in "_Lorem ipsum dolor."

// Folding without replacement of unmapped characters:
console.log(foldToAscii.fold("★Lorém ïpsum dölor.", null));
console.log(foldToAscii.fold("★Lorém ïpsum dölor."));
// Both calls result in "Lorem ipsum dolor."

If no replacement parameter is specified, unmapped characters will be replaced by the empty string.
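To make the replacement rule concrete, here is a minimal sketch of the idea. It is not the library's implementation: SKETCH_MAP and foldSketch are hypothetical names, and the three-entry table stands in for the much larger mapping ported from Lucene.

```javascript
// Hypothetical, tiny mapping table used only for illustration.
const SKETCH_MAP = new Map([
  ["\u00e9", "e"], // é
  ["\u00ef", "i"], // ï
  ["\u00f6", "o"], // ö
]);

// Folds a string with the sketch table. Unmapped non-ASCII characters
// become `replacement`, or the empty string if it is null or undefined.
function foldSketch(str, replacement) {
  const fallback = replacement == null ? "" : String(replacement);
  let out = "";
  // Normalize to NFC so accented letters are single code points,
  // then iterate code point by code point.
  for (const ch of String(str).normalize("NFC")) {
    if (ch.codePointAt(0) < 128) {
      out += ch; // already ASCII, keep as-is
    } else if (SKETCH_MAP.has(ch)) {
      out += SKETCH_MAP.get(ch); // mapped character
    } else {
      out += fallback; // unmapped character
    }
  }
  return out;
}

console.log(foldSketch("\u2605Lor\u00e9m \u00efpsum d\u00f6lor.", "_"));
console.log(foldSketch("\u2605Lor\u00e9m \u00efpsum d\u00f6lor."));
```

The first call keeps a placeholder for the unmapped star, the second drops it, mirroring the two behaviours described above.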

Tests

All replacement tasks are covered by QUnit tests. Run them with npm test.

Sources

This is a straightforward port of the very extensive switch/case statement found in http://svn.apache.org/repos/asf/lucene/java/tags/lucene_solr_4_5_1/lucene/analysis/common/src/java/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.java

The function that determines character codes is taken from a code example on MDN (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/charCodeAt#Example.3A_Fixing_charCodeAt_to_handle_non-Basic-Multilingual-Plane_characters_if_their_presence_earlier_in_the_string_is_unknown).
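For readers unfamiliar with why plain charCodeAt is not enough, the following sketch shows the general technique of combining surrogate pairs into a single code point. It is written in the spirit of that MDN example rather than copied from this library; the name codePointAtIndex is an assumption made for illustration.

```javascript
// Returns the full Unicode code point starting at `idx`,
// `false` if `idx` points at the second half of a surrogate pair,
// or NaN if `idx` is out of range.
function codePointAtIndex(str, idx) {
  const code = str.charCodeAt(idx);
  if (Number.isNaN(code)) {
    return NaN; // index out of range
  }
  // High surrogate: try to combine it with the following low surrogate.
  if (code >= 0xd800 && code <= 0xdbff) {
    const low = str.charCodeAt(idx + 1);
    if (low >= 0xdc00 && low <= 0xdfff) {
      return (code - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
    }
    return code; // unpaired high surrogate, returned unchanged
  }
  // Low surrogate preceded by a high surrogate: already consumed.
  if (code >= 0xdc00 && code <= 0xdfff) {
    const prev = str.charCodeAt(idx - 1);
    if (prev >= 0xd800 && prev <= 0xdbff) {
      return false;
    }
  }
  return code;
}

// U+1F600 is stored as the surrogate pair D83D DE00.
console.log(codePointAtIndex("a\ud83d\ude00", 1).toString(16)); // prints "1f600"
```

Modern engines also offer String.prototype.codePointAt, which performs the same combination for a well-formed pair.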

FAQ

Why is character x being replaced with y and not with z?

An unambiguous assignment of characters to replacements is not possible, because the expected replacement is language-dependent. For example, a user from France might expect ü to be replaced with u, while a user from Germany expects the replacement to be ue.
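If you need language-specific results, one possible workaround is to apply your own substitutions before folding. The sketch below assumes German conventions; GERMAN_MAP and preMapGerman are hypothetical helpers and not part of this package.

```javascript
// Hypothetical pre-mapping table for German umlauts and sharp s.
const GERMAN_MAP = {
  "\u00e4": "ae", // ä
  "\u00f6": "oe", // ö
  "\u00fc": "ue", // ü
  "\u00c4": "Ae", // Ä
  "\u00d6": "Oe", // Ö
  "\u00dc": "Ue", // Ü
  "\u00df": "ss", // ß
};

// Character class built from the table keys.
const GERMAN_PATTERN = new RegExp("[" + Object.keys(GERMAN_MAP).join("") + "]", "g");

// Replaces German special characters with their conventional
// transliterations; everything else is left untouched.
function preMapGerman(str) {
  // NFC ensures "u" + combining diaeresis is seen as a single "ü".
  return String(str)
    .normalize("NFC")
    .replace(GERMAN_PATTERN, (ch) => GERMAN_MAP[ch]);
}

console.log(preMapGerman("M\u00fcller")); // prints "Mueller"

// Intended use together with this package (not executed here):
//   foldToAscii.fold(preMapGerman(input), "_");
```

Characters that the pre-mapping does not touch are still handled by the regular fold call afterwards.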
