An advantage of using the IPA as source characters would be that it would no longer be necessary to distinguish between different source languages. The source part of a rule would probably no longer need to contain multiple characters, and rules would no longer need to be prioritized. (This would bring no performance improvement, however, as the prioritization happens during preprocessing.)
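A minimal sketch of what such single-symbol rules could look like, assuming one rule per IPA symbol and ignoring diacritics. The `RULES` table, its Cyrillic targets, and the `transliterateIpa` helper are hypothetical and only for illustration; they are not taken from the existing rule set.

```typescript
// Hypothetical rule table: exactly one IPA symbol on the source side,
// so no rule can shadow another and no prioritization is needed.
// The Cyrillic targets are illustrative placeholders, not the project's real rules.
const RULES = new Map<string, string>([
  ["ʃ", "ш"],
  ["ɪ", "и"],
  ["p", "п"],
]);

// Transliterate an IPA string symbol by symbol.
// Symbols without a rule are passed through unchanged.
function transliterateIpa(ipa: string, rules: Map<string, string>): string {
  return Array.from(ipa)
    .map((symbol) => rules.get(symbol) ?? symbol)
    .join("");
}

console.log(transliterateIpa("ʃɪp", RULES)); // IPA for English "ship"
```

Because every key is a single symbol, the lookup is a plain map access and the same table could serve any source language whose words are available in IPA.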
However, this presupposes that suitable IPA dictionaries are available for all relevant languages (only German, so far). The package mentioned above only includes an IPA dictionary for American English, and the authors of https://github.com/surrsurus/text-to-ipa mention that even this one was hard to find.
While transliterating letter-by-letter works nicely for German → *, most users appear to find it unintuitive for English → *.
There is a tool for retrieving the International Phonetic Alphabet (IPA) transcription of an English word: https://github.com/shukriadams/node-text-to-ipa
The main work would be to rewrite the transliteration rules for English → * using IPA characters as source characters. There are 107 characters plus diacritics, so this will get really complex. I don't know whether regular expressions work well with IPA characters.
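On the regex question: IPA symbols are ordinary Unicode characters, so JavaScript regular expressions with the `u` flag can match them. The sketch below splits an IPA string into base symbols with their attached diacritics. The `IPA_SEGMENT` pattern and the `segmentIpa` helper are hypothetical, and the list of spacing modifiers in the pattern is deliberately incomplete.

```typescript
// One segment = a base symbol followed by any modifiers.
// \p{M} matches combining marks (e.g. the nasalization tilde U+0303).
// The class also lists a few spacing modifiers (length ː ˑ, aspiration ʰ,
// labialization ʷ, palatalization ʲ); this list is an illustrative subset.
const IPA_SEGMENT = /[^\p{M}\s][\p{M}ːˑʰʷʲ]*/gu;

function segmentIpa(ipa: string): string[] {
  // NFD splits precomposed letters such as "ã" into base letter + combining
  // mark, so the pattern sees every diacritic as a separate code point.
  return ipa.normalize("NFD").match(IPA_SEGMENT) ?? [];
}

console.log(segmentIpa("ˈwɔːtər")); // American English "water"
```

Segments produced this way could then serve as the source side of rules, so a long vowel such as `ɔː` is looked up as one unit rather than as two characters.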