Fatal error: ( ̕" ) Any time there's a piece of punctuation/diacritic hanging out on its own not adjacent to a word #185

Open
marctessier opened this issue Aug 1, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@marctessier
Collaborator

We noticed a bug where the character sequence ̕" (U+0315 COMBINING COMMA ABOVE RIGHT followed by a double quotation mark ") causes a fatal error.

We discussed it on Slack; here are some notes:

The fatal error comes from this: any time a piece of punctuation or a diacritic sits on its own, not adjacent to a word, it gets tokenized as a word (because it's in the "equiv" mapping). It doesn't survive the g2p cascade, since it isn't pronounceable on its own, so it falls through to und, and und can't assign it a pronunciation either. The result is a word in the file with no pronunciation, which we treat as a fatal error.
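To make the failing input concrete, here is a minimal sketch that inspects the offending sequence with Python's standard `unicodedata` module. The helper names `describe` and `has_word_characters` are hypothetical and not part of the project. Treating "no letter or number characters" as "cannot be pronounced" is only an assumed heuristic; the real test is whether the g2p cascade returns anything.

```python
import unicodedata


def describe(text):
    """Return (code point, general category, Unicode name) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


def has_word_characters(token):
    """Assumed heuristic: a token needs at least one letter (L*) or number (N*)
    character to have any chance of getting a pronunciation."""
    return any(unicodedata.category(ch)[0] in ("L", "N") for ch in token)


# The sequence from this issue: a combining mark followed by a quotation mark.
for row in describe("\u0315\""):
    print(row)
# ('U+0315', 'Mn', 'COMBINING COMMA ABOVE RIGHT')
# ('U+0022', 'Po', 'QUOTATION MARK')
```

Neither character is a letter or a number, so `has_word_characters("\u0315\"")` is false, while an ordinary word containing an apostrophe, such as `"l'eau"`, still passes.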

The better solution would be, whenever the cascade comes up with no pronunciation whatsoever, to say internally "I guess that's not a word after all" and then decide whether to exclude it from the FSG and align anyway (probably the best behavior for punctuation/diacritics), or to warn the user (probably the best behavior for number strings).
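The behaviour proposed above could be sketched roughly as follows. This is not the project's code: `triage_tokens` and `toy_g2p` are hypothetical names, and `toy_g2p` is a stand-in for the real g2p cascade (including the und fallback) that simply returns an empty string when nothing is pronounceable. The rule "tokens containing digits get a warning, everything else is dropped quietly" is one assumed way to separate number strings from stray punctuation.

```python
import unicodedata


def toy_g2p(token):
    """Stand-in for the g2p cascade (hypothetical): 'pronounce' only the
    alphabetic characters, so punctuation, diacritics and digits yield ''."""
    return "".join(ch for ch in token.lower() if ch.isalpha())


def triage_tokens(tokens, g2p):
    """Split tokens into (pronounceable, dropped, warned).

    pronounceable: (token, pronunciation) pairs to put in the FSG
    dropped:       unpronounceable tokens with no digits, excluded quietly
    warned:        unpronounceable tokens containing digits, reported to the user
    """
    pronounceable, dropped, warned = [], [], []
    for token in tokens:
        pronunciation = g2p(token)
        if pronunciation:
            pronounceable.append((token, pronunciation))
        elif any(unicodedata.category(ch).startswith("N") for ch in token):
            # Probably an unconverted number string: the user should know.
            warned.append(token)
        else:
            # Stray punctuation or diacritic: not a word after all.
            dropped.append(token)
    return pronounceable, dropped, warned


pronounceable, dropped, warned = triage_tokens(["hello", "\u0315\"", "123"], toy_g2p)
print(pronounceable)  # [('hello', 'hello')]
print(warned)         # ['123']
```

With this shape, alignment can proceed on the `pronounceable` list while `warned` is surfaced as a message, rather than one empty pronunciation aborting the whole document.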

In any case, a random apostrophe (or any other piece of punctuation) hanging out somewhere in a file is pretty common, and our response shouldn't be failing to align the whole document. We should probably catch these kinds of errors and just gracefully not align them to anything. It's a different kind of error than (say) leaving an unpronounced number in the text file.

@joanise added the enhancement (New feature or request) label on Feb 2, 2024