Bug: Bad words with accented characters not getting detected #3

Open
CookedApps opened this issue Jan 13, 2022 · 1 comment

Comments

@CookedApps
Contributor

CookedApps commented Jan 13, 2022

Hey,
I think I found a bug: a bad word defined in the filter list with accented characters is not filtered when the input matches it exactly. It is only filtered when the input uses the normalized (unaccented) characters.

Example:

  1. Define the filter with a custom bad word: const filter = new Filter({list: ["wörd"]});
  2. Filtering the bad word will result in a false negative: filter.isUnclean("wörd") = false
  3. Filtering with normalized characters will result in a false positive: filter.isUnclean("word") = true

Expected behavior:

  • filter.isUnclean("wörd") = true
  • filter.isUnclean("word") = false
  • And when defining a bad word without accents:
    • const filter = new Filter({list: ["word"]});
    • filter.isUnclean("word") = true
    • filter.isUnclean("wörd") = true
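The expected behavior above can be sketched as a standalone comparison. This is only an illustration, not the library's implementation: `stripAccents` and `matchesBadWord` are hypothetical names, and the sketch compares whole words only.

```javascript
// Hypothetical helper (not part of the library): decompose, drop combining
// marks (U+0300 to U+036F), then recompose whatever is left.
function stripAccents(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "").normalize("NFC");
}

// Hypothetical matcher illustrating the expected behavior from this issue.
function matchesBadWord(listWord, input) {
  const word = listWord.normalize("NFC").toLowerCase();
  const text = input.normalize("NFC").toLowerCase();
  const wordHasAccents = stripAccents(word) !== word;
  // Accented list entry: require the exact accented spelling.
  // Plain list entry: also match accented spellings of the input.
  return wordHasAccents ? text === word : stripAccents(text) === word;
}

console.log(matchesBadWord("wörd", "wörd")); // true
console.log(matchesBadWord("wörd", "word")); // false
console.log(matchesBadWord("word", "wörd")); // true
```

The asymmetry is deliberate: an accented list entry is treated as a specific spelling, while a plain entry is treated as covering its accented variants.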
@3chospirits
Owner

This filter is designed for English only. Very few words that need to be censored contain accented characters, so using the unaccented version of the word in the filter list makes things a lot easier. The filter expects the words you load to be normalized already.
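
As a rough sketch of that workaround, a word list could be normalized before it is loaded. The helper name `normalizeWord` and the sample list are assumptions for illustration, not part of the library.

```javascript
// Hypothetical pre-processing step: decompose each word, remove all Unicode
// combining marks, and lowercase the result.
function normalizeWord(word) {
  return word.normalize("NFD").replace(/\p{M}/gu, "").toLowerCase();
}

// Sample list for illustration only.
const rawList = ["wörd", "Café", "word"];

// A Set removes entries that collapse to the same normalized word.
const normalizedList = [...new Set(rawList.map(normalizeWord))];

console.log(normalizedList); // [ 'word', 'cafe' ]

// normalizedList can then be passed as the `list` option shown in the issue,
// and user input should go through normalizeWord before it is checked.
```

The trade-off is that "wörd" and "word" become indistinguishable after normalization, so this approach cannot flag one spelling while allowing the other.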
