Roboteus changed the title from "[BUG] [Analyzer] [Token Filters] pattern_capture loses diacritic sign from beginning and end of the word" to "[BUG] [Analyzer] [Token Filters] pattern_capture loses diacritic sign from beginning and the end of the word" on Jan 24, 2025
Describe the bug
This appears to be specific to OpenSearch: I checked, and Elasticsearch does not seem to have the issue. Unfortunately, I compared against an old version...
Now test results of the analyzer:
Related component
Indexing
To Reproduce
Stated result
Tokens created by pattern_capture are: [ ryj , jótro , kup ]
It looks like words that begin or end with a diacritic character lose that character.
Expected behavior
[ źryj , jótro , kupę ]
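The gap between the stated and expected tokens is consistent with a regex word boundary (`\b`) that only treats ASCII letters as word characters. The report does not show the filter's actual pattern, so the sketch below assumes a hypothetical pattern `\b(.+)\b` and uses Python's `re` module purely as an analogy; it is not how OpenSearch evaluates the filter. Python's `re.ASCII` flag stands in for ASCII-only matching, and the default mode stands in for Unicode-aware matching.

```python
import re

# Hypothetical pattern: the issue does not show the filter's real regex.
PATTERN = r"\b(.+)\b"

# The three words from the report, written with escapes so the
# diacritics are unambiguous: źryj, jótro, kupę
WORDS = ["\u017aryj", "j\u00f3tro", "kup\u0119"]


def capture(word: str, ascii_only: bool) -> str:
    """Return the first capture group, or "" if the pattern does not match."""
    # re.ASCII restricts \w (and therefore \b) to [a-zA-Z0-9_].
    flags = re.ASCII if ascii_only else 0
    match = re.search(PATTERN, word, flags)
    return match.group(1) if match else ""


# ASCII-only boundaries: a leading or trailing diacritic is not a word
# character, so the boundary falls inside the word and the letter is cut.
ascii_tokens = [capture(w, ascii_only=True) for w in WORDS]

# Unicode-aware boundaries: the diacritic letters count as word characters.
unicode_tokens = [capture(w, ascii_only=False) for w in WORDS]

print(ascii_tokens)
print(unicode_tokens)
```

Under this assumption, the ASCII-only run drops the leading `ź` and the trailing `ę` but keeps the `ó` inside `jótro`, which matches the stated result, while the Unicode-aware run returns the expected tokens. If the real pattern relies on `\b` or `\w`, Java's `UNICODE_CHARACTER_CLASS` flag (inline form `(?U)`) may be worth testing in the filter's pattern.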
Additional Details
Plugins
Please list all plugins currently enabled.
Screenshots
Host/Environment (please complete the following information):
Additional context
Add any other context about the problem here.