Half/full width-insensitive regular expressions #23028
Comments
Your code example:
    if (/大茅埔段32(7|8|9).地/ ||
        /大茅埔段３２(７|８|９).地/) {...}
Is there a reason not to modify it to be more like this?
    if ( double_width_to_single_width($_) =~ /大茅埔段32(7|8|9).地/ ) { ... }
Your version will also capture double-width 7, 8 and 9, so you'll have to either normalize the captures after the match or keep doubling your code throughout the rest of your script wherever you use those captured values (say, if you look one up in a hash or pass it to a sub).
It's worth noting that "width" is not binary. There are characters that are width-ambiguous, and there are those with Neutral width. There are also different names if you were originally half-width.
I propose the modifier for a width-insensitive match be both __
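A minimal sketch of what such a helper might look like, assuming only the full-width ASCII variants (U+FF01 to U+FF5E) and the ideographic space (U+3000) need folding. The name double_width_to_single_width comes from the comment above. It is not a core Perl function, and the body here is an illustration rather than anyone's actual implementation. Half-width katakana, the full-width won sign and other width variants are deliberately left alone.

```perl
use strict;
use warnings;
use utf8;    # the patterns and test strings below are UTF-8 literals

# Hypothetical helper (not part of core Perl): fold full-width ASCII
# variants U+FF01..U+FF5E onto U+0021..U+007E, and the ideographic
# space U+3000 onto an ordinary space. Everything else passes through.
sub double_width_to_single_width {
    my ($text) = @_;
    (my $narrow = $text) =~ tr/\x{FF01}-\x{FF5E}\x{3000}/\x{21}-\x{7E} /;
    return $narrow;
}

# Match against the folded copy, so the capture is already half-width.
sub lot_suffix {
    my ($line) = @_;
    return double_width_to_single_width($line) =~ /大茅埔段32(7|8|9).地/
        ? $1
        : undef;
}
```

Because the match runs on the folded copy, $1 is always a half-width digit and can be used as a hash key or passed to a sub without a second normalization step.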
There are lots more possibilities of wanting to match things than just the width. Making it all hang together cleanly is a huge task. I know of no language that implements things fully, though Raku is much further along than I think any others are. With Perl 5, this is accomplished by normalizing both operands before doing operations on them. Unicode::Normalize is furnished to accomplish this. You need a compatibility-type normalization (NFKD or NFKC). You can also use Unicode::UCD and the NFKC_Casefold property to accomplish this kind of task.
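As a rough illustration of the normalize-first approach described above, the sketch below folds the input with NFKC from Unicode::Normalize and then matches a pattern written only with half-width digits. The helper name lot_digit and the sample strings are made up for this example. Note that NFKC folds more than width (ligatures and superscripts, for instance), so it may be broader than a pure width-insensitive match.

```perl
use strict;
use warnings;
use utf8;    # the pattern and sample strings are UTF-8 literals
use Unicode::Normalize qw(NFKC);

# Hypothetical helper: normalize the text to NFKC, then match a
# half-width pattern. Returns the captured digit (already half-width)
# or undef when the text does not match.
sub lot_digit {
    my ($text) = @_;
    my $folded = NFKC($text);
    return $folded =~ /大茅埔段32([789]).地/ ? $1 : undef;
}

binmode STDOUT, ':encoding(UTF-8)';
for my $line ('大茅埔段327農地', '大茅埔段３２８林地', '大茅埔段３３０農地') {
    my $digit = lot_digit($line);
    printf "%s => %s\n", $line, defined $digit ? $digit : 'no match';
}
```

Only the subject string is normalized here. Normalizing a pattern as well would need care, since full-width punctuation could turn into regex metacharacters after folding.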
Thanks. I hope all this gets implemented one day. By the way, I think all instruction characters should stick with ASCII. I.e., sorry about your won sign.
@jidanni well, you can't won them all
The year was 19xx. The /i case insensitivity regexp operator was born. Speed ahead. The year is 2025. More insensitive operators are needed.
E.g., busting down the barriers of
Also, a way to make /i break this barrier: https://unix.stackexchange.com/questions/791654/how-to-make-perl-half-full-width-insensitive-regular-expressions